Setting up a DPVS test environment is fairly involved. Following the official simple fnat documentation, let's test single-machine two-arm fnat. There is not much to say about installation and compilation — just follow the GitHub instructions — but be sure to build with DEBUG mode enabled and set the log level to DEBUG as well.
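For reference, a sketch of what that looked like on my setup. The exact build knob varies by DPVS version, so treat the CFLAGS line as an assumption; log_level and log_file live in the global_defs block of dpvs.conf:

# enable a debug build; in my tree this meant uncommenting the debug
# CFLAGS in the top-level Makefile (exact location is version-dependent)
# CFLAGS += -g -O0 -D DEBUG

# dpvs.conf: raise the log level so conn/sched decisions show up
global_defs {
    log_level   DEBUG
    log_file    /var/log/dpvs.log
}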
Test environment
Ubuntu 16.04.5
# uname -a
Linux jjh-dpvs-test0 4.4.0-116-generic 140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
# lspci -v | grep Eth
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
The two I350 NICs are used for the test; the remaining NICs are used for ssh and otherwise unused.
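Before starting dpvs, the usual DPDK preparation applies to those two ports. A minimal sketch, assuming the dpdk-17.05.2 tree that DPVS's quick start used at the time (the dpdk paths are assumptions):

# reserve 2MB hugepages on node0 (repeat for node1 on multi-socket boxes)
echo 8192 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
mkdir -p /mnt/huge
mount -t hugetlbfs nodev /mnt/huge
# bind the two I350 ports to the igb_uio driver
modprobe uio
insmod dpdk-17.05.2/x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
./dpdk-17.05.2/usertools/dpdk-devbind.py -b igb_uio 0000:02:00.0 0000:02:00.1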
IP allocation
┌───────────────────┐ ┌────────────────┐
│ dpvs │ │ │
│ │ │ real server │
│ │ ┌──────▶│10.20.34.24:6379│
│ │ │ │ │
│ │ │ │ │
│ │ │ └────────────────┘
┌─────┴───────┐ ┌─────┴───────┐ │
│ │ │ │ │
│ │ │ │──────┘
┌──────────────┐ │ dpdk1 │ │ │
│ │ │ VIP │ │ dpdk0 │
│ client │ │10.20.101.43:│ │ LIP │
│ 10.34.38.43 ├───────▶│ 6379 │ │10.20.102.41 │
│ │ │ │ │ │
└──────────────┘ │ │ │ │──────┐
│ │ │ │ │
└─────┬───────┘ └─────┬───────┘ │
│ │ │ ┌────────────────┐
│ │ │ │ │
│ │ │ │ real server │
│ │ └──────▶│10.20.74.41:6379│
│ │ │ │
└───────────────────┘ │ │
└────────────────┘
Client IP: 10.34.38.43 (test client NIC)
DPDK1 VIP: 10.20.101.43 (wan-side NIC)
DPDK0 LIP: 10.20.102.41 (lan-side NIC)
RS1: 10.20.34.24
RS2: 10.20.74.41
Configuring the service
Add the VIP on the wan NIC:
dpip addr add 10.20.101.43/32 dev dpdk1
Add the wan-side default route:
dpip route add default via 10.20.101.254 dev dpdk1
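Both can be confirmed from the dpvs box with dpip's show subcommands:

# verify the VIP and the default route landed on dpdk1
dpip addr show
dpip route show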
From the client machine, ping the VIP to make sure it took effect:
ping 10.20.101.43
PING 10.20.101.43 (10.20.101.43) 56(84) bytes of data.
64 bytes from 10.20.101.43: icmp_seq=1 ttl=58 time=3.66 ms
64 bytes from 10.20.101.43: icmp_seq=2 ttl=58 time=3.52 ms
Add the ipvs service with the round-robin (rr) scheduler:
ipvsadm -A -t 10.20.101.43:6379 -s rr
Add the two real servers (-b selects FNAT forwarding mode):
ipvsadm -a -t 10.20.101.43:6379 -r 10.20.34.24:6379 -b
ipvsadm -a -t 10.20.101.43:6379 -r 10.20.74.41:6379 -b
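At this point the service table can be checked the same way as with kernel LVS:

# list the virtual service and its real servers
ipvsadm -ln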
Add the lan-side LIP (-z specifies the local address, -F the device):
ipvsadm --add-laddr -z 10.20.102.41 -t 10.20.101.43:6379 -F dpdk0
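If memory serves, DPVS's bundled ipvsadm also has a matching --get-laddr (-G) extension to verify the LIP binding; treat the exact flag as an assumption:

# list configured local addresses (DPVS-specific extension)
ipvsadm -G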
Add the dpdk0 default route:
dpip route add default via 10.20.102.254 dev dpdk0
From the client machine, ping the LIP to make sure it took effect:
ping 10.20.102.41
PING 10.20.102.41 (10.20.102.41) 56(84) bytes of data.
64 bytes from 10.20.102.41: icmp_seq=1 ttl=58 time=3.52 ms
64 bytes from 10.20.102.41: icmp_seq=2 ttl=58 time=3.43 ms
With that, the configuration is complete. I took a few detours here: for historical reasons the switch configuration left the LIP unreachable. Thanks to Chunbo from the sys team for helping out.
Testing
redis-cli -h 10.20.101.43 -p 6379 get a
Accessing the redis service from the test machine fails, so let's track down where exactly things go wrong.
On the client machine, run:
tcpdump port 6379 -i bond0 -n
On both real servers, run:
tcpdump port 6379 -i bond0 -n
On the dpvs machine, watch the log:
tail -f /var/log/dpvs.log
Then access the redis service again:
redis-cli -h 10.20.101.43 -p 6379 get a
Output on the test client:
13:32:22.130615 IP 10.34.38.43.37943 > 10.20.101.43.6379: Flags [S], seq 1653003455, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.127957 IP 10.34.38.43.37943 > 10.20.101.43.6379: Flags [S], seq 1653003455, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
Two SYN packets in a row, i.e. the first SYN timed out and the client retried once.
Now the rs output:
13:32:22.127008 IP 10.20.102.41.1029 > 10.20.34.24.6379: Flags [S], seq 338949052, win 29200, options [exp-9437,mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:22.127035 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.123551 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.124287 IP 10.20.102.41.1029 > 10.20.34.24.6379: Flags [S], seq 338949052, win 29200, options [exp-9437,mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.124304 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:25.123557 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
The rs 10.20.34.24 has already replied with SYN+ACK to the dpvs LIP 10.20.102.41, but the third step of the three-way handshake never completes.
Now look at the dpvs log:
IPVS: conn lookup: [6] TCP 10.34.38.43:37943 -> 10.20.101.43:6379 miss
SAPOOL: sa_pool_fetch: 10.20.102.41:1029 fetched!
IPVS: new conn: [6] TCP 10.34.38.43:37943 10.20.101.43:6379 10.20.102.41:1029 10.20.34.24:6379 refs 2
IPVS: state trans: TCP in [S...] 10.34.38.43:37943->10.20.34.24:6379 state NONE->SYN_RECV conn.refcnt 2
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
IPVS: conn lookup: [6] TCP 10.34.38.43:37943 -> 10.20.101.43:6379 hit
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
First, local port 1029 is correctly fetched from the sa_pool, then the SYN packet is forwarded to the backend rs 10.20.34.24, and the connection state transitions from NONE to SYN_RECV.
Then dpvs receives the SYN+ACK reply from the rs, misses on the session (flow) table lookup, and drops the packet. Note that the outbound packet was sent by cpu [6], while the return packet was received on cpu [3].
Root cause
The symptoms point to a return-traffic affinity problem: each DPVS slave worker keeps its own per-lcore session table, so the SYN+ACK from the rs has to arrive on the same CPU that created the connection, which DPVS normally ensures via the NIC's Flow Director (FDIR). From the official issues and docs I learned that my I350 test NICs do not support flow director for now, so I can only test with a single worker. Next week I'll request 10G NICs; performance testing still needs to be done as well.
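For the single-worker workaround, the worker section of dpvs.conf shrinks to one master plus one slave lcore, and each device then needs only one rx/tx queue (queue_number 1 in the device section). A sketch based on my recollection of dpvs.conf.sample — field names may differ slightly between DPVS versions:

worker_defs {
    <init> worker cpu0 {
        type    master
        cpu_id  0
    }
    <init> worker cpu1 {        # the single forwarding worker
        type    slave
        cpu_id  1
        port    dpdk0 {
            rx_queue_ids    0
            tx_queue_ids    0
        }
        port    dpdk1 {
            rx_queue_ids    0
            tx_queue_ids    0
        }
    }
}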
A small aside: with open-source software, some problems really are impossible to tackle if you don't understand the source code.
Update 2018-12-04
With help from Chunbo and Wenqiang on the sys team, I switched to 10G NICs and the simple fullnat test passed. Next up: single-machine performance testing, and finally ospf + fnat.
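A quick sanity check for the passing setup: hit the VIP a few times and confirm that connections alternate across the two real servers, as rr scheduling should make them:

# with -s rr, successive connections should alternate between
# 10.20.34.24 and 10.20.74.41 (watch tcpdump port 6379 on each rs)
for i in $(seq 1 4); do redis-cli -h 10.20.101.43 -p 6379 get a; done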