【原创】zookeeper 使用中的 TCP 链路残留问题

最近公司环境中报了一个 zookeeper 相关问题,问题描述如下: 
业务人员发现xx资源被重复上报了,导致资源池中的内容出了问题,问题的发现是在一次机房断电后,据相关人员说,上报资源的业务也重启(也可能是被强杀)过几次。
      最终看到的结果是,连接到 zookeeper 的 TCP 连接出现了多条,而正常情况下,业务和 zookeeper 之间的 TCP 连接只有一条,同时发现,多出的几条 TCP 连接似乎都是该业务每次重启(也可能是被强杀)后残留下来的。 
      那么问题就来了,为什么系统中仅仅跑了一个业务进程,而一个进程只会建立一条到 zookeeper 的 TCP 链路,但却同时存在多条?退一步讲,即使在一段时间内,存在了多条残留的 TCP 连接,但 zookeeper 也应该能够通过心跳机制,检测到相应连接已经断开了呀(遗憾的时,排查问题的相关人员,在发现问题后没有针对几条疑似残留的 TCP 连接进行抓包分析),为何在观察的几个小时内(观察时间超过了 2 个小时)毫无反应? 

该问题由于不是我主要负责,所以我也就基于自己了解到的信息查询了相关资料,排查过程如下: 

1.确认 zookeeper 心跳相关内容 

客户端侧代码如下
【原创】zookeeper 使用中的 TCP 链路残留问题 

【原创】zookeeper 使用中的 TCP 链路残留问题 

【原创】zookeeper 使用中的 TCP 链路残留问题 

【原创】zookeeper 使用中的 TCP 链路残留问题 
       对一个正常 zookeeper TCP 连接进行抓包,可以看到下图
【原创】zookeeper 使用中的 TCP 链路残留问题 
【原创】zookeeper 使用中的 TCP 链路残留问题 
      从上图中可以看出,在空闲状态下,每隔 3 秒左右,业务(客户端)会主动向 zookeeper 发送心跳请求(ping),zookeeper 也会进行相应的应答回复。 
       另一个发现,当根据包的大小进行排序后(例如长度为 74 的包对应了 ping),可以看到:所有的心跳请求都是由业务(客户端)发送给 zookeeper ,反方向却一个没有~~ 【原创】zookeeper 使用中的 TCP 链路残留问题 
这个行为就有点意思了,假设 zookeeper 不会主动发送心跳请求给客户端,那么如果其没有实现一个连接超时检测机制(或者该机制存在 bug)的话,当客户端侧发生异常,则是由可能导致 TCP 连接残留的。 
      由于 zookeeper 服务端代码我没看过,咨询过相关维护人员,其告诉我 zookeeper 侧是存在这种超时检测机制的,所以理论上讲应该不存在上述问题~~ 

查看了 zookeeper 的配置信息,似乎也没什么特别问题
[root@upu_2 ~]# ps aux|grep zook
root      7853  0.1  5.4 6213632 888956 ?      Sl   Mar10   3:59 /usr/java/jdk1.6.0_45/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,CONSOLE -cp /usr/local/zookeeper/bin/../build/classes:/usr/local/zookeeper/bin/../build/lib/*.jar:/usr/local/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/local/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/local/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/local/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/local/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/bin/../zookeeper-3.4.5.jar:/usr/local/zookeeper/bin/../src/java/lib/*.jar:/usr/local/zookeeper/bin/../conf: org.apache.zookeeper.server.quorum.QuorumPeerMain /usr/local/zookeeper/bin/../conf/zoo.cfg
root     17915  0.0  0.0 103252   844 pts/1    S+   10:11   0:00 grep zook
[root@upu_2 ~]# 
[root@upu_2 ~]# 
[root@upu_2 ~]# cat /usr/local/zookeeper/bin/../conf/zoo.cfg
tickTime=2000
dataDir=/usr/local/zookeeper/data
initLimit=5
syncLimit=2
clientPort=2181
autopurge.snapRetainCount=6
autopurge.purgeInterval=12
server.2=10.23.46.35:2888:3888
[root@upu_2 ~]#

2.怀疑 zookeeper 在 ipv6 地址上进行监听存在缺陷

      后来有大牛在排查时注意到出现问题的 TCP 连接,均属于 IPv6 而并非常见的 IPv4 ,那么是否和这个有关呢? 
查看《 ZooKeeper Administrator's Guide 》,有如下说明:
Configuration Parameters
...
clientPort
    the port to listen for client connections; that is, the port that clients attempt to connect to.
...
clientPortAddress
    New in 3.3.0: the address (ipv4, ipv6 or hostname) to listen for client connections; that is, the address that clients attempt to connect to. This is optional, by default we bind in such a way that any connection to the clientPort for any address/interface/nic on the server will be accepted.
另外,网上可以查到的文章都说 
  • 若不配置 clientPortAddress 的话,则 zookeeper 默认会监听 IPv6 地址(当然,前提是你的机器配置了 IPv6 相关内容);
  • 可以通过在 zookeeper 的启动脚本中配置 "-Djava.net.preferIPv4Stack=true" 以强制其优先使用 IPv4 接收来自客户端的连接(可能存在客户端通过 IPv4 地址连接不上 zookeeper 的 IPv6 监听上的情况);
根据我观察到的情况,结论如下: 
  • 在不配置 clientPortAddress 的情况下,服务器确实监听的是 IPv6 ;
  • 在不强制 zookeeper 优先使用 IPv4 的情况下,其依然可以接收来自 IPv4 的连接;
[root@upu_2 zookeeper]# lsof -i :2181 -P
COMMAND   PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java     7853 root   16u  IPv6    21294      0t0  TCP *:2181 (LISTEN)
...
java     7853 root   27u  IPv6 15051913      0t0  TCP 10.23.46.36:2181->10.23.46.36:26888 (ESTABLISHED)
css     29804 root  113u  IPv4 15051910      0t0  TCP 10.23.46.36:26888->10.23.46.36:2181 (ESTABLISHED)
既然 TCP 连接(即 IPv6 协议栈能够正确处理 IPv4 连接)没有问题,那么接下来的问题就是,出现问题的连接和其他的连接有和不同? 

实验如下
[root@upu_2 zookeeper]# lsof -i :2181 -P
COMMAND   PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java     7853 root   16u  IPv6    21294      0t0  TCP *:2181 (LISTEN)
java     7853 root   19u  IPv6 11514633      0t0  TCP 10.23.46.36:2181->10.23.46.39:26570 (ESTABLISHED)
java     7853 root   20u  IPv6 11514635      0t0  TCP 10.23.46.36:2181->10.23.46.42:37657 (ESTABLISHED)
java     7853 root   21u  IPv6 11515884      0t0  TCP 10.23.46.36:2181->10.23.46.43:33259 (ESTABLISHED)
java     7853 root   22u  IPv6 11515899      0t0  TCP 10.23.46.36:2181->10.23.46.39:26577 (ESTABLISHED)
java     7853 root   23u  IPv6 11516014      0t0  TCP 10.23.46.36:2181->10.23.46.39:26581 (ESTABLISHED)
java     7853 root   24u  IPv6 11516017      0t0  TCP 10.23.46.36:2181->10.23.46.39:26584 (ESTABLISHED)
java     7853 root   25u  IPv6 11516068      0t0  TCP 10.23.46.36:2181->10.23.46.39:26586 (ESTABLISHED)
java     7853 root   26u  IPv6 11516069      0t0  TCP 10.23.46.36:2181->10.23.46.39:26587 (ESTABLISHED)
java     7853 root   27u  IPv6 15051913      0t0  TCP 10.23.46.36:2181->10.23.46.36:26888 (ESTABLISHED)
css     29804 root  113u  IPv4 15051910      0t0  TCP 10.23.46.36:26888->10.23.46.36:2181 (ESTABLISHED)
[root@upu_2 zookeeper]# 


[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26570 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:41:40.342746 IP 10.23.46.39.26570 > 10.23.46.36.eforward: Flags [P.], seq 2527560660:2527560664, ack 198355173, win 269, options [nop,nop,TS val 170901466 ecr 170833048], length 4
14:41:40.342764 IP 10.23.46.39.26570 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 269, options [nop,nop,TS val 170901466 ecr 170833048], length 8
14:41:40.343032 IP 10.23.46.36.eforward > 10.23.46.39.26570: Flags [.], ack 12, win 114, options [nop,nop,TS val 170836385 ecr 170901466], length 0
14:41:40.343332 IP 10.23.46.36.eforward > 10.23.46.39.26570: Flags [P.], seq 1:21, ack 12, win 114, options [nop,nop,TS val 170836385 ecr 170901466], length 20
14:41:40.343418 IP 10.23.46.39.26570 > 10.23.46.36.eforward: Flags [.], ack 21, win 269, options [nop,nop,TS val 170901467 ecr 170836385], length 0
^C
5 packets captured
11 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 37657 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:41:50.340486 IP 10.23.46.42.37657 > 10.23.46.36.eforward: Flags [P.], seq 4258153112:4258153116, ack 4088319963, win 18, length 4
14:41:50.340501 IP 10.23.46.42.37657 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 18, length 8
14:41:50.340625 IP 10.23.46.36.eforward > 10.23.46.42.37657: Flags [.], ack 12, win 537, length 0
14:41:50.340798 IP 10.23.46.36.eforward > 10.23.46.42.37657: Flags [P.], seq 1:21, ack 12, win 537, length 20
14:41:50.340927 IP 10.23.46.42.37657 > 10.23.46.36.eforward: Flags [.], ack 21, win 18, length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 33259 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:00.802239 IP 10.23.46.43.33259 > 10.23.46.36.eforward: Flags [P.], seq 3634190124:3634190128, ack 3831351430, win 18, length 4
14:42:00.802258 IP 10.23.46.43.33259 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 18, length 8
14:42:00.802377 IP 10.23.46.36.eforward > 10.23.46.43.33259: Flags [.], ack 12, win 501, length 0
14:42:00.802562 IP 10.23.46.36.eforward > 10.23.46.43.33259: Flags [P.], seq 1:21, ack 12, win 501, length 20
14:42:00.802691 IP 10.23.46.43.33259 > 10.23.46.36.eforward: Flags [.], ack 21, win 18, length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26577 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:08.255987 IP 10.23.46.39.26577 > 10.23.46.36.eforward: Flags [P.], seq 245299443:245299447, ack 584627376, win 58, options [nop,nop,TS val 170929380 ecr 170860963], length 4
14:42:08.256011 IP 10.23.46.39.26577 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 58, options [nop,nop,TS val 170929380 ecr 170860963], length 8
14:42:08.256176 IP 10.23.46.36.eforward > 10.23.46.39.26577: Flags [.], ack 12, win 1539, options [nop,nop,TS val 170864298 ecr 170929380], length 0
14:42:08.256503 IP 10.23.46.36.eforward > 10.23.46.39.26577: Flags [P.], seq 1:21, ack 12, win 1539, options [nop,nop,TS val 170864298 ecr 170929380], length 20
14:42:08.256640 IP 10.23.46.39.26577 > 10.23.46.36.eforward: Flags [.], ack 21, win 58, options [nop,nop,TS val 170929380 ecr 170864298], length 0

^C
5 packets captured
6 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26581 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:17.330446 IP 10.23.46.39.26581 > 10.23.46.36.eforward: Flags [P.], seq 311282941:311282945, ack 758925432, win 58, options [nop,nop,TS val 170938454 ecr 170870040], length 4
14:42:17.330466 IP 10.23.46.39.26581 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 58, options [nop,nop,TS val 170938454 ecr 170870040], length 8
14:42:17.330588 IP 10.23.46.36.eforward > 10.23.46.39.26581: Flags [.], ack 12, win 114, options [nop,nop,TS val 170873373 ecr 170938454], length 0
14:42:17.330816 IP 10.23.46.36.eforward > 10.23.46.39.26581: Flags [P.], seq 1:21, ack 12, win 114, options [nop,nop,TS val 170873373 ecr 170938454], length 20
14:42:17.330976 IP 10.23.46.39.26581 > 10.23.46.36.eforward: Flags [.], ack 21, win 58, options [nop,nop,TS val 170938455 ecr 170873373], length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26584 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:25.784667 IP 10.23.46.39.26584 > 10.23.46.36.eforward: Flags [P.], seq 2592715739:2592715743, ack 2440448357, win 58, options [nop,nop,TS val 170946908 ecr 170880076], length 4
14:42:25.784688 IP 10.23.46.39.26584 > 10.23.46.36.eforward: Flags [P.], seq 4:109, ack 1, win 58, options [nop,nop,TS val 170946908 ecr 170880076], length 105
14:42:25.784825 IP 10.23.46.36.eforward > 10.23.46.39.26584: Flags [.], ack 109, win 114, options [nop,nop,TS val 170881827 ecr 170946908], length 0
14:42:25.789914 IP 10.23.46.36.eforward > 10.23.46.39.26584: Flags [P.], seq 1:89, ack 109, win 114, options [nop,nop,TS val 170881832 ecr 170946908], length 88
14:42:25.790030 IP 10.23.46.39.26584 > 10.23.46.36.eforward: Flags [.], ack 89, win 58, options [nop,nop,TS val 170946914 ecr 170881832], length 0

^C
5 packets captured
12 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26586 -s 0          
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:37.566211 IP 10.23.46.39.26586 > 10.23.46.36.eforward: Flags [P.], seq 1062233276:1062233280, ack 4128653018, win 769, options [nop,nop,TS val 170958690 ecr 170890272], length 4
14:42:37.566230 IP 10.23.46.39.26586 > 10.23.46.36.eforward: Flags [P.], seq 4:12, ack 1, win 769, options [nop,nop,TS val 170958690 ecr 170890272], length 8
14:42:37.566348 IP 10.23.46.36.eforward > 10.23.46.39.26586: Flags [.], ack 12, win 114, options [nop,nop,TS val 170893608 ecr 170958690], length 0
14:42:37.566500 IP 10.23.46.36.eforward > 10.23.46.39.26586: Flags [P.], seq 1:21, ack 12, win 114, options [nop,nop,TS val 170893608 ecr 170958690], length 20
14:42:37.566641 IP 10.23.46.39.26586 > 10.23.46.36.eforward: Flags [.], ack 21, win 769, options [nop,nop,TS val 170958690 ecr 170893608], length 0

^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26587 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes
14:42:45.986057 IP 10.23.46.36.eforward > 10.23.46.39.26587: Flags [P.], seq 335958315:335958381, ack 2473985165, win 114, options [nop,nop,TS val 170902028 ecr 170965365], length 66
14:42:45.986197 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [.], ack 66, win 771, options [nop,nop,TS val 170967110 ecr 170902028], length 0
14:42:45.986330 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [P.], seq 1:5, ack 66, win 771, options [nop,nop,TS val 170967110 ecr 170902028], length 4
14:42:45.986346 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [P.], seq 5:52, ack 66, win 771, options [nop,nop,TS val 170967110 ecr 170902028], length 47
14:42:45.986453 IP 10.23.46.36.eforward > 10.23.46.39.26587: Flags [.], ack 52, win 114, options [nop,nop,TS val 170902028 ecr 170967110], length 0
14:42:45.986671 IP 10.23.46.36.eforward > 10.23.46.39.26587: Flags [P.], seq 66:209, ack 52, win 114, options [nop,nop,TS val 170902029 ecr 170967110], length 143
14:42:46.026053 IP 10.23.46.39.26587 > 10.23.46.36.eforward: Flags [.], ack 209, win 771, options [nop,nop,TS val 170967150 ecr 170902029], length 0

^C
7 packets captured
8 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# 
[root@upu_2 zookeeper]# tcpdump -i eth2 tcp port 26888 -s 0     
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 65535 bytes

(过了很久也没抓到任何东西,后来发现原来是抓错了网口,应该抓 lo 才对,ORZ...)

^C
0 packets captured
1 packets received by filter
0 packets dropped by kernel
[root@upu_2 zookeeper]# ^C
[root@upu_2 zookeeper]#
总之,从实验中可以看到,针对所有的 zookeeper 连接进行抓包,所有连接上都抓到了心跳交互。 

3.怀疑业务与 zookeeper 通过虚地址建立的 TCP 在异常断电时有问题 

从连接信息上看,本地业务通过虚地址连接到了 zookeeper 上。
【原创】zookeeper 使用中的 TCP 链路残留问题 
难道这个情况和虚地址有关系?这个问题目前无法确认,因为测试人员在自己的环境中进行模拟测试时,无法复现该问题。 
上一篇:Shell scripts to Create a local dir base on the time.


下一篇:git push 失败出现error: src refspec master does not match any.解决方案