Symptoms:
tcp 0 0 ::ffff:192.168.1.12:59103 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59085 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59331 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:46381 ::ffff:192.168.1.104:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59034 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59383 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59138 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59407 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59288 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:58905 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:58867 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:58891 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59334 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:46129 ::ffff:192.168.1.100:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59143 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
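A listing like this is easier to read once it is grouped by remote endpoint. The following is a minimal sketch that assumes netstat-style text with the same column layout as above (protocol, two queue columns, local address, remote address, state). The function name `count_time_wait` and the `sample` text are illustrative and not part of the original investigation.

```python
from collections import Counter


def count_time_wait(netstat_output: str) -> Counter:
    """Count TIME_WAIT sockets per remote address:port in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed column layout: proto recv-q send-q local remote state ...
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "TIME_WAIT":
            counts[fields[4]] += 1
    return counts


# Three lines taken from the listing above, used only as sample input.
sample = """\
tcp 0 0 ::ffff:192.168.1.12:59103 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:46381 ::ffff:192.168.1.104:3306 TIME_WAIT timewait (0.00/0/0)
tcp 0 0 ::ffff:192.168.1.12:59085 ::ffff:192.168.1.11:3306 TIME_WAIT timewait (0.00/0/0)
"""

if __name__ == "__main__":
    for remote, n in count_time_wait(sample).most_common():
        print(remote, n)
```

Grouping by the remote column shows at a glance which peer accounts for most of the TIME_WAIT sockets.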
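Checking sysctl.conf showed that every setting was still at its default, so we tried the changes below. In hindsight, this change was a judgment made before the problem had been analyzed precisely.
A large number of TIME_WAIT sockets on the server means that the server is the side that actively closes the TCP connections.
All we can do with such connections is release the server's handles and memory; the port cannot be released, because the server has only one listening port open.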
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 3
net.ipv4.tcp_keepalive_time = 3
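Before drawing conclusions from settings like these, it can help to confirm which values the running kernel actually uses. The sketch below assumes a Linux host where sysctl keys map to files under /proc/sys; the helper names `sysctl_path` and `read_sysctl` are hypothetical. A key that is missing on the running kernel is reported as None rather than raising; for example, tcp_tw_recycle was removed in Linux 4.12.

```python
from pathlib import Path
from typing import Optional

# The four keys changed above.
KEYS = [
    "net.ipv4.tcp_tw_recycle",
    "net.ipv4.tcp_tw_reuse",
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_keepalive_time",
]


def sysctl_path(key: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl key to its file under /proc/sys."""
    # Simplification: assumes no path component itself contains a dot.
    return Path(root).joinpath(*key.split("."))


def read_sysctl(key: str, root: str = "/proc/sys") -> Optional[str]:
    """Return the current value as text, or None if it cannot be read."""
    try:
        return sysctl_path(key, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key in KEYS:
        print(key, "=", read_sysctl(key))
```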
After these changes, the problem remained.
dmesg showed the following messages:
Nov 4 11:35:48 localhost kernel: __ratelimit: 108 callbacks suppressed
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:53 localhost kernel: __ratelimit: 592 callbacks suppressed
Nov 4 11:35:53 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:53 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:57 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: __ratelimit: 281 callbacks suppressed
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:58 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:36:14 localhost kernel: __ratelimit: 7 callbacks suppressed
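Because the kernel rate-limits these messages, the visible lines understate how many packets were dropped. The sketch below adds up the logged drop lines and the "callbacks suppressed" counters from log text in the format shown above. The function name `summarize_drops` is illustrative, and the total is only an estimate, since the suppressed counter may also include other rate-limited kernel messages.

```python
import re

# Matches the kernel's rate-limit notice and captures the suppressed count.
SUPPRESSED = re.compile(r"__ratelimit: (\d+) callbacks suppressed")
DROP_TEXT = "nf_conntrack: table full, dropping packet"


def summarize_drops(log_text: str) -> dict:
    """Count logged drop lines and sum the suppressed-message counters."""
    logged = 0
    suppressed = 0
    for line in log_text.splitlines():
        if DROP_TEXT in line:
            logged += 1
            continue
        match = SUPPRESSED.search(line)
        if match:
            suppressed += int(match.group(1))
    return {"logged": logged, "suppressed": suppressed, "total": logged + suppressed}


# A few lines copied from the log above, used only as sample input.
sample_log = """\
Nov 4 11:35:48 localhost kernel: __ratelimit: 108 callbacks suppressed
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:48 localhost kernel: nf_conntrack: table full, dropping packet.
Nov 4 11:35:53 localhost kernel: __ratelimit: 592 callbacks suppressed
"""

if __name__ == "__main__":
    print(summarize_drops(sample_log))
```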
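The nf_conntrack module implements connection tracking. It uses the netfilter framework's nf_register_hook/nf_unregister_hook functions to register its hook entries. nf_conntrack_in is called to set up the corresponding connection entry, and ipv4_conntrack_in, the function mainly responsible for creating connections, is attached to the NF_IP_PRE_ROUTING hook point. This is how connections are tracked.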
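To make the "table full, dropping packet" failure mode concrete, here is a toy model of a bounded tracking table. It is not the kernel's data structure or algorithm; the class `ToyConntrack`, its parameters, and the single per-entry timeout are simplifying assumptions made only for illustration.

```python
class ToyConntrack:
    """Toy bounded connection table; a simplified model, not the kernel code."""

    def __init__(self, max_entries: int, timeout: float):
        self.max_entries = max_entries  # plays the role of a table size limit
        self.timeout = timeout          # seconds an idle entry is kept
        self.entries = {}               # flow tuple -> expiry time

    def packet_in(self, flow: tuple, now: float) -> bool:
        """Return True if the packet is accepted, False if it is dropped."""
        # Remove entries whose timeout has already passed.
        self.entries = {f: t for f, t in self.entries.items() if t > now}
        # A packet of a new flow cannot be tracked when the table is full.
        if flow not in self.entries and len(self.entries) >= self.max_entries:
            return False
        # Create the entry, or refresh the expiry of an existing one.
        self.entries[flow] = now + self.timeout
        return True


if __name__ == "__main__":
    table = ToyConntrack(max_entries=2, timeout=10.0)
    for t, flow in [(0.0, "a"), (1.0, "b"), (2.0, "c"), (11.5, "c")]:
        print(t, flow, table.packet_in((flow,), now=t))
```

In this model, a long timeout combined with many short-lived flows keeps the table full of stale entries, so packets of new flows are refused until old entries expire.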
We then turned to the nf_conntrack: table full problem:
1. Solve the problem by adjusting configuration parameters
net.netfilter.nf_conntrack_max
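// The maximum number of tracked connection entries allowed, that is, how many connections netfilter can track in kernel memory at the same time.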
net.netfilter.nf_conntrack_tcp_timeout_established
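// How long, in seconds, a tracked TCP connection in the ESTABLISHED state is kept in the table without traffic before its entry expires. The default is 432000 (5 days).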
2. Solve the problem by turning off the firewall
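For the first option, it is worth checking how full the table actually is before raising nf_conntrack_max. The sketch below assumes a Linux host with the nf_conntrack module loaded, where the current entry count and the limit are exposed as files under /proc/sys/net/netfilter. The helper names `read_int` and `usage_percent` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the conntrack sysctl files on a Linux host.
NF_DIR = Path("/proc/sys/net/netfilter")


def read_int(path: Path) -> Optional[int]:
    """Read an integer from a /proc file, or return None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def usage_percent(count: int, maximum: int) -> float:
    """Return how full the table is, as a percentage of the limit."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * count / maximum


if __name__ == "__main__":
    count = read_int(NF_DIR / "nf_conntrack_count")
    maximum = read_int(NF_DIR / "nf_conntrack_max")
    if count is None or maximum is None:
        print("nf_conntrack does not appear to be loaded")
    else:
        print(f"{count}/{maximum} entries ({usage_percent(count, maximum):.1f}% full)")
```

Run periodically, a check like this shows whether the table sits near its limit all the time or only fills up during traffic peaks, which helps decide between raising the limit and shortening the timeouts.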