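Limiting factors for one million concurrent long-lived connections
(1) CPU: run top and press 1 to show per-CPU usage. If any logical CPU is at 100%, the CPU is the bottleneck. Using multiple threads or pinning threads to CPUs both help;
(2) Memory: this article mainly discusses the memory limit.
For long-lived connections, listening and connecting consume almost no memory. Memory is mainly consumed by the read and write buffers of the sliding window.
The code in Appendix 2 shows how much memory one TCP connection takes by default: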
[root@localhost test]# ./gtop
recv_buf = 85k
send_buf = 16k
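So by default one TCP connection can take up to 85k + 16k = 101KB of memory. With 32GB of RAM and 70% memory utilization, the machine should be able to support this many stable long-lived connections:
32GB * 70% / 101KB = 232,555 ≈ 230,000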
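The arithmetic above can be written as a small helper. This is a minimal sketch that counts only the per-connection socket buffers, as the estimate in the text does; the function name estimate_max_connections and its parameters are illustrative and not part of the original code.

```cpp
#include <cassert>
#include <cstdint>

// Estimate how many long-lived connections fit in memory, counting only the
// per-connection socket buffers (the same simplification as the text).
// usable_percent is the share of RAM reserved for connections, e.g. 70.
std::uint64_t estimate_max_connections(std::uint64_t total_mem_gb,
                                       unsigned usable_percent,
                                       std::uint64_t per_conn_kb) {
    assert(per_conn_kb > 0 && usable_percent <= 100);
    const std::uint64_t total_kb = total_mem_gb * 1024 * 1024;  // GB -> KB
    return total_kb * usable_percent / 100 / per_conn_kb;
}
```

Calling estimate_max_connections(32, 70, 101) returns 232555, which matches the figure above.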
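We can adjust these buffer sizes to suit our own workload.
For example, in our scenario we establish long-lived connections to keep sessions alive. Each message is about 1KB and is sent once per minute, so send and receive buffers this large are clearly unnecessary.
The sizes can be set in the program (recommended):
// C++ code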
int nRecvBuf=16*1024;
setsockopt(s,SOL_SOCKET,SO_RCVBUF,(const char*)&nRecvBuf,sizeof(int));
int nSendBuf=32*1024;
setsockopt(s,SOL_SOCKET,SO_SNDBUF,(const char*)&nSendBuf,sizeof(int));
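To confirm that the calls above took effect, the values can be read back with getsockopt. The sketch below assumes Linux, where the kernel doubles the requested size for bookkeeping overhead and caps the request at net.core.rmem_max / net.core.wmem_max, so the value read back is usually not the value passed in. The names BufSizes and set_and_read_buffers are illustrative.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <unistd.h>

// Buffer sizes as reported by the kernel after setting them.
struct BufSizes {
    int recv_bytes;
    int send_bytes;
};

// Set SO_RCVBUF / SO_SNDBUF on fd, then read back what the kernel applied.
// Returns false if any setsockopt/getsockopt call fails.
bool set_and_read_buffers(int fd, int recv_req, int send_req, BufSizes* out) {
    assert(out != nullptr);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &recv_req, sizeof(recv_req)) != 0)
        return false;
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &send_req, sizeof(send_req)) != 0)
        return false;

    socklen_t len = sizeof(out->recv_bytes);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &out->recv_bytes, &len) != 0)
        return false;
    len = sizeof(out->send_bytes);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &out->send_bytes, &len) != 0)
        return false;
    return true;
}
```

For a listening server, one option is to set the sizes on the listening socket before listen(), since accepted sockets normally inherit them; that is a design choice to verify in your own environment rather than something the original text states.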
Or set them system-wide in /etc/sysctl.conf (less recommended):
#/etc/sysctl.conf
net.ipv4.tcp_rmem = 4096 87380 2063281
net.ipv4.tcp_wmem = 4096 16384 2063281
net.core.wmem_default = 388608
net.core.rmem_default = 388608
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
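The values that actually take effect are the ones in the positions of 87380 and 16384, that is, the middle (default) fields of net.ipv4.tcp_rmem and net.ipv4.tcp_wmem. After editing, apply the settings with sysctl -p: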
[root@localhost test]# sysctl -p
After applying them, run gtop again to check whether the new values have taken effect.
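Besides gtop, the applied values can also be read directly from /proc. The sketch below parses the three fields (min, default, max) of a tcp_rmem or tcp_wmem line; the struct and function names are illustrative, and the /proc path is the standard Linux location for these settings.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// The three fields of net.ipv4.tcp_rmem / net.ipv4.tcp_wmem, in bytes.
struct TcpMemTriple {
    long min_bytes = 0;
    long default_bytes = 0;  // the middle field, used for new sockets
    long max_bytes = 0;
};

// Parse a line such as "4096 87380 2063281" (spaces or tabs between fields).
bool parse_tcp_mem(const std::string& line, TcpMemTriple* out) {
    assert(out != nullptr);
    std::istringstream in(line);
    TcpMemTriple t;
    if (!(in >> t.min_bytes >> t.default_bytes >> t.max_bytes)) return false;
    *out = t;
    return true;
}

// Read and parse a file such as /proc/sys/net/ipv4/tcp_rmem.
bool read_tcp_mem(const char* path, TcpMemTriple* out) {
    std::ifstream f(path);
    std::string line;
    if (!f || !std::getline(f, line)) return false;
    return parse_tcp_mem(line, out);
}
```

For example, read_tcp_mem("/proc/sys/net/ipv4/tcp_rmem", &t) fills t.default_bytes with the current default receive buffer size.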
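Appendix 1: Error messages
Before the memory settings were tuned, during a stress test with 300,000 connections, free memory kept shrinking little by little, yet ps, top, and similar tools showed no process or thread using much memory.
When free memory dropped below 600MB, the system went down (not a true hard hang, but neither ssh nor the USB serial console could connect), with the following errors: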
CentOS Linux 7 (Core)
Kernel 3.10.0-514.el7.x86_64 on an x86_64
loadtest3 login: [16221.310569] Out of memory: Kill process 1183 (tuned) score or sacrifice child
[16221.340603] Killed process 1183 (tuned) total-vm:562728kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[16221.357102] Out of memory: Kill process 926 (polkitd) score or sacrifice child
[16221.357130] Killed process 926 (polkitd) total-vm:538128kB, anon-rss:0kB, file-rss:0kB, shmem-rss:
[16221.359984] Out of memory: Kill process 932 (gmain) score or sacrifice child
[16221.360826] Killed process 932 (gmain) total-vm:538428kB, anon-rss:0kB, file-rss:0kB, shmem-rss:
[16221.362739] Out of memory: Kill process 33234 (dstat) score 0 or sacrifice child
[16221.362766] Killed process 33234 (dstat) total-vm:150232kB, anon-rss:0kB, file-rss:0kB, shmem-rss
[16230.318655] Out of memory: Kill process 937 (NetworkManager) score or sacrifice child
[16230.310689] Killed process 937 (NetworkManager) total-vm: , anon-rss:0kB, file-rss: , shm
[16230.311222] Out of memory: Kill process 948 (gdbus) score or sacrifice child
[16230.311250] Killed process 948 (gdbus) total-vm:450644kB, anon-rss: , file-rss:4kB, shmem-rss:0kB
[16448.675291] INFO: task /::31 blocked for more than 120 seconds
[16448.675316] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16448.675401] INFO: task fsnotify_mark:189 blocked for more than 120 seconds
[16448.675426] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16448.675494] INFO: task kworker/u96:1:209 blocked for more than 120 seconds
[16448.675518] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16448.675588] INFO: task gdbus:948 blocked for more than 120 seconds
[16448.675611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16448.675789] INFO: task kworker/4:2:16027 blocked for more than 120 seconds
[16448.675734] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16568.679395] INFO: task kworker/4:0: blocked for more than 120 seconds
[16568.679424] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16568.679522] INFO: task fsnotify_mark:189 blocked for more than 120 seconds
[16568.679553] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16568.679639] INFO: task kworker/u96:1:209 blocked for more than 120 seconds
[16568.679669] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16568.679760] INFO: task gdbus:948 blocked for more than 120 seconds
[16568.679788] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
[16568.679893] INFO: task kworker/4:2:16027 blocked for more than 120 seconds
[16568.679923] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
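The observation that ps and top showed no memory-hungry process is consistent with socket buffers being kernel memory, which is not attributed to any process. On Linux, /proc/net/sockstat reports TCP buffer usage in the mem field of its TCP: line, counted in pages. The sketch below extracts that field; the function name is illustrative, and whether socket buffers were the actual cause in this test is an assumption, not something the log proves.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Extract the "mem" value (in pages) from the "TCP:" line of text in the
// format of /proc/net/sockstat, e.g.
//   TCP: inuse 5 orphan 0 tw 0 alloc 8 mem 3
// Returns false if no such line or field is found.
bool parse_tcp_mem_pages(const std::string& sockstat_text, long* pages) {
    assert(pages != nullptr);
    std::istringstream in(sockstat_text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 4, "TCP:") != 0) continue;  // not the TCP line
        std::istringstream fields(line.substr(4));
        std::string key;
        long value = 0;
        while (fields >> key >> value) {  // fields come as "name number" pairs
            if (key == "mem") {
                *pages = value;
                return true;
            }
        }
    }
    return false;
}
```

Multiplying the result by the page size (sysconf(_SC_PAGESIZE), commonly 4096 bytes) gives the TCP buffer memory in bytes, which can be sampled during a load test alongside free memory.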
Appendix 2: Code for querying the socket buffer options
Build command: g++ main.cpp -o gtop
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int main()
{
    // Create a TCP socket; a new socket carries the system default buffer sizes.
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0) {
        perror("socket");
        return 1;
    }

    int opt_rc_val = 0;
    socklen_t opt_rc_len = sizeof(opt_rc_val);
    int opt_sd_val = 0;
    socklen_t opt_sd_len = sizeof(opt_sd_val);

    // Query the receive and send buffer sizes (in bytes).
    if (getsockopt(listenfd, SOL_SOCKET, SO_RCVBUF, &opt_rc_val, &opt_rc_len) != 0 ||
        getsockopt(listenfd, SOL_SOCKET, SO_SNDBUF, &opt_sd_val, &opt_sd_len) != 0) {
        perror("getsockopt");
        close(listenfd);
        return 1;
    }

    printf("recv_buf = %dk\n", opt_rc_val / 1024);
    printf("send_buf = %dk\n", opt_sd_val / 1024);

    close(listenfd);
    return 0;
}
Appendix 3: Reference URL
http://www.net-add.com/devops/sre/cdn/28.html