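Kubernetes defines a network model but leaves its implementation to network plugins. The main job of a CNI network plugin is to let Pods on different hosts reach each other.
Installing flannel:
Run the following on both nodes, hdss7-21 and hdss7-22: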
wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz
mkdir /opt/flannel-v0.11.0
tar xf flannel-v0.11.0-linux-amd64.tar.gz -C /opt/flannel-v0.11.0/
ln -s /opt/flannel-v0.11.0/ /opt/flannel
mkdir /opt/flannel/cert
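Copy the certificates into /opt/flannel/cert (the startup script below also reads client-key.pem, so copy it as well):
cert]# scp dc2-user@hdss7-200:/opt/certs/ca.pem .
cert]# scp dc2-user@hdss7-200:/opt/certs/client.pem .
cert]# scp dc2-user@hdss7-200:/opt/certs/client-key.pem .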
Edit the configuration file and the startup script:
[root@hdss7-21 flannel]# cat subnet.env
FLANNEL_NETWORK=172.7.0.0/16
FLANNEL_SUBNET=172.7.21.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
[root@hdss7-21 flannel]# cat flanneld.sh
#!/bin/sh
./flanneld \
  --public-ip=10.4.7.21 \
  --etcd-endpoints=https://10.4.7.12:2379,https://10.4.7.21:2379,https://10.4.7.22:2379 \
  --etcd-keyfile=./cert/client-key.pem \
  --etcd-certfile=./cert/client.pem \
  --etcd-cafile=./cert/ca.pem \
  --iface=eth0 \
  --subnet-file=./subnet.env \
  --healthz-port=2401
On hdss7-22, use FLANNEL_SUBNET=172.7.22.1/24 and --public-ip=10.4.7.22 instead. The script must be executable for supervisord to run it:
[root@hdss7-21 flannel]# chmod +x flanneld.sh
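Because subnet.env differs on each node, it can be generated instead of typed by hand. The following is only a minimal sketch: the function name gen_subnet_env is hypothetical, and it assumes the convention visible in this setup, where host 10.4.7.N owns the Pod subnet 172.7.N.0/24.

```shell
#!/bin/sh
# Hypothetical helper: print subnet.env content for a given host IP.
# Assumption: host 10.4.7.N owns Pod subnet 172.7.N.0/24, as in this setup.
gen_subnet_env() {
    host_ip=$1
    last_octet=${host_ip##*.}   # strip everything up to the last dot: 10.4.7.22 -> 22
    cat <<EOF
FLANNEL_NETWORK=172.7.0.0/16
FLANNEL_SUBNET=172.7.${last_octet}.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false
EOF
}

# Print the file content for hdss7-22.
gen_subnet_env 10.4.7.22
```

Redirecting the output to /opt/flannel/subnet.env on the matching node would produce the same file as shown above, with only FLANNEL_SUBNET changed.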
Create the log directory:
mkdir -p /data/logs/flanneld
Create the configuration in etcd to declare which network model (backend) flannel uses:
etcd]# ./etcdctl set /coreos.com/network/config '{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}'
etcd]# ./etcdctl get /coreos.com/network/config
{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}
Create the supervisor program file and reload the configuration:
[root@hdss7-21 ~]# cat /etc/supervisord.d/flannel.ini
[program:flanneld-7-21]
command=/opt/flannel/flanneld.sh                             ; the program (relative uses PATH, can take args)
numprocs=1                                                   ; number of processes copies to start (def 1)
directory=/opt/flannel                                       ; directory to cwd to before exec (def no cwd)
autostart=true                                               ; start at supervisord start (default: true)
autorestart=true                                             ; restart at unexpected quit (default: true)
startsecs=30                                                 ; number of secs prog must stay running (def. 1)
startretries=3                                               ; max # of serial start failures (default 3)
exitcodes=0,2                                                ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT                                              ; signal used to kill process (default TERM)
stopwaitsecs=10                                              ; max num secs to wait b4 SIGKILL (default 10)
user=root                                                    ; setuid to this UNIX account to run the program
redirect_stderr=true                                         ; redirect proc stderr to stdout (default false)
stdout_logfile=/data/logs/flanneld/flanneld.stdout.log       ; stdout log path, NONE for none; default AUTO
stdout_logfile_maxbytes=64MB                                 ; max # logfile bytes b4 rotation (default 50MB)
stdout_logfile_backups=4                                     ; # of stdout logfile backups (default 10)
stdout_capture_maxbytes=1MB                                  ; number of bytes in 'capturemode' (default 0)
stdout_events_enabled=false                                  ; emit events on stdout writes (default false)
[root@hdss7-22 cert]# supervisorctl update
flanneld-7-22: added process group
[root@hdss7-22 cert]# supervisorctl status
etcd-server-7-22                 RUNNING   pid 3093, uptime 13:12:04
flanneld-7-22                    RUNNING   pid 379, uptime 0:00:36
kube-apiserver-7-22              RUNNING   pid 3090, uptime 13:12:04
kube-controller-manager-7-22     RUNNING   pid 3092, uptime 13:12:04
kube-kubelet-7-22                RUNNING   pid 3089, uptime 13:12:04
kube-proxy-7-22                  RUNNING   pid 3091, uptime 13:12:04
kube-scheduler-7-22              RUNNING   pid 3095, uptime 13:12:04
At this point you can ping Pod addresses across nodes:
[root@hdss7-21 ~]# kubectl get pod -o wide
NAME             READY   STATUS    RESTARTS   AGE    IP           NODE                NOMINATED NODE   READINESS GATES
nginx-ds-rxfqd   1/1     Running   0          129m   172.7.22.2   hdss7-22.host.com   <none>           <none>
nginx-ds-xm5l2   1/1     Running   0          151m   172.7.21.2   hdss7-21.host.com   <none>           <none>
[root@hdss7-21 ~]# ping 172.7.22.2
PING 172.7.22.2 (172.7.22.2) 56(84) bytes of data.
64 bytes from 172.7.22.2: icmp_seq=1 ttl=63 time=2.39 ms
64 bytes from 172.7.22.2: icmp_seq=2 ttl=63 time=1.19 ms
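The routing table shows that flannel has added a static route for us. In the host-gw network model this is all flannel does on each host, which is why it is so efficient. However, host-gw only works when the hosts are on the same layer-2 network (that is, their default gateways point to the same address). To connect hosts that sit on different layer-2 networks, you need the VXLAN network model instead: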
[root@hdss7-21 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.4.0.1        0.0.0.0         UG    0      0        0 eth0
10.4.0.0        0.0.0.0         255.255.0.0     U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
172.7.21.0      0.0.0.0         255.255.255.0   U     0      0        0 docker0
172.7.22.0      10.4.7.22       255.255.255.0   UG    0      0        0 eth0
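To make the effect of host-gw concrete, the routes it maintains can be written out by hand. This is a sketch only, not what flanneld executes internally: the function name host_gw_routes is hypothetical, and it assumes the same addressing convention as above (host 10.4.7.N owns 172.7.N.0/24, reachable through eth0).

```shell
#!/bin/sh
# Hypothetical helper: print the static routes one node needs under host-gw.
# Assumption: host 10.4.7.N owns Pod subnet 172.7.N.0/24, as in this setup.
# Usage: host_gw_routes LOCAL_IP NODE_IP...
host_gw_routes() {
    local_ip=$1
    shift
    for node_ip in "$@"; do
        # The local Pod subnet is reached through docker0, so it needs no route.
        [ "$node_ip" = "$local_ip" ] && continue
        # Every other node's Pod subnet is routed via that node's host IP.
        echo "ip route add 172.7.${node_ip##*.}.0/24 via ${node_ip} dev eth0"
    done
}

# Routes needed on hdss7-21 in a two-node cluster.
host_gw_routes 10.4.7.21 10.4.7.21 10.4.7.22
# prints: ip route add 172.7.22.0/24 via 10.4.7.22 dev eth0
```

The printed command corresponds to the 172.7.22.0 entry in the routing table above. Each host is used directly as the next hop, which is why all hosts must share one layer-2 network.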
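The three flannel network models are configured as follows:
Hosts on the same layer-2 network (host-gw):
'{"Network": "172.7.0.0/16", "Backend": {"Type": "host-gw"}}'
Hosts on different networks (VXLAN):
'{"Network": "172.7.0.0/16", "Backend": {"Type": "vxlan"}}'
Direct routing: flannel checks whether two hosts are on the same layer-2 network, routes directly if they are, and falls back to VXLAN encapsulation otherwise:
'{"Network": "172.7.0.0/16", "Backend": {"Type": "vxlan", "DirectRouting": true}}'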
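When switching between these models it is easy to mistype the JSON. The sketch below builds the three payloads from a short mode name. The function name flannel_net_config and the mode names are hypothetical; the spellings "vxlan" and "DirectRouting" follow flannel's backend documentation.

```shell
#!/bin/sh
# Hypothetical helper: print the flannel network config JSON for one mode.
# Modes (names invented for this sketch): host-gw | vxlan | vxlan-direct
flannel_net_config() {
    network=172.7.0.0/16
    case $1 in
        host-gw)      backend='{"Type": "host-gw"}' ;;
        vxlan)        backend='{"Type": "vxlan"}' ;;
        vxlan-direct) backend='{"Type": "vxlan", "DirectRouting": true}' ;;
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
    printf '{"Network": "%s", "Backend": %s}\n' "$network" "$backend"
}

# Print the payload for VXLAN with direct routing.
flannel_net_config vxlan-direct
```

The output can be passed to the same command used earlier, for example ./etcdctl set /coreos.com/network/config "$(flannel_net_config host-gw)". flanneld reads this key when it starts, so expect to restart it after changing the backend.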
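Optimizing flannel's SNAT rules:
By default, traffic between containers in Pods on different hosts passes through iptables source address translation (SNAT). This causes a problem: even though the hosts are on the same layer-2 network, the receiving Pod sees the sending host's address instead of the Pod's, so it cannot tell which Pod is actually calling it. That makes troubleshooting harder.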
Install iptables-services:
~]# yum install iptables-services -y
[root@hdss7-21 ~]# systemctl start iptables
[root@hdss7-21 ~]# systemctl enable iptables
Created symlink from /etc/systemd/system/basic.target.wants/iptables.service to /usr/lib/systemd/system/iptables.service.
Delete the SNAT rule and re-add it so that traffic destined for the Pod network (172.7.0.0/16) is excluded from masquerading:
[root@hdss7-21 ~]# iptables-save |grep POSTROUTING
:POSTROUTING ACCEPT [72:3710]
:KUBE-POSTROUTING - [0:0]
-A POSTROUTING -s 172.7.21.0/24 ! -o docker0 -j MASQUERADE
~]# iptables -t nat -D POSTROUTING -s 172.7.21.0/24 ! -o docker0 -j MASQUERADE
~]# iptables -t nat -I POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE
~]# iptables-save |grep -i postrouting
:POSTROUTING ACCEPT [32:1647]
:KUBE-POSTROUTING - [0:0]
-A POSTROUTING -s 172.7.21.0/24 ! -d 172.7.0.0/16 ! -o docker0 -j MASQUERADE
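The same change has to be made on every node, each with its own Pod subnet. As a sketch, the two commands can be generated per node and reviewed before they are run. The function name snat_fix_cmds is hypothetical, and it only prints the commands rather than touching iptables.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the SNAT rule rewrite for one node.
# Usage: snat_fix_cmds POD_SUBNET CLUSTER_NETWORK
snat_fix_cmds() {
    pod_subnet=$1
    cluster_net=$2
    # Remove the existing rule, which masquerades all traffic leaving the Pod subnet.
    echo "iptables -t nat -D POSTROUTING -s ${pod_subnet} ! -o docker0 -j MASQUERADE"
    # Re-add it with an exclusion so Pod-to-Pod traffic keeps its source address.
    echo "iptables -t nat -I POSTROUTING -s ${pod_subnet} ! -d ${cluster_net} ! -o docker0 -j MASQUERADE"
}

# Commands for hdss7-22, whose Pod subnet is 172.7.22.0/24.
snat_fix_cmds 172.7.22.0/24 172.7.0.0/16
```

After checking the output, it can be piped to sh as root on the matching node. Traffic leaving the cluster network is still masqueraded, since only destinations inside 172.7.0.0/16 are excluded.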
Delete the REJECT rules from the filter table, then save the iptables rules:
~]# iptables -t filter -D INPUT -j REJECT --reject-with icmp-host-prohibited
~]# iptables -t filter -D FORWARD -j REJECT --reject-with icmp-host-prohibited
~]# iptables-save > /etc/sysconfig/iptables