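OS: Ubuntu 18.04.2
K8s version: 1.13.4
Symptom: after installing KubeDNS, Pods cannot ping external domain names, while external IPs and in-cluster domain names and IPs are all reachable.
Root cause analysis: check resolv.conf inside the Pod: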
kubectl exec busybox -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
10.96.0.10 is the cluster IP of KubeDNS. KubeDNS resolves in-cluster names itself and forwards external names to its upstream DNS server.
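To reproduce the symptom in a repeatable way, an in-cluster name and an external name can be resolved from inside the Pod and compared. The following is a minimal sketch: the function name check_pod_dns and the test names kubernetes.default and example.com are assumptions for illustration, and it assumes the Pod image ships nslookup, as busybox does.

```shell
#!/bin/sh
# Hypothetical helper: resolve a name from inside a Pod and report the result.
# Usage: check_pod_dns <pod> <name>
check_pod_dns() {
    pod=$1
    name=$2
    # kubectl exec passes through the exit status of the command it runs.
    if kubectl exec "$pod" -- nslookup "$name" >/dev/null 2>&1; then
        echo "OK $name"
    else
        echo "FAIL $name"
    fi
}

# Example (run against a live cluster):
#   check_pod_dns busybox kubernetes.default   # in-cluster name
#   check_pod_dns busybox example.com          # external name
```

With the fault described here, the in-cluster name would be expected to succeed and the external name to fail.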
Check resolv.conf inside the KubeDNS Pod:
kubectl -n kube-system exec kube-dns-57f56f74cb-s86k7 -- cat /etc/resolv.conf
Defaulting container name to kubedns.
Use 'kubectl describe pod/kube-dns-57f56f74cb-s86k7 -n kube-system' to see all of the containers in this pod.
nameserver 127.0.0.53
options edns0
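The cause is now clear. When the KubeDNS Pod is created, the contents of the host's /etc/resolv.conf are copied into the same file in the Pod. If the host's /etc/resolv.conf is not configured correctly, the Pod cannot resolve external domain names. Here the only nameserver is 127.0.0.53, a loopback address that, inside the Pod's own network namespace, does not reach the host's resolver.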
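The condition described above can be checked mechanically: if every nameserver in a resolv.conf is a loopback address, a Pod that inherits the file has no usable upstream. The following is a sketch under that assumption; the function name only_loopback_nameservers is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the given resolv.conf has at least
# one nameserver and every nameserver is a loopback address (127.x.x.x or ::1).
# Usage: only_loopback_nameservers <path-to-resolv.conf>
only_loopback_nameservers() {
    awk '
        $1 == "nameserver" {
            total++
            if ($2 ~ /^127\./ || $2 == "::1") loopback++
        }
        END { exit !(total > 0 && total == loopback) }
    ' "$1"
}

# Example (on the node):
#   if only_loopback_nameservers /etc/resolv.conf; then
#       echo "Pods inheriting this file cannot reach an upstream DNS server"
#   fi
```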
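Ubuntu 18.04 no longer uses /etc/resolv.conf as the place to configure name resolution. DNS servers can be set in /etc/netplan/xx.yaml, and /etc/resolv.conf is kept only for compatibility. Viewing the file with cat /etc/resolv.conf shows: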
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "systemd-resolve --status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.
nameserver 127.0.0.53
options edns0
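The comments show that /etc/resolv.conf is managed by the systemd-resolved service and should not be edited by hand, because manual changes get overwritten. Running ls on the file also shows that /etc/resolv.conf is nothing more than a symlink.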
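The symlink check can also be scripted. The following is a small sketch; the function name describe_resolv_conf is an assumption, and readlink -f relies on GNU coreutils, which Ubuntu ships.

```shell
#!/bin/sh
# Hypothetical helper: report whether a resolv.conf path is a symlink and,
# if so, which file it ultimately resolves to.
# Usage: describe_resolv_conf [path]   (defaults to /etc/resolv.conf)
describe_resolv_conf() {
    path=${1:-/etc/resolv.conf}
    if [ -L "$path" ]; then
        echo "symlink -> $(readlink -f "$path")"
    elif [ -f "$path" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}

# Example:
#   describe_resolv_conf
```

On a stock Ubuntu 18.04 install the link typically resolves to a file under /run/systemd/resolve/.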
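Fix suggested online: change the DNS entry in /etc/systemd/resolved.conf and then restart the systemd-resolved service. In testing, this had no effect.
Final fix: delete the symlink, then create the file by hand (as root):
rm -f /etc/resolv.conf
cat > /etc/resolv.conf <<EOF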
nameserver 114.114.114.114
nameserver 114.114.115.115
EOF
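A slightly more cautious variant of the same step keeps the old symlink as a backup so the change can be reverted. This is only a sketch: the function name write_static_resolv_conf and the .bak suffix are assumptions, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: replace a resolv.conf (symlink or file) with a static
# file listing the given nameservers, keeping a backup of the old entry.
# Usage: write_static_resolv_conf <path> <nameserver>...
write_static_resolv_conf() {
    path=$1
    shift
    # mv renames the symlink itself, not its target, so the link is preserved.
    if [ -e "$path" ] || [ -L "$path" ]; then
        mv "$path" "$path.bak"
    fi
    for ns in "$@"; do
        echo "nameserver $ns"
    done > "$path"
}

# Example (as root):
#   write_static_resolv_conf /etc/resolv.conf 114.114.114.114 114.114.115.115
```

To revert, move the .bak entry back over the static file.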
Side effect: the error "unable to resolve host xxx" appears. Fix: edit /etc/hosts and add your hostname to the 127.0.0.1 entry.
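The /etc/hosts change can be made idempotent so that running it twice does not add duplicates. The sketch below appends a separate "127.0.0.1 <hostname>" line, which has the same effect as adding the name to the existing 127.0.0.1 line; the function name ensure_hosts_entry is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: make sure a hostname resolves locally by appending a
# "127.0.0.1 <hostname>" line to a hosts file, unless the name is already listed.
# Usage: ensure_hosts_entry <hostname> [hosts-file]
ensure_hosts_entry() {
    name=$1
    hosts=${2:-/etc/hosts}
    # Skip if any non-comment line already lists the hostname as a field.
    if awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts"; then
        return 0
    fi
    echo "127.0.0.1 $name" >> "$hosts"
}

# Example (as root):
#   ensure_hosts_entry "$(hostname)"
```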
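Delete the Pod and recreate it, and the problem is resolved.
Note: if KubeDNS or CoreDNS was created before /etc/resolv.conf was modified, its Pods must also be deleted and recreated.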
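Recreating the DNS Pods can be done by deleting them and letting their controller start replacements. The sketch below assumes the DNS Pods live in kube-system and carry the label k8s-app=kube-dns, which is the usual label for both KubeDNS and CoreDNS deployments; check the labels in your cluster first. The function name restart_cluster_dns is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: delete the cluster DNS Pods so that their controller
# recreates them and they pick up the node's current /etc/resolv.conf.
# Assumes the Pods are labeled k8s-app=kube-dns in the kube-system namespace.
restart_cluster_dns() {
    kubectl -n kube-system delete pod -l k8s-app=kube-dns
}

# Example:
#   kubectl -n kube-system get pod -l k8s-app=kube-dns   # confirm the label matches
#   restart_cluster_dns
```

After the new Pods are running, the resolv.conf check on the KubeDNS Pod shown earlier should list the new nameservers.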