I. Problem Background
- Current cluster Kubernetes version: v1.17.4
- Target Kubernetes version: v1.19.3
- During the upgrade, one Pod got stuck in the ContainerCreating state:

    api-9flnb   0/1   ContainerCreating   0   4d19h
    api-bb8th   1/1   Running             0   4d20h
    api-zwtpp   1/1   Running             0   4d20h
II. Problem Analysis
- Describing the Pod shows the error "hostPath type check failed: /var/run/docker.sock is not a file":

    Events:
      Type     Reason       Age                       From     Message
      ----     ------       ----                      ----     -------
      Warning  FailedMount  11m (x3543 over 4d18h)    kubelet  (combined from similar events): Unable to attach or mount volumes: unmounted volumes=[docker-socket], unattached volumes=[xxxx-service-account-token-rjqz7 nginx-certs host-timezone docker-socket helm-home etc-pki kubernetes-root-ca-file root-ca-file bcmt-home etcd-client-certs]: timed out waiting for the condition
      Warning  FailedMount  2m39s (x2889 over 4d19h)  kubelet  MountVolume.SetUp failed for volume "docker-socket" : hostPath type check failed: /var/run/docker.sock is not a file

- The Pod's declaration of the volume "docker-socket" has Path /var/run/docker.sock and HostPathType File:

    Volumes:
      etcd-client-certs:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  etcd-client-certs
        Optional:    false
      nginx-certs:
        Type:          HostPath (bare host directory volume)
        Path:          /opt/bcmt/config/bcmt-api/certs
        HostPathType:  Directory
      docker-socket:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:  File
- Review what changed in the Kubernetes hostPath type-checking code between v1.17.4 and v1.19.3.
First, the error is raised by the function checkTypeInternal(), defined in host_path.go. It checks whether the declared HostPathType matches the actual type of the path and returns an error if the two differ.

    func checkTypeInternal(ftc hostPathTypeChecker, pathType *v1.HostPathType) error {
        switch *pathType {
        case v1.HostPathDirectoryOrCreate:
            if !ftc.Exists() {
                return ftc.MakeDir()
            }
            fallthrough
        case v1.HostPathDirectory:
            if !ftc.IsDir() {
                return fmt.Errorf("hostPath type check failed: %s is not a directory", ftc.GetPath())
            }
        case v1.HostPathFileOrCreate:
            if !ftc.Exists() {
                return ftc.MakeFile()
            }
            fallthrough
        case v1.HostPathFile:
            if !ftc.IsFile() {
                return fmt.Errorf("hostPath type check failed: %s is not a file", ftc.GetPath())
            }
        case v1.HostPathSocket:
            if !ftc.IsSocket() {
                return fmt.Errorf("hostPath type check failed: %s is not a socket file", ftc.GetPath())
            }
        case v1.HostPathCharDev:
            if !ftc.IsChar() {
                return fmt.Errorf("hostPath type check failed: %s is not a character device", ftc.GetPath())
            }
        case v1.HostPathBlockDev:
            if !ftc.IsBlock() {
                return fmt.Errorf("hostPath type check failed: %s is not a block device", ftc.GetPath())
            }
        default:
            return fmt.Errorf("%s is an invalid volume type", *pathType)
        }
        return nil
    }
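Next, compare this with our Pod definition: the HostPathType is File, but the actual path /var/run/docker.sock is a socket, so the error is correct. The open question is why v1.17.4 did not report it (we verified that v1.18.x does not report it either) and the error only appears in v1.19.3.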
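To confirm what a host path really is before choosing a HostPathType, a small check on the node can help. The following is a minimal sketch using only the Go standard library; hostPathKind is a hypothetical helper written for this note, not kubelet code, and the strings it returns are simply the HostPathType names that correspond to each file mode.

```go
package main

import (
	"fmt"
	"os"
)

// hostPathKind is a hypothetical helper (not part of Kubernetes). It reports
// which HostPathType name matches the object found at path.
func hostPathKind(path string) (string, error) {
	info, err := os.Stat(path) // os.Stat follows symlinks
	if err != nil {
		return "", err
	}
	mode := info.Mode()
	switch {
	case mode.IsDir():
		return "Directory", nil
	case mode&os.ModeSocket != 0:
		return "Socket", nil
	case mode&os.ModeCharDevice != 0:
		// Character devices also carry ModeDevice, so test this bit first.
		return "CharDevice", nil
	case mode&os.ModeDevice != 0:
		return "BlockDevice", nil
	case mode.IsRegular():
		return "File", nil
	default:
		return "", fmt.Errorf("%s: unsupported file mode %v", path, mode)
	}
}

func main() {
	// Defaults to the path from this note; pass another path as the first argument.
	path := "/var/run/docker.sock"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	kind, err := hostPathKind(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s: %s\n", path, kind)
}
```

Run on a node where the Docker socket exists, this should report Socket for /var/run/docker.sock, which is the type the volume ought to declare.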
- Check whether the code of checkTypeInternal() changed. Result: no change.
- Check whether the hostPathTypeChecker implementation passed in as the argument changed. Result: it did.
- In v1.17.x and v1.18.x, IsFile() is defined as follows:

    func (ftc *fileTypeChecker) IsFile() bool {
        if !ftc.Exists() {
            return false
        }
        return !ftc.IsDir()
    }

- Starting with v1.19.x, IsFile() is updated as follows:

    func (ftc *fileTypeChecker) IsFile() bool {
        if !ftc.Exists() {
            return false
        }
        pathType, err := ftc.hu.GetFileType(ftc.path)
        if err != nil {
            return false
        }
        return string(pathType) == string(v1.HostPathFile)
    }
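The difference between the two versions can be reproduced outside the kubelet. The sketch below is an approximation using only the Go standard library: isFileLegacy and isFileStrict are hypothetical stand-ins for the old and new IsFile() logic (the strict variant assumes that "is a File" means a regular file), and checkSocket runs both against a temporary Unix socket.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// isFileLegacy mimics the pre-v1.19 logic: any existing path that is not a
// directory counts as a file.
func isFileLegacy(path string) bool {
	info, err := os.Stat(path)
	if err != nil {
		return false
	}
	return !info.IsDir()
}

// isFileStrict approximates the v1.19 logic: only a regular file counts.
func isFileStrict(path string) bool {
	info, err := os.Stat(path)
	if err != nil {
		return false
	}
	return info.Mode().IsRegular()
}

// checkSocket creates a temporary Unix socket and returns the result of the
// legacy check and the strict check against it.
func checkSocket() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "hostpath-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	sock := filepath.Join(dir, "demo.sock")
	ln, err := net.Listen("unix", sock)
	if err != nil {
		return false, false, err
	}
	defer ln.Close()

	return isFileLegacy(sock), isFileStrict(sock), nil
}

func main() {
	legacy, strict, err := checkSocket()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("legacy IsFile: %v, strict IsFile: %v\n", legacy, strict)
}
```

On a Unix-like system the legacy check accepts the socket and the strict check rejects it, which matches the behavior change seen between v1.18.x and v1.19.x.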
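III. Conclusion
- Starting with v1.19.x, Kubernetes fixed a bug in IsFile(): the old check was incomplete and accepted any existing path that was not a directory, including sockets.
- Our Pod declared the wrong HostPathType for the mounted .sock file (it should be Socket, not File). Before v1.19.x the Kubernetes bug happened to mask this mistake, so nothing failed. Once v1.19.x fixed the bug, the volume mount started to fail.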
IV. Solution: set the HostPathType of .sock files to Socket (in the Pod manifest this is the hostPath.type field)

    Volumes:
      docker-socket:
        Type:          HostPath (bare host directory volume)
        Path:          /var/run/docker.sock
        HostPathType:  Socket