1、failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
Cause: the kubelet is using the cgroupfs cgroup driver, while the Docker we installed uses systemd. The two drivers do not match, so the kubelet fails to start.
Check which driver Docker is using:
docker info
...
Cgroup Driver: systemd
...
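The check above can be scripted. The following is a minimal sketch, assuming the `docker info` output contains a `Cgroup Driver:` line as shown; the function name `docker_cgroup_driver` is made up for illustration.

```shell
# Read `docker info` text on stdin and print the cgroup driver name.
# Prints nothing if no "Cgroup Driver" line is present.
docker_cgroup_driver() {
    awk -F': *' '/Cgroup Driver/ { print $2; exit }'
}

# Usage on a host with Docker installed:
#   docker info 2>/dev/null | docker_cgroup_driver
```

The printed value can then be compared with the driver the kubelet is configured to use.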
There are two ways to fix this: change Docker's driver, or change the kubelet's driver.
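Modify Docker:
Edit /etc/docker/daemon.json and add the following, setting the value to the driver the kubelet uses (cgroupfs in the error above):
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}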
Restart Docker:
systemctl restart docker
systemctl status docker
Or:
# vim /lib/systemd/system/docker.service
# Change --exec-opt native.cgroupdriver=systemd to:
# --exec-opt native.cgroupdriver=cgroupfs
# systemctl daemon-reload
# systemctl restart docker.service
# The kubelet now reports a normal status
Modify the kubelet:
Add the following flag to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf so that the kubelet uses the same driver as Docker:
--cgroup-driver=systemd
![](https://s4.51cto.com/images/blog/202007/25/859ccc0f71a9cbdd26d15d86f73c85da.png?x-oss-process=image/watermark,size_16,text_QDUxQ1RP5Y2a5a6i,color_FFFFFF,t_100,g_se,x_10,y_10,shadow_90,type_ZmFuZ3poZW5naGVpdGk=)
# Reload and start
$ systemctl daemon-reload
$ systemctl enable kubelet && systemctl restart kubelet
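To confirm which driver the drop-in file actually requests after editing, a small helper can read the flag back. This is a sketch under the assumption that the flag appears literally as `--cgroup-driver=<name>` somewhere in the file; `kubelet_cgroup_driver` is a hypothetical helper name.

```shell
# Print the value of --cgroup-driver found in a kubelet systemd drop-in.
# $1: path to the drop-in file; prints nothing if the flag is absent.
kubelet_cgroup_driver() {
    sed -n 's/.*--cgroup-driver=\([A-Za-z]*\).*/\1/p' "$1" | head -n 1
}

# Usage:
#   kubelet_cgroup_driver /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
```

If the output differs from Docker's driver, the kubelet will keep failing with the misconfiguration error.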
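2、flannel cannot find the network interface at startup
Cause: the interface name configured in the unit file /etc/systemd/system/flanneld.service is wrong.
Change the following option so that it names an interface that exists on the host: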
- --iface=eth0
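Picking the right value for `--iface` requires knowing the interface names on the host. The sketch below assumes the one-line-per-interface format of `ip -o link show`; the function name `list_ifaces` is made up for illustration.

```shell
# Read `ip -o link show` output on stdin and print one interface name per line.
# A suffix such as "@if4" on veth-style names is stripped.
list_ifaces() {
    awk -F': ' '{ sub(/@.*/, "", $2); print $2 }'
}

# Usage:
#   ip -o link show | list_ifaces
```

On hosts where the NIC is named something like ens33 instead of eth0, the flannel option has to use that name.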
3、Failed to exec into a pod
## Enter the pod with sh
kubectl exec -it nginx-deployment-d55b94fd-xcrtg -- sh
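Cause: a permissions problem caused by the kubelet configuration. The fix here is to edit the kubelet config file on the worker nodes.
Make the change on each node:
vi /opt/kubernetes/cfg/kubelet.config
------------------ append at the end of the file (authentication settings)
authentication:
  anonymous:
    enabled: true
----------------
# Then restart the kubelet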
systemctl restart kubelet
# On the master node, bind a role to the anonymous user with the command below (granting cluster-admin this way is very dangerous)
kubectl create clusterrolebinding system:anonymous --clusterrole=cluster-admin --user=system:anonymous
# Use this narrower binding instead
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user=system:anonymous
4、Force-deleting pods stuck in an abnormal state
1. Force-delete a specific pod
#kubectl delete pods cloudagile-mariadb-0 -n intelligence-data-lab --grace-period=0 --force
2. Delete failed pods across the cluster
#kubectl get pods --field-selector=status.phase=Failed --all-namespaces --no-headers | awk '{ system("kubectl delete pod " $2 " -n " $1) }'
3. Force-delete pods in the Terminating state
#kubectl get pods --all-namespaces | grep Terminating | grep -w "0/1" | awk '{ system("kubectl delete pod " $2 " -n " $1 " --grace-period=0 --force") }'
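Because force deletion cannot be undone, it can help to print the delete commands first and review them before running anything. The following is a sketch, assuming the default column order of `kubectl get pods --all-namespaces` (NAMESPACE, NAME, READY, STATUS, ...); `terminating_delete_cmds` is a hypothetical helper name.

```shell
# Read `kubectl get pods --all-namespaces --no-headers` rows on stdin and
# print (not run) one force-delete command per pod whose STATUS is Terminating.
terminating_delete_cmds() {
    awk '$4 == "Terminating" {
        printf "kubectl delete pod %s -n %s --grace-period=0 --force\n", $2, $1
    }'
}

# Usage (review the output, then pipe it to sh to execute):
#   kubectl get pods --all-namespaces --no-headers | terminating_delete_cmds
```

Matching on the STATUS column avoids accidentally selecting a pod whose name merely contains the word Terminating.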
5、kubectl: Error from server: error dialing backend: remote error: tls: internal error
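Cause: kubectl logs reported a TLS error, and the kubelet log showed the error above. Running kubectl get csr then showed many certificate signing requests in the Pending state.
Solution: approve a single request by name, or approve every pending request at once
kubectl certificate approve <csr-name>
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve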
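A slightly stricter variant of the filter above matches only the condition column, so a request whose name happens to contain "Pending" is not picked up. This is a sketch, assuming the condition is the last column of the `kubectl get csr` output; `pending_csrs` is a made-up name.

```shell
# Read `kubectl get csr` output (with header) on stdin and print the names
# of requests whose last column is exactly "Pending".
pending_csrs() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage (-r is a GNU xargs option that skips the command on empty input):
#   kubectl get csr | pending_csrs | xargs -r kubectl certificate approve
```

Approving requests in bulk is convenient, but each request should come from a node that is expected to join the cluster.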
6、ingress-nginx fails on creation with "Failed to list *v1beta1.Ingress: ingresses.networking.k8s.io is forbidden"
**Error description**
Failed to list *v1beta1.Ingress: ingresses.networking.k8s.io is forbidden: User "system:serviceaccount:ingress-nginx:nginx-ingress-serviceaccount" cannot list resource "ingresses" in API group "networking.k8s.io" at the cluster scope
The Pod nginx-ingress-controller-xxx stays in CrashLoopBackOff.
Solution:
Environment: ingress-nginx 0.25.0
Edit the ingress-nginx mandatory.yaml, add the rule below to the rules of the ClusterRole, then run kubectl apply -f on the file again.
  - apiGroups:
      - "extensions"
      - "networking.k8s.io"
    resources:
      - ingresses
    verbs:
      - list
      - watch
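After re-applying the manifest, the permission can be checked without waiting for the controller to restart. The sketch below uses `kubectl auth can-i` with impersonation; it assumes the current user is allowed to impersonate service accounts, and the helper names `sa_user` and `can_list_ingresses` are made up for illustration.

```shell
# Build the RBAC user name of a service account.
# $1: namespace, $2: service account name
sa_user() {
    printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether that account may list Ingresses in all
# namespaces. Needs kubectl and a reachable cluster; prints "yes" or "no".
can_list_ingresses() {
    kubectl auth can-i list ingresses.networking.k8s.io \
        --all-namespaces --as="$(sa_user "$1" "$2")"
}

# Usage:
#   can_list_ingresses ingress-nginx nginx-ingress-serviceaccount
```

An answer of "no" after the apply suggests the rule was added to the wrong role or the ClusterRoleBinding does not reference this service account.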
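7、Deployment.spec.selector.matchLabels
Description: if spec.selector.matchLabels is left out, creating the Deployment fails immediately with an error that the required field selector is missing.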
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      app: my-nginx-add
  replicas: 2
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.14