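First, my environment and configuration: Alibaba Cloud, 1 core / 2 GB, Ubuntu 18.04 (2 cores is better, because the master has a CPU requirement). The node is also 1 core / 2 GB.
OK, let's get started.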
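1. Update the package sources
If the mirror that ships with the system is hosted overseas, downloads will be slow. Open /etc/apt/sources.list
and replace it with a domestic mirror, then refresh the package index.
apt update
2. Upgrade the packages
Upgrade the system's software components to the latest stable versions.
apt upgrade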
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libcurl4
The following packages will be upgraded:
curl libcurl4
2 upgraded, 0 newly installed, 0 to remove and 46 not upgraded.
Need to get 378 kB of archives.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Ign:1 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 curl amd64 7.58.0-2ubuntu3.14
Ign:2 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 libcurl4 amd64 7.58.0-2ubuntu3.14
Err:1 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 curl amd64 7.58.0-2ubuntu3.14
404 Not Found [IP: 100.100.2.148 80]
Err:2 http://mirrors.cloud.aliyuncs.com/ubuntu bionic-updates/main amd64 libcurl4 amd64 7.58.0-2ubuntu3.14
404 Not Found [IP: 100.100.2.148 80]
E: Failed to fetch http://mirrors.cloud.aliyuncs.com/ubuntu/pool/main/c/curl/curl_7.58.0-2ubuntu3.14_amd64.deb 404 Not Found [IP: 100.100.2.148 80]
E: Failed to fetch http://mirrors.cloud.aliyuncs.com/ubuntu/pool/main/c/curl/libcurl4_7.58.0-2ubuntu3.14_amd64.deb 404 Not Found [IP: 100.100.2.148 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
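The 404 errors above are what you get when the package index has not been refreshed, so remember to update it first. The output itself already gives the hint: run apt-get update or try with --fix-missing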
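The mirror swap from step 1 can be sketched as a small helper. This is only a sketch: the function name switch_mirror is made up for illustration, and the default mirror host mirrors.aliyun.com is an assumption, so substitute whichever mirror you actually use.

```shell
#!/bin/sh
# Sketch: point an apt sources file at another mirror, keeping a backup.
# The default mirror host is an assumption; pass your own as the 2nd argument.
switch_mirror() {
    file=$1
    mirror=${2:-mirrors.aliyun.com}
    # Keep the original next to the file so the change is easy to undo.
    cp "$file" "$file.bak"
    # Rewrite the host part of every http(s) .../ubuntu archive URL.
    sed -E "s#https?://[^/ ]+/ubuntu#http://$mirror/ubuntu#g" "$file.bak" > "$file"
}
```

On a real machine this would be run as root, for example switch_mirror /etc/apt/sources.list followed by apt update.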
3. Install Docker
You can also follow a different installation procedure.
apt-get install docker.io
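To start Docker at boot, run the following commands
systemctl enable docker
systemctl start docker
To configure a Docker registry mirror, open /etc/docker/daemon.json
and add or modify registry-mirrors, adding the address https://registry.docker-cn.com
You can also use the mirror addresses provided by Alibaba Cloud, Tencent Cloud and others.
Example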
{
"registry-mirrors": [
"https://registry.docker-cn.com"
]
}
Restart Docker so the configuration takes effect
sudo systemctl daemon-reload
sudo systemctl restart docker
4. Install K8S
Run the following commands to install the https tooling and k8s.
apt-get update && apt-get install -y apt-transport-https curl
apt-get install -y kubelet kubeadm kubectl --allow-unauthenticated
# Common commands
Restart the kubelet service:
systemctl daemon-reload
systemctl restart kubelet
sudo systemctl restart kubelet.service
sudo systemctl daemon-reload
sudo systemctl stop kubelet
sudo systemctl enable kubelet
sudo systemctl start kubelet
Run the command below to check that everything works
kubeadm version
# Example output
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:37:34Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Disable swap
# Turn off the swap partition temporarily
swapoff -a
# Disable the swap partition permanently
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
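The sed command above comments out every line containing " swap " each time it runs, so running it twice stacks a second "#" on the same line, and it misses entries separated by tabs. A minimal sketch of a repeatable variant follows; the function name comment_swap is hypothetical, and it works on whatever file path you give it so it can be tried on a copy of /etc/fstab first.

```shell
#!/bin/sh
# Sketch: comment out swap entries in an fstab-style file, safely repeatable.
comment_swap() {
    file=$1
    cp "$file" "$file.bak"
    # Skip lines that are already comments; on the rest, prefix "#" when the
    # line has "swap" as a whitespace-delimited field (spaces or tabs).
    sed -E '/^[[:space:]]*#/! { /[[:space:]]swap[[:space:]]/ s/^/#/; }' \
        "$file.bak" > "$file"
}
```

For example: cp /etc/fstab /tmp/fstab.test && comment_swap /tmp/fstab.test, then inspect the result before applying it to the real file.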
Let bridged IPv4 and IPv6 traffic pass through iptables:
cat >/etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
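The two net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so on a fresh machine sysctl --system may report them as unknown. A minimal sketch that writes both the module-load file and the sysctl file is shown here; the function name write_bridge_conf and its root-directory argument are illustrative additions, there so the function can be tried in a scratch directory.

```shell
#!/bin/sh
# Sketch: write the module-load and sysctl config files under a given root.
# Pass "" as the root to write to the real /etc (needs root privileges).
write_bridge_conf() {
    root=${1:-}
    mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"
    # Load br_netfilter at boot so the bridge sysctl keys are available.
    printf 'br_netfilter\n' > "$root/etc/modules-load.d/k8s.conf"
    cat > "$root/etc/sysctl.d/k8s.conf" <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
}
# On a real host, as root:
#   write_bridge_conf "" && modprobe br_netfilter && sysctl --system
```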
At this step (the apt-get update that runs before curl is installed) you may hit the following problem
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
Reading package lists... Done
W: GPG error: https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
E: The repository 'https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
Just run the command below (the key is the value after NO_PUBKEY; substitute your own, and run it once for each missing key)
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys FEEA9169307EA071
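Since the error can list more than one NO_PUBKEY value, the key IDs can be pulled out of the apt output instead of being copied by hand. A small sketch, with missing_keys as a made-up helper name:

```shell
#!/bin/sh
# Sketch: read apt output on stdin and print each missing key ID once.
missing_keys() {
    grep -oE 'NO_PUBKEY [0-9A-F]+' | awk '{ print $2 }' | LC_ALL=C sort -u
}
# Possible use on the affected machine (as root):
#   apt-get update 2>&1 | missing_keys |
#       xargs -r -n1 apt-key adv --keyserver keyserver.ubuntu.com --recv-keys
```

With the error text shown above as input, this prints the two IDs 8B57C5C2836F4BEB and FEEA9169307EA071, one per line.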
If you see the following during installation, the k8s packages cannot be found in the system's package sources.
No apt package "kubeadm", but there is a snap with that name.
Try "snap install kubeadm"
No apt package "kubectl", but there is a snap with that name.
Try "snap install kubectl"
No apt package "kubelet", but there is a snap with that name.
Try "snap install kubelet"
Open the /etc/apt/sources.list
file and add this line
deb https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial main
Run the K8s install command again.
If you see
The following signatures couldn't be verified because the public key is not available
then run the command below to add the key for the repository.
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
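Adding the repository line by hand is easy to repeat by accident, which leaves duplicate entries behind. The sketch below appends the same line only when it is missing; add_repo_line is a hypothetical helper name, and writing to a separate file under /etc/apt/sources.list.d/ instead of sources.list is my suggestion, not something the steps above require.

```shell
#!/bin/sh
# Sketch: append the Kubernetes apt repository line to a file only once.
add_repo_line() {
    file=$1
    line='deb https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial main'
    touch "$file"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
# Possible use (as root):
#   add_repo_line /etc/apt/sources.list.d/kubernetes.list && apt-get update
```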
The command above installs kubelet, kubeadm and kubectl: kubelet is the k8s node service, kubectl is the k8s management client, and kubeadm is the deployment tool.
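If this machine is only going to be a node, you can stop here.
To join the other Alibaba Cloud machine to the cluster, just run the command below (where it comes from is explained further down; read to the end, then come back and do this)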
kubeadm join 39.96.46.96:6443 --token 9vbzuf.vtzj1w5vefjlwi0t --discovery-token-ca-cert-hash sha256:b6e6fffb6b0e11d2db374ce21f6d86de3e09e1e13075e1bf01055130c2c5e060
An error you may run into
[kubelet-check] Initial timeout of 40s passed.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
# Fix:
swapoff -a
kubeadm reset
systemctl daemon-reload
systemctl restart kubelet
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Run the join command again; the node joins successfully
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
root@ubuntu:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 26m v1.22.2
node Ready <none> 15s v1.22.2
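5. Initialization
Run the command below to initialize; it automatically downloads the required Docker images from the network.
This command deploys the master (control-plane) node.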
Run kubeadm version
to check the version; the value in GitVersion:"v1.22.2"
is the version number.
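Picking the version out of that long line by eye is error-prone. A small sketch that extracts the GitVersion field from the output format shown earlier follows; git_version is a made-up helper name. kubeadm version -o short prints only the version as well, so the helper is mainly useful when all you have is saved output.

```shell
#!/bin/sh
# Sketch: read "kubeadm version" output on stdin, print only the GitVersion value.
git_version() {
    sed -n 's/.*GitVersion:"\([^"]*\)".*/\1/p'
}
# Possible use:
#   kubeadm version | git_version
```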
Run the following command to initialize (remember to replace the IP with your own)
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=39.96.46.96
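--ignore-preflight-errors=NumCPU
is for machines with only one CPU, for example a 1G1M student server.
However, because the images are hosted by Google, the download may fail.
You can use the kubeadm config images list
command to list the images that need to be pulled, and then pull them manually with Docker. This is somewhat tedious, and the images also have to be renamed by hand.
The pull command is docker pull {image name}
.
Google is not reachable, but the required images have been mirrored on DockerHub.
The mirrorgooglecontainers repository holds copies of the images. Unfortunately, these copies are not always up to date. The google_containers repository on Alibaba Cloud appears to have the most recent copies.
For example, if the following images are needed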
k8s.gcr.io/kube-apiserver:v1.22.2
k8s.gcr.io/kube-controller-manager:v1.22.2
k8s.gcr.io/kube-scheduler:v1.22.2
k8s.gcr.io/kube-proxy:v1.22.2
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns:1.8.4
then pull the corresponding images
docker pull mirrorgooglecontainers/kube-apiserver:v1.22.2
docker pull mirrorgooglecontainers/kube-controller-manager:v1.22.2
docker pull mirrorgooglecontainers/kube-scheduler:v1.22.2
docker pull mirrorgooglecontainers/kube-proxy:v1.22.2
docker pull mirrorgooglecontainers/pause:3.5
docker pull mirrorgooglecontainers/etcd:3.5.0-0
docker pull coredns/coredns:1.8.4
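Use docker tag {old name:tag} {new name:tag}
to rename the images.
Since many things can go wrong here, I'm including a one-shot script written by someone else that does this whole step in one go.
touch pullk8s.sh # create the script file
nano pullk8s.sh # edit the script
Then paste in the following content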
#!/bin/sh
# Pull each image kubeadm needs from the Aliyun mirror, retag it under k8s.gcr.io, then remove the mirror tag.
for i in $(kubeadm config images list); do
    imageName=${i#k8s.gcr.io/}
    docker pull "registry.aliyuncs.com/google_containers/$imageName"
    docker tag "registry.aliyuncs.com/google_containers/$imageName" "k8s.gcr.io/$imageName"
    docker rmi "registry.aliyuncs.com/google_containers/$imageName"
done
Save the file
Ctrl + O
Enter
Ctrl + X
Make the script executable
chmod +x pullk8s.sh
Run the script
sh pullk8s.sh
Then run docker images
to check that all the required images are ready.
root@ubuntu:~# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.22.2 cba2a99699bd 2 weeks ago 116MB
k8s.gcr.io/kube-apiserver v1.22.2 41ef50a5f06a 2 weeks ago 171MB
k8s.gcr.io/kube-controller-manager v1.22.2 da5fd66c4068 2 weeks ago 161MB
k8s.gcr.io/kube-scheduler v1.22.2 f52d4c527ef2 2 weeks ago 94.4MB
k8s.gcr.io/coredns 1.8.4 70f311871ae1 3 months ago 41.6MB
k8s.gcr.io/etcd 3.5.0-0 303ce5db0e90 3 months ago 288MB
k8s.gcr.io/pause 3.5 da86e6ba6ca1 2 years ago 742kB
The script may also fail as shown below; if it does, pull the image manually
Error response from daemon: pull access denied for registry.aliyuncs.com/google_containers/coredns/coredns, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Error response from daemon: No such image: registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.4
Error: No such image: registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.4
docker pull coredns/coredns:1.8.4
# Image rename command format:
docker tag old-image-name new-image-name
Finally, run the initialization command from the start of this step.
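The failure above happens because kubeadm lists the image as k8s.gcr.io/coredns/coredns:v1.8.4, with a nested path, so the script looks for google_containers/coredns/coredns on the mirror. Below is a sketch of a name-mapping helper that keeps only the last path component. It assumes the Alibaba Cloud mirror stores the image as google_containers/coredns under the same tag, which you should verify before relying on it; mirror_name is a made-up function name.

```shell
#!/bin/sh
# Sketch: map an image name printed by "kubeadm config images list"
# to its assumed location on the Aliyun google_containers mirror.
mirror_name() {
    name=${1#k8s.gcr.io/}
    # Drop any nested path such as "coredns/", keeping "coredns:v1.8.4".
    name=${name##*/}
    printf '%s/%s\n' "registry.aliyuncs.com/google_containers" "$name"
}
# Possible use (needs docker and kubeadm):
#   for i in $(kubeadm config images list); do
#       src=$(mirror_name "$i")
#       docker pull "$src" && docker tag "$src" "$i" && docker rmi "$src"
#   done
```

If you pull from DockerHub as shown above instead, the rename for this image would be docker tag coredns/coredns:1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4, matching the name in the kubeadm error message further down.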
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=39.96.46.96
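Because the public IP is not configured on the Alibaba Cloud ECS instance itself, etcd cannot start, so kubeadm fails during initialization with a "timeout" error.
Solution:
1. Open two ssh sessions, i.e. two tabs in your ssh tool: one to run the initialization and one to edit the configuration file while initialization is in progress. Note that this has to happen during initialization. Every run of kubeadm init regenerates the etcd configuration file, so if you edit the file beforehand, kubeadm init overwrites your changes and they have no effect.
2. Run the "kubeadm init …" command above. It will hang at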
Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed
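3. Once the command is entered, kubeadm starts initializing the master node, but because the etcd configuration file is incorrect, etcd cannot start, so the file has to be modified.
The file path is "/etc/kubernetes/manifests/etcd.yaml".
# Change these two lines in the file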
--listen-client-urls=https://127.0.0.1:2379,https://39.96.46.96:2379
--listen-peer-urls=https://39.96.46.96:2380
# After the change
--listen-client-urls=https://127.0.0.1:2379
--listen-peer-urls=https://127.0.0.1:2380
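4. Here 39.96.46.96 is the public IP. The flags to look at are "--listen-client-urls" and "--listen-peer-urls": remove the public IP from "--listen-client-urls", and change "--listen-peer-urls" to the local address.
After a short wait, the master node initialization will complete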
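Because the edit has to happen while kubeadm init is waiting, having it ready as a command in the second ssh session can help. This is a minimal sketch under the assumption that the two flags appear once each in the file, as in the lines shown above; fix_etcd_listen is a hypothetical name. It deliberately leaves no backup copy inside /etc/kubernetes/manifests, since the kubelet reads manifests from that directory.

```shell
#!/bin/sh
# Sketch: rewrite the two etcd listen flags in a manifest to loopback addresses.
fix_etcd_listen() {
    file=$1
    tmp=$(mktemp)
    # Only the value after "=" is replaced, so YAML indentation is preserved
    # and other flags (e.g. --advertise-client-urls) are left untouched.
    sed -e 's#\(--listen-client-urls=\).*#\1https://127.0.0.1:2379#' \
        -e 's#\(--listen-peer-urls=\).*#\1https://127.0.0.1:2380#' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
# Possible use in the second session (as root):
#   fix_etcd_listen /etc/kubernetes/manifests/etcd.yaml
```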
A problem you may run into
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
# Run this command
swapoff -a && kubeadm reset && systemctl daemon-reload && systemctl restart kubelet && iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# Then run the initialization again
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=NumCPU --apiserver-advertise-address=39.96.46.96
A problem you may run into
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.4: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
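Open https://www.ipaddress.com/ and search for https://k8s.gcr.io to get its IP, 142.250.113.82. Then open the hosts file on the machine; on Linux that is
vim /etc/hosts. Add the hostname and IP in the form shown below. If you are not root, remember to use sudo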
142.250.113.82 k8s.gcr.io
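The same edit can be scripted so that repeated runs do not pile up duplicate entries. A small sketch, where add_host is a made-up helper that takes the hosts file path, the IP and the hostname. The dots in the hostname are treated as regex wildcards here, which is loose but harmless for this purpose.

```shell
#!/bin/sh
# Sketch: add "IP NAME" to a hosts-style file unless NAME is already mapped.
add_host() {
    file=$1
    ip=$2
    name=$3
    # Look for NAME as a whitespace-delimited field on a non-comment line.
    if ! grep -qE "^[^#]*[[:space:]]$name([[:space:]]|\$)" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}
# Possible use (as root):
#   add_host /etc/hosts 142.250.113.82 k8s.gcr.io
```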
Still not working
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
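This is because Docker and Kubernetes are using different cgroup drivers
Fix
Change the Docker configuration file (note that the command below overwrites any existing /etc/docker/daemon.json, including the registry-mirrors setting from step 3)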
cat > /etc/docker/daemon.json <<EOF
{"exec-opts": ["native.cgroupdriver=systemd"]}
EOF
Restart Docker
systemctl restart docker
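To keep the registry mirror from step 3 while also switching the cgroup driver, both settings can be written to the file together. A minimal sketch, with write_daemon_json as a hypothetical helper; the mirror URL passed in is whatever you configured earlier.

```shell
#!/bin/sh
# Sketch: write a daemon.json that sets both the registry mirror
# and the systemd cgroup driver.
write_daemon_json() {
    file=$1
    mirror=$2
    cat > "$file" <<EOF
{
  "registry-mirrors": ["$mirror"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}
# Possible use (as root):
#   write_daemon_json /etc/docker/daemon.json https://registry.docker-cn.com
#   systemctl restart docker
#   docker info | grep -i 'cgroup driver'
```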
More problems will come up after this, but they are simple: fix whatever error is reported (here are the ones I ran into)
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
# Fix (a few similar errors are omitted here; the steps are the same, so I won't repeat them)
cd /etc/kubernetes/manifests/
rm kube-apiserver.yaml
[ERROR Port-10250]: Port 10250 is in use
# Fix (a few similar errors are omitted here; the steps are the same, so I won't repeat them)
lsof -i:10250
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kubelet 22055 root 27u IPv6 773301 0t0 TCP *:10250 (LISTEN)
kill -9 22055
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
# Fix
cd /var/lib/etcd/
rm -r member/
Run the initialization command again and it will succeed
# Output on success
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 39.96.46.96:6443 --token 9vbzuf.vtzj1w5vefjlwi0t \
--discovery-token-ca-cert-hash sha256:b6e6fffb6b0e11d2db374ce21f6d86de3e09e1e13075e1bf01055130c2c5e060
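The join command at the end of this output is the one used in step 4 to add the node. If the init output is saved to a file, the command can be pulled out as a single line. A sketch follows, with extract_join as a made-up helper name and the log file name in the usage comment chosen only for illustration.

```shell
#!/bin/sh
# Sketch: read kubeadm init output on stdin and print the join command
# on one line, merging the backslash-continued lines.
extract_join() {
    awk '
        /kubeadm join/ { grab = 1 }
        grab {
            sub(/^[ \t]+/, "")            # drop leading indentation
            cont = sub(/[ \t]*\\$/, "")   # drop a trailing backslash, if any
            printf "%s%s", sep, $0
            sep = " "
            if (!cont) { print ""; exit } # last line of the command
        }
    '
}
# Possible use:
#   kubeadm init ... | tee init.log
#   extract_join < init.log
```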
Run the following on the master node
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the master
kubectl get nodes
root@ubuntu:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady control-plane,master 26h v1.22.2
node Ready <none> 15s v1.22.2
# Add the network plugin
sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Result
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
kubectl get pods --all-namespaces
# If the output looks like this, with some Pods in Pending state
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcd69978-fkkmh 0/1 Pending 0 17m
kube-system coredns-78fcd69978-qrx2c 0/1 Pending 0 17m
kube-system etcd-ubuntu 1/1 Running 0 17m
kube-system kube-apiserver-ubuntu 1/1 Running 1 (19m ago) 17m
kube-system kube-controller-manager-ubuntu 1/1 Running 2 (20m ago) 17m
kube-system kube-flannel-ds-g97gm 0/1 Init:0/1 0 80s
kube-system kube-proxy-f6ctf 1/1 Running 0 17m
kube-system kube-scheduler-ubuntu 1/1 Running 2 (19m ago) 17m
# Just add 185.199.111.133 raw.githubusercontent.com to the hosts file and run the command again; then everything is OK
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcd69978-fkkmh 1/1 Running 0 28m
kube-system coredns-78fcd69978-qrx2c 1/1 Running 0 28m
kube-system etcd-ubuntu 1/1 Running 0 28m
kube-system kube-apiserver-ubuntu 1/1 Running 1 (30m ago) 28m
kube-system kube-controller-manager-ubuntu 1/1 Running 2 (31m ago) 28m
kube-system kube-flannel-ds-g97gm 1/1 Running 0 11m
kube-system kube-proxy-f6ctf 1/1 Running 0 28m
kube-system kube-scheduler-ubuntu 1/1 Running 2 (30m ago) 28m
kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane,master 26h v1.22.2
node Ready <none> 15s v1.22.2
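Checking the STATUS column by eye works for two machines. For scripting the same check, here is a small sketch that reads the table printed above; all_ready is a hypothetical helper name, and it assumes the column layout shown, with STATUS as the second field.

```shell
#!/bin/sh
# Sketch: read "kubectl get nodes" output on stdin;
# succeed only if every node row has STATUS exactly "Ready".
all_ready() {
    awk 'NR > 1 && $2 != "Ready" { bad = 1 } END { exit bad }'
}
# Possible use:
#   kubectl get nodes | all_ready && echo "all nodes Ready"
```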
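At this point the basic k8s cluster installation is complete. I don't need the dashboard for now, so I haven't installed it; I'll come back and update this post when I do, haha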