关于 Kubernetes中etcd的一些笔记

写在前面


  • k8s涉及到一些etcd的知识,所以学习后这里系统整理下。
  • 博文内容涉及:

    • 单节点etcd搭建,版本切换
    • etcd集群搭建:用两个节点初始化,然后动态添加一个节点到集群。
    • etcd集群数据快照恢复
    • 为了方便,使用ansible,所以阅读本文需要知道一点ansible

一个人思虑太多,就会失去做人的乐趣。——莎士比亚
*

一、etcd概述

etcdCoreOS团队于2013年6月发起的开源项目,它的目标是构建一个高可用的分布式键值(key-value)数据库etcd内部采用raft协议作为一致性算法,etcd基于Go语言实现。类似于微服务项目中服务注册与发现组件EurekaZookeeperConsul 等。

etcd具有以下特点:
完全复制:集群中的每个节点都可以使用完整的存档
高可用性:Etcd可用于避免硬件的单点故障或网络问题
一致性:每次读取都会返回跨多主机的最新写入
简单:包括一个定义良好、面向用户的API(gRPC)
安全:实现了带有可选的客户端证书身份验证的自动化TLS
快速:每秒10000次写入的基准速度
可靠:使用Raft算法实现了强一致、高可用的服务存储目录

二、单节点ETCD搭建

机器:192.168.26.91

安装etcd包

┌──[root@liruilongs.github.io]-[~]
└─$ yum -y install etcd
┌──[root@liruilongs.github.io]-[~]
└─$ rpm -qc etcd
/etc/etcd/etcd.conf

修改配置文件

┌──[root@liruilongs.github.io]-[~]
└─$ vim $(rpm -qc etcd)
┌──[root@liruilongs.github.io]-[~]
└─$
#[Member]
# 数据位置
ETCD_DATA_DIR="/var/lib/etcd/default.etcd" 
# 数据同步端口
ETCD_LISTEN_PEER_URLS="http://192.168.26.91:2380,http://localhost:2380" 
# 读写端口
ETCD_LISTEN_CLIENT_URLS="http://192.168.26.91:2379,http://localhost:2379" 
# etcd名字
ETCD_NAME="default" 
#[Clustering]
ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379"

设置开机自启

┌──[root@liruilongs.github.io]-[~]
└─$ systemctl enable etcd --now
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl member list #查看数据源  
8e9e05c52164694d: name=default peerURLs=http://localhost:2380 clientURLs=http://localhost:2379 isLeader=true
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl cluster-health #查看数据量
member 8e9e05c52164694d is healthy: got healthy result from http://localhost:2379
cluster is healthy

isLeader=true 是否为leader

基本命令测试

┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl ls /
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl mkdir cka
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl ls /
/cka
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl rmdir /cka
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl ls /
┌──[root@liruilongs.github.io]-[~]
└─$

2和3版本切换

┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl -v
etcdctl version: 3.3.11
API version: 2
┌──[root@liruilongs.github.io]-[~]
└─$ export ETCDCTL_API=3
┌──[root@liruilongs.github.io]-[~]
└─$ etcdctl version
etcdctl version: 3.3.11
API version: 3.3
┌──[root@liruilongs.github.io]-[~]
└─$

三、etcd集群构建

  • ETCD集群是一个分布式系统,使用Raft协议来维护集群内各个节点状态的一致性。
  • 主机状态 Leader, Follower, Candidate
  • 当集群初始化时候,每个节点都是Follower角色,通过心跳与其他节点同步数据
  • 通过Follower读取数据,通过Leader写入数据
  • Follower在一定时间内没有收到来自主节点的心跳,会将自己角色改变为Candidate,并发起一次选主投票
  • 配置etcd集群,建议尽可能是奇数个节点,而不要偶数个节点,推荐的数量为 3、5 或者 7 个节点构成一个集群。

1、创建集群

这里使用ansible的方式实现

vms81.liruilongs.github.io 控制节点
192.168.26.100 etcd节点 受控节点
192.168.26.102 etcd节点 受控节点
192.168.26.101 etcd节点 受控节点

环境准备,主机清单编写

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cat inventory
......
[etcd]
192.168.26.100
192.168.26.101
192.168.26.102

受控节点ping测试

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m ping
192.168.26.100 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
192.168.26.102 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
192.168.26.101 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m yum -a "name=etcd state=installed"

配置文件修改

这里用前两台(192.168.26.100,192.168.26.101)初始化集群,第三台(192.168.26.102 )以添加的方式加入集群

本机编写配置文件。

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cat etcd.conf
ETCD_DATA_DIR="/var/lib/etcd/cluster.etcd"

ETCD_LISTEN_PEER_URLS="http://192.168.26.100:2380,http://localhost:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.26.100:2379,http://localhost:2379"

ETCD_NAME="etcd-100"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.100:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379,http://192.168.26.100:2379"

ETCD_INITIAL_CLUSTER="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
参数的意义
ETCD_NAME 节点名称,默认为defaul
ETCD_DATA_DIR 服务运行数据保存的路
ETCD_LISTEN_PEER_URLS 监听的同伴通信的地址,比如http://ip:2380,如果有多个,使用逗号分隔。需要所有节点都能够访问,所以不要使用 localhost
ETCD_LISTEN_CLIENT_URLS 监听的客户端服务地
ETCD_ADVERTISE_CLIENT_URLS 对外公告的该节点客户端监听地址,这个值会告诉集群中其他节点
ETCD_INITIAL_ADVERTISE_PEER_URLS 对外公告的该节点同伴监听地址,这个值会告诉集群中其他节
ETCD_INITIAL_CLUSTER 集群中所有节点的信息,格式
ETCD_INITIAL_CLUSTER_STATE 新建集群的时候,这个值为 new;假如加入已经存在的集群,这个值为existing
ETCD_INITIAL_CLUSTER_TOKEN 集群的ID,多个集群的时候,每个集群的ID必须保持唯一

把配置文件拷贝到192.168.26.100,192.168.26.101

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100,192.168.26.101 -m copy -a "src=./etcd.conf dest=/etc/etcd/etcd.conf force=yes"
192.168.26.101 | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": true,
    "checksum": "bae3b8bc6636bf7304cce647b7068aa45ced859b",
    "dest": "/etc/etcd/etcd.conf",
    "gid": 0,
    "group": "root",
    "md5sum": "5f2a3fbe27515f85b7f9ed42a206c2a6",
    "mode": "0644",
    "owner": "root",
    "size": 533,
    "src": "/root/.ansible/tmp/ansible-tmp-1633800905.88-59602-39965601417441/source",
    "state": "file",
    "uid": 0
}
192.168.26.100 | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": true,
    "checksum": "bae3b8bc6636bf7304cce647b7068aa45ced859b",
    "dest": "/etc/etcd/etcd.conf",
    "gid": 0,
    "group": "root",
    "md5sum": "5f2a3fbe27515f85b7f9ed42a206c2a6",
    "mode": "0644",
    "owner": "root",
    "size": 533,
    "src": "/root/.ansible/tmp/ansible-tmp-1633800905.9-59600-209338664801782/source",
    "state": "file",
    "uid": 0
}

检查配置文件

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100,192.168.26.101 -m shell -a "cat /etc/etcd/etcd.conf"
192.168.26.101 | CHANGED | rc=0 >>
ETCD_DATA_DIR="/var/lib/etcd/cluster.etcd"

ETCD_LISTEN_PEER_URLS="http://192.168.26.100:2380,http://localhost:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.26.100:2379,http://localhost:2379"

ETCD_NAME="etcd-100"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.100:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379,http://192.168.26.100:2379"

ETCD_INITIAL_CLUSTER="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
192.168.26.100 | CHANGED | rc=0 >>
ETCD_DATA_DIR="/var/lib/etcd/cluster.etcd"

ETCD_LISTEN_PEER_URLS="http://192.168.26.100:2380,http://localhost:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.26.100:2379,http://localhost:2379"

ETCD_NAME="etcd-100"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.100:2380"

ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379,http://192.168.26.100:2379"

ETCD_INITIAL_CLUSTER="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

修改101的配置文件

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.101  -m shell -a "sed -i  '1,9s/100/101/g' /etc/etcd/etcd.conf"
192.168.26.101 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100,192.168.26.101 -m shell -a "cat -n /etc/etcd/etcd.conf"
192.168.26.100 | CHANGED | rc=0 >>
     1  ETCD_DATA_DIR="/var/lib/etcd/cluster.etcd"
     2
     3  ETCD_LISTEN_PEER_URLS="http://192.168.26.100:2380,http://localhost:2380"
     4  ETCD_LISTEN_CLIENT_URLS="http://192.168.26.100:2379,http://localhost:2379"
     5
     6  ETCD_NAME="etcd-100"
     7  ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.100:2380"
     8
     9  ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379,http://192.168.26.100:2379"
    10
    11  ETCD_INITIAL_CLUSTER="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
    12  ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    13  ETCD_INITIAL_CLUSTER_STATE="new"
192.168.26.101 | CHANGED | rc=0 >>
     1  ETCD_DATA_DIR="/var/lib/etcd/cluster.etcd"
     2
     3  ETCD_LISTEN_PEER_URLS="http://192.168.26.101:2380,http://localhost:2380"
     4  ETCD_LISTEN_CLIENT_URLS="http://192.168.26.101:2379,http://localhost:2379"
     5
     6  ETCD_NAME="etcd-101"
     7  ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.101:2380"
     8
     9  ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379,http://192.168.26.101:2379"
    10
    11  ETCD_INITIAL_CLUSTER="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
    12  ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    13  ETCD_INITIAL_CLUSTER_STATE="new"
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

查看etcd集群

192.168.26.101 为 Leader

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100,192.168.26.101 -m shell -a "etcdctl member list"
192.168.26.100 | CHANGED | rc=0 >>
6f2038a018db1103: name=etcd-100 peerURLs=http://192.168.26.100:2380 clientURLs=http://192.168.26.100:2379,http://localhost:2379 isLeader=false
bd330576bb637f25: name=etcd-101 peerURLs=http://192.168.26.101:2380 clientURLs=http://192.168.26.101:2379,http://localhost:2379 isLeader=true
192.168.26.101 | CHANGED | rc=0 >>
6f2038a018db1103: name=etcd-100 peerURLs=http://192.168.26.100:2380 clientURLs=http://192.168.26.100:2379,http://localhost:2379 isLeader=false
bd330576bb637f25: name=etcd-101 peerURLs=http://192.168.26.101:2380 clientURLs=http://192.168.26.101:2379,http://localhost:2379 isLeader=true
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

添加etcd 192.168.26.102到集群

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100 -m shell -a "etcdctl member add etcd-102 http://192.168.26.102:2380"
192.168.26.100 | CHANGED | rc=0 >>
Added member named etcd-102 with ID 2fd4f9ba70a04579 to cluster

ETCD_NAME="etcd-102"
ETCD_INITIAL_CLUSTER="etcd-102=http://192.168.26.102:2380,etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

修改之前写好的配置文件给102

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$sed -i '1,8s/100/102/g' etcd.conf
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$sed -i '13s/new/existing/'  etcd.conf
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$sed -i 's#ETCD_INITIAL_CLUSTER="#ETCD_INITIAL_CLUSTER="etcd-102=http://192.168.26.102:2380,#' etcd.conf
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cat -n etcd.conf
     1  ETCD_DATA_DIR="/var/lib/etcd/cluster.etcd"
     2
     3  ETCD_LISTEN_PEER_URLS="http://192.168.26.102:2380,http://localhost:2380"
     4  ETCD_LISTEN_CLIENT_URLS="http://192.168.26.102:2379,http://localhost:2379"
     5
     6  ETCD_NAME="etcd-102"
     7  ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.102:2380"
     8
     9  ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379,http://192.168.26.102:2379"
    10
    11  ETCD_INITIAL_CLUSTER="etcd-102=http://192.168.26.102:2380,etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380"
    12  ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    13  ETCD_INITIAL_CLUSTER_STATE="existing"
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

配置文件拷贝替换,启动etcd

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.102 -m copy -a "src=./etcd.conf dest=/etc/etcd/etcd.conf force=yes"
192.168.26.102 | CHANGED => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": true,
    "checksum": "2d8fa163150e32da563f5e591134b38cc356d237",
    "dest": "/etc/etcd/etcd.conf",
    "gid": 0,
    "group": "root",
    "md5sum": "389c2850d434478e2d4d57a7798196de",
    "mode": "0644",
    "owner": "root",
    "size": 574,
    "src": "/root/.ansible/tmp/ansible-tmp-1633803533.57-102177-227527368141930/source",
    "state": "file",
    "uid": 0
}
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.102 -m shell -a "systemctl enable etcd --now"
192.168.26.102 | CHANGED | rc=0 >>
Created symlink from /etc/systemd/system/multi-user.target.wants/etcd.service to /usr/lib/systemd/system/etcd.service.
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

检查集群是否添加成功

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m shell -a "etcdctl member list"
192.168.26.101 | CHANGED | rc=0 >>
2fd4f9ba70a04579: name=etcd-102 peerURLs=http://192.168.26.102:2380 clientURLs=http://192.168.26.102:2379,http://localhost:2379 isLeader=false
6f2038a018db1103: name=etcd-100 peerURLs=http://192.168.26.100:2380 clientURLs=http://192.168.26.100:2379,http://localhost:2379 isLeader=false
bd330576bb637f25: name=etcd-101 peerURLs=http://192.168.26.101:2380 clientURLs=http://192.168.26.101:2379,http://localhost:2379 isLeader=true
192.168.26.102 | CHANGED | rc=0 >>
2fd4f9ba70a04579: name=etcd-102 peerURLs=http://192.168.26.102:2380 clientURLs=http://192.168.26.102:2379,http://localhost:2379 isLeader=false
6f2038a018db1103: name=etcd-100 peerURLs=http://192.168.26.100:2380 clientURLs=http://192.168.26.100:2379,http://localhost:2379 isLeader=false
bd330576bb637f25: name=etcd-101 peerURLs=http://192.168.26.101:2380 clientURLs=http://192.168.26.101:2379,http://localhost:2379 isLeader=true
192.168.26.100 | CHANGED | rc=0 >>
2fd4f9ba70a04579: name=etcd-102 peerURLs=http://192.168.26.102:2380 clientURLs=http://192.168.26.102:2379,http://localhost:2379 isLeader=false
6f2038a018db1103: name=etcd-100 peerURLs=http://192.168.26.100:2380 clientURLs=http://192.168.26.100:2379,http://localhost:2379 isLeader=false
bd330576bb637f25: name=etcd-101 peerURLs=http://192.168.26.101:2380 clientURLs=http://192.168.26.101:2379,http://localhost:2379 isLeader=true
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

设置环境变量,切换版本,这里有一点麻烦。

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m shell -a "echo 'export ETCDCTL_API=3' >> ~/.bashrc"
192.168.26.100 | CHANGED | rc=0 >>

192.168.26.102 | CHANGED | rc=0 >>

192.168.26.101 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m shell -a "cat ~/.bashrc"
192.168.26.100 | CHANGED | rc=0 >>
# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
export ETCDCTL_API=3
192.168.26.102 | CHANGED | rc=0 >>
# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
export ETCDCTL_API=3
192.168.26.101 | CHANGED | rc=0 >>
# .bashrc

# User specific aliases and functions

alias rm='rm -i'
alias cp='cp -i'
alias mv='mv -i'

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
export ETCDCTL_API=3
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m shell -a "etcdctl version"
192.168.26.100 | CHANGED | rc=0 >>
etcdctl version: 3.3.11
API version: 3.3
192.168.26.102 | CHANGED | rc=0 >>
etcdctl version: 3.3.11
API version: 3.3
192.168.26.101 | CHANGED | rc=0 >>
etcdctl version: 3.3.11
API version: 3.3
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

同步性测试

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$# 同步性测试
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100 -a  "etcdctl put name liruilong"
192.168.26.100 | CHANGED | rc=0 >>
OK
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "etcdctl get name"
192.168.26.100 | CHANGED | rc=0 >>
name
liruilong
192.168.26.101 | CHANGED | rc=0 >>
name
liruilong
192.168.26.102 | CHANGED | rc=0 >>
name
liruilong
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

四、etcd集群备份,恢复

准备数据

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100 -a  "etcdctl put name liruilong"
192.168.26.100 | CHANGED | rc=0 >>
OK
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "etcdctl get name"
192.168.26.102 | CHANGED | rc=0 >>
name
liruilong
192.168.26.100 | CHANGED | rc=0 >>
name
liruilong
192.168.26.101 | CHANGED | rc=0 >>
name
liruilong

在任何一台主机上对 etcd 做快照

#在任何一台主机上对 etcd 做快照
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.101 -a "etcdctl snapshot save snap20211010.db"
192.168.26.101 | CHANGED | rc=0 >>
Snapshot saved at snap20211010.db
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$# 此快照里包含了刚刚写的数据 name=liruilong,然后把快照文件到所有节点
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.101 -a "scp /root/snap20211010.db root@192.168.26.100:/root/"
192.168.26.101 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.101 -a "scp /root/snap20211010.db root@192.168.26.102:/root/"
192.168.26.101 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

清空数据

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "etcdctl del name"
192.168.26.101 | CHANGED | rc=0 >>
1
192.168.26.102 | CHANGED | rc=0 >>
0
192.168.26.100 | CHANGED | rc=0 >>
0
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

在所有节点上关闭 etcd,并删除/var/lib/etcd/里所有数据:

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$# 在所有节点上关闭 etcd,并删除/var/lib/etcd/里所有数据:
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "systemctl stop etcd"
192.168.26.100 | CHANGED | rc=0 >>

192.168.26.102 | CHANGED | rc=0 >>

192.168.26.101 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m shell -a "rm -rf /var/lib/etcd/*"
[WARNING]: Consider using the file module with state=absent rather than running 'rm'.  If you need to
use command because file is insufficient you can add 'warn: false' to this command task or set
'command_warnings=False' in ansible.cfg to get rid of this message.
192.168.26.102 | CHANGED | rc=0 >>

192.168.26.100 | CHANGED | rc=0 >>

192.168.26.101 | CHANGED | rc=0 >>

在所有节点上把快照文件的所有者和所属组设置为 etcd:

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "chown etcd.etcd /root/snap20211010.db"
[WARNING]: Consider using the file module with owner rather than running 'chown'.  If you need to use
command because file is insufficient you can add 'warn: false' to this command task or set
'command_warnings=False' in ansible.cfg to get rid of this message.
192.168.26.100 | CHANGED | rc=0 >>

192.168.26.102 | CHANGED | rc=0 >>

192.168.26.101 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$# 在每台节点上开始恢复数据:

在每台节点上开始恢复数据:

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.100 -m script  -a "./snapshot_restore.sh"
192.168.26.100 | CHANGED => {
    "changed": true,
    "rc": 0,
    "stderr": "Shared connection to 192.168.26.100 closed.\r\n",
    "stderr_lines": [
        "Shared connection to 192.168.26.100 closed."
    ],
    "stdout": "2021-10-10 12:14:30.726021 I | etcdserver/membership: added member 6f2038a018db1103 [http://192.168.26.100:2380] to cluster af623437f584d792\r\n2021-10-10 12:14:30.726234 I | etcdserver/membership: added member bd330576bb637f25 [http://192.168.26.101:2380] to cluster af623437f584d792\r\n",
    "stdout_lines": [
        "2021-10-10 12:14:30.726021 I | etcdserver/membership: added member 6f2038a018db1103 [http://192.168.26.100:2380] to cluster af623437f584d792",
        "2021-10-10 12:14:30.726234 I | etcdserver/membership: added member bd330576bb637f25 [http://192.168.26.101:2380] to cluster af623437f584d792"
    ]
}
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cat -n ./snapshot_restore.sh
     1  #!/bin/bash
     2
     3  # 每台节点恢复镜像
     4
     5  etcdctl snapshot restore /root/snap20211010.db \
     6  --name etcd-100 \
     7  --initial-advertise-peer-urls="http://192.168.26.100:2380" \
     8  --initial-cluster="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380" \
     9  --data-dir="/var/lib/etcd/cluster.etcd"
    10
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$sed '6,7s/100/101/g' ./snapshot_restore.sh
#!/bin/bash

# 每台节点恢复镜像

etcdctl snapshot restore /root/snap20211010.db \
--name etcd-101 \
--initial-advertise-peer-urls="http://192.168.26.101:2380" \
--initial-cluster="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380" \
--data-dir="/var/lib/etcd/cluster.etcd"

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$sed -i '6,7s/100/101/g' ./snapshot_restore.sh
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$cat ./snapshot_restore.sh
#!/bin/bash

# 每台节点恢复镜像

etcdctl snapshot restore /root/snap20211010.db \
--name etcd-101 \
--initial-advertise-peer-urls="http://192.168.26.101:2380" \
--initial-cluster="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380" \
--data-dir="/var/lib/etcd/cluster.etcd"

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.101 -m script  -a "./snapshot_restore.sh"
192.168.26.101 | CHANGED => {
    "changed": true,
    "rc": 0,
    "stderr": "Shared connection to 192.168.26.101 closed.\r\n",
    "stderr_lines": [
        "Shared connection to 192.168.26.101 closed."
    ],
    "stdout": "2021-10-10 12:20:26.032754 I | etcdserver/membership: added member 6f2038a018db1103 [http://192.168.26.100:2380] to cluster af623437f584d792\r\n2021-10-10 12:20:26.032930 I | etcdserver/membership: added member bd330576bb637f25 [http://192.168.26.101:2380] to cluster af623437f584d792\r\n",
    "stdout_lines": [
        "2021-10-10 12:20:26.032754 I | etcdserver/membership: added member 6f2038a018db1103 [http://192.168.26.100:2380] to cluster af623437f584d792",
        "2021-10-10 12:20:26.032930 I | etcdserver/membership: added member bd330576bb637f25 [http://192.168.26.101:2380] to cluster af623437f584d792"
    ]
}
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

所有节点把/var/lib/etcd 及里面内容的所有者和所属组改为 etcd:v然后分别启动 etcd

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "chown -R etcd.etcd /var/lib/etcd/"
[WARNING]: Consider using the file module with owner rather than running 'chown'.  If you need to use
command because file is insufficient you can add 'warn: false' to this command task or set
'command_warnings=False' in ansible.cfg to get rid of this message.
192.168.26.100 | CHANGED | rc=0 >>

192.168.26.101 | CHANGED | rc=0 >>

192.168.26.102 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "systemctl start etcd"
192.168.26.102 | FAILED | rc=1 >>
Job for etcd.service failed because the control process exited with error code. See "systemctl status etcd.service" and "journalctl -xe" for details.non-zero return code
192.168.26.101 | CHANGED | rc=0 >>

192.168.26.100 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

把剩下的节点添加进集群

# etcdctl member add etcd_name –peer-urls=”https://peerURLs”
[root@vms100 cluster.etcd]# etcdctl member add etcd-102 --peer-urls="http://192.168.26.102:2380"
Member fbd8a96cbf1c004d added to cluster af623437f584d792

ETCD_NAME="etcd-102"
ETCD_INITIAL_CLUSTER="etcd-100=http://192.168.26.100:2380,etcd-101=http://192.168.26.101:2380,etcd-102=http://192.168.26.102:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.26.102:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
[root@vms100 cluster.etcd]#

测试恢复结果

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.102 -m copy -a "src=./etcd.conf dest=/etc/etcd/etcd.conf force=yes"
192.168.26.102 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "checksum": "2d8fa163150e32da563f5e591134b38cc356d237",
    "dest": "/etc/etcd/etcd.conf",
    "gid": 0,
    "group": "root",
    "mode": "0644",
    "owner": "root",
    "path": "/etc/etcd/etcd.conf",
    "size": 574,
    "state": "file",
    "uid": 0
}
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible 192.168.26.102 -m shell -a "systemctl enable etcd --now"
192.168.26.102 | CHANGED | rc=0 >>

┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -m shell -a "etcdctl member list"
192.168.26.101 | CHANGED | rc=0 >>
6f2038a018db1103, started, etcd-100, http://192.168.26.100:2380, http://192.168.26.100:2379,http://localhost:2379
bd330576bb637f25, started, etcd-101, http://192.168.26.101:2380, http://192.168.26.101:2379,http://localhost:2379
fbd8a96cbf1c004d, started, etcd-102, http://192.168.26.102:2380, http://192.168.26.102:2379,http://localhost:2379
192.168.26.100 | CHANGED | rc=0 >>
6f2038a018db1103, started, etcd-100, http://192.168.26.100:2380, http://192.168.26.100:2379,http://localhost:2379
bd330576bb637f25, started, etcd-101, http://192.168.26.101:2380, http://192.168.26.101:2379,http://localhost:2379
fbd8a96cbf1c004d, started, etcd-102, http://192.168.26.102:2380, http://192.168.26.102:2379,http://localhost:2379
192.168.26.102 | CHANGED | rc=0 >>
6f2038a018db1103, started, etcd-100, http://192.168.26.100:2380, http://192.168.26.100:2379,http://localhost:2379
bd330576bb637f25, started, etcd-101, http://192.168.26.101:2380, http://192.168.26.101:2379,http://localhost:2379
fbd8a96cbf1c004d, started, etcd-102, http://192.168.26.102:2380, http://192.168.26.102:2379,http://localhost:2379
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$ansible etcd -a "etcdctl get name"
192.168.26.102 | CHANGED | rc=0 >>
name
liruilong
192.168.26.101 | CHANGED | rc=0 >>
name
liruilong
192.168.26.100 | CHANGED | rc=0 >>
name
liruilong
┌──[root@vms81.liruilongs.github.io]-[~/ansible]
└─$

k8s中etcd以pod的方式设置

┌──[root@vms81.liruilongs.github.io]-[~]
└─$kubectl  get pods
NAME                                                 READY   STATUS        RESTARTS   AGE
calico-kube-controllers-78d6f96c7b-79xx4             1/1     Running       57         3d1h
calico-node-ntm7v                                    1/1     Running       16         3d11h
calico-node-skzjp                                    0/1     Running       42         3d11h
calico-node-v7pj5                                    1/1     Running       9          3d11h
coredns-545d6fc579-9h2z4                             1/1     Running       9          3d1h
coredns-545d6fc579-xgn8x                             1/1     Running       10         3d1h
etcd-vms81.liruilongs.github.io                      1/1     Running       8          3d11h
kube-apiserver-vms81.liruilongs.github.io            1/1     Running       20         3d11h
kube-controller-manager-vms81.liruilongs.github.io   1/1     Running       26         3d11h
kube-proxy-rbhgf                                     1/1     Running       4          3d11h
kube-proxy-vm2sf                                     1/1     Running       3          3d11h
kube-proxy-zzbh9                                     1/1     Running       2          3d11h
kube-scheduler-vms81.liruilongs.github.io            1/1     Running       24         3d11h
metrics-server-bcfb98c76-6q5mb                       1/1     Terminating   0          43h
metrics-server-bcfb98c76-9ptf4                       1/1     Terminating   0          27h
metrics-server-bcfb98c76-bbr6n                       0/1     Pending       0          12h
┌──[root@vms81.liruilongs.github.io]-[~]
└─$
上一篇:如何优雅的实现异常块


下一篇:WCF方法拦截及OperationInvoker传递参数到WCF方法的实现