Even with Docker Compose, deploying a project is still problematic: Docker Compose can only run all of a project's containers on a single machine, which is unrealistic in production.
Docker Compose is generally suited to development environments; for deploying projects in production we need Docker Swarm.
Introduction to Docker Swarm
Docker Swarm is Docker's official container orchestration system. It turns a group of Docker hosts into a single virtual Docker host.
The architecture is as follows:
- Swarm nodes:
A swarm is a collection of nodes; a node can be a bare-metal machine or a virtual machine. A node can take on one or both of two roles: manager or worker.
  - manager: A Docker Swarm cluster needs at least one manager node, and manager nodes coordinate with each other using the Raft consensus protocol. Normally the first node to enable swarm mode becomes the leader, and nodes that join later become followers. If the current leader goes down, the remaining managers elect a new leader. Every manager holds a complete copy of the current cluster state, which keeps the manager layer highly available.
  - worker: Worker nodes are where the containers that run your actual application services live. In theory a manager node can also act as a worker, but this is not recommended in production. Worker nodes talk to each other over the control plane using the gossip protocol, asynchronously.
- Tasks, services, and stacks:
Multiple tasks make up a service, and multiple services make up a stack.
  - task: In Docker Swarm a task is the smallest unit of deployment; a task maps one-to-one to a container.
  - service: A swarm service is an abstraction: it is simply a description of the desired state of an application service running on the swarm cluster. It reads like a checklist of the following items: the service name, which image to use to create the containers, how many replicas to run, which network the service's containers should attach to, and which ports to publish.
  - stack: A stack describes a group of related services. A stack can be defined in a single YAML file, much like docker-compose (a minimal sketch of such a file follows right after this list).
- Multi-host networking:
With single-host networking, all containers run on one Docker host, and the local bridge network is usually enough for them to communicate.
A swarm cluster, however, spans a group of Docker hosts, so Docker's overlay network is required.
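To make the stack concept concrete, here is a minimal sketch of a stack file written out with a heredoc. The file name demo-stack.yml, the service names, images, and published port are illustrative assumptions, not part of the original article:

```bash
# Write a minimal stack definition to disk (demo-stack.yml is an assumed name;
# the services, images and port below are illustrative only).
cat > demo-stack.yml <<'EOF'
version: "3.7"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"        # published through the routing mesh on every node
    deploy:
      replicas: 3        # desired number of tasks for this service
  cache:
    image: redis:alpine
EOF
```

When such a file is deployed with docker stack deploy (see the command reference below), each entry under services becomes a swarm service, and each replica becomes a task.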
Common Commands
- docker swarm commands:
  - ca: display the root CA certificate
  - init: initialize a swarm cluster
  - join: join a cluster
  - join-token worker: show the join token for worker nodes
  - join-token manager: show the join token for manager nodes
  - leave: leave the cluster
  - unlock: unlock the cluster
  - unlock-key: manage the unlock key
  - update: update the cluster
- docker node commands:
  - demote: demote a node from manager to worker
  - inspect: show details of one or more nodes
  - ls: list all nodes
  - promote: promote a node from worker to manager
  - ps: list the tasks running on one or more nodes (defaults to the current node)
  - rm: remove one or more nodes
  - update: update a node
- docker service commands:
  - create: create a new service
  - inspect: show details of one or more services
  - logs: fetch the logs of a service or task
  - ls: list all services
  - ps: list the tasks of one or more services
  - rm: remove one or more services
  - rollback: revert a service to its previous configuration
  - scale: scale the replicas of one or more services
  - update: update a service
- docker stack commands:
  - deploy: deploy a new stack or update an existing one
  - ls: list existing stacks
  - ps: list the tasks in a stack
  - rm: remove one or more stacks
  - services: list the services in a stack
Note: most of the commands above can only be run on a manager node.
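The docker stack subcommands are not demonstrated later in this article, so here is a brief, hedged sketch of the typical lifecycle on a manager node, assuming the demo-stack.yml file sketched earlier and a stack named demo:

```bash
# Deploy (or update) the stack described by demo-stack.yml under the name "demo".
docker stack deploy -c demo-stack.yml demo

# List stacks, the services in the stack, and the stack's tasks.
docker stack ls
docker stack services demo
docker stack ps demo

# Tear the stack down; its services and their task containers are removed.
docker stack rm demo
```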
Creating a Swarm Cluster
To meet the cluster's high-availability requirements, run an odd number of manager nodes. With more than two manager nodes, the cluster can recover from the failure of a manager node without downtime.
A cluster with N manager nodes tolerates the loss of at most (N-1)/2 managers; for example, 3 managers tolerate losing 1, and 5 managers tolerate losing 2.
Below we create a three-node swarm cluster.
- Role assignment:
| role    | ip             | hostname |
|---------|----------------|----------|
| manager | 192.168.30.128 | test1    |
| worker1 | 192.168.30.129 | test2    |
| worker2 | 192.168.30.130 | test3    |
- Environment preparation:
```
# systemctl stop firewalld && systemctl disable firewalld
# sed -i 's/=enforcing/=disabled/g' /etc/selinux/config && setenforce 0
```
- Install docker:
```
# curl http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -o /etc/yum.repos.d/docker.repo
# yum makecache fast
# yum install -y docker-ce
# systemctl start docker && systemctl enable docker
```
- Create the cluster:
Note: the swarm cluster must first be initialized on the manager node; the worker nodes then join the cluster.
192.168.30.128
```
# docker swarm init --advertise-addr=192.168.30.128
Swarm initialized: current node (q1n9ztahdj489pltf3gl5pomj) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-38ci507e4fyp92oauqvov5axo3qhti5wlm58odjz5lo2rdatyo-e2y52gxq7y40mah2nzq1b5fg9 192.168.30.128:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
```
Copy the docker swarm join command for adding a worker and run it on the other, non-manager machines.
192.168.30.129
```
# docker swarm join --token SWMTKN-1-38ci507e4fyp92oauqvov5axo3qhti5wlm58odjz5lo2rdatyo-e2y52gxq7y40mah2nzq1b5fg9 192.168.30.128:2377
This node joined a swarm as a worker.
```
192.168.30.130
```
# docker swarm join --token SWMTKN-1-38ci507e4fyp92oauqvov5axo3qhti5wlm58odjz5lo2rdatyo-e2y52gxq7y40mah2nzq1b5fg9 192.168.30.128:2377
This node joined a swarm as a worker.
```
- View the cluster nodes:
192.168.30.128
```
# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
q1n9ztahdj489pltf3gl5pomj *   test1      Ready    Active         Leader           19.03.3
0qyp2ut4m3pggag1yq7f3jn31     test2      Ready    Active                          19.03.4
cbi8detm7t9v8w5ntyzid0cvj     test3      Ready    Active                          19.03.4
```
As shown, a three-node swarm cluster is now up, with test1 (192.168.30.128) as the manager node.
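The walkthrough above creates one manager and two workers. As noted earlier, a highly available production cluster normally runs an odd number of managers (at least three); the sketch below shows two ways to get there. It is not part of the original walkthrough, and the node names simply reuse the table above:

```bash
# Option 1: promote the two existing workers so the cluster has three managers.
docker node promote test2 test3

# Option 2: print the manager join token and let new machines join as managers.
docker swarm join-token manager

# Verify: the MANAGER STATUS column shows Leader/Reachable for manager nodes.
docker node ls
```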
Docker Service
In a Docker Swarm cluster we can create services with the docker service command. A service consists of one or more tasks, and each task corresponds to one container.
- Create a service:
192.168.30.128
```
# docker service create --name busybox busybox:latest sh -c "while true; do sleep 3600; done"
y8o6jogs0iyp4qewb5okgzb37
overall progress: 1 out of 1 tasks
1/1: running
verify: Service converged

# docker service ls
ID             NAME      MODE         REPLICAS   IMAGE            PORTS
y8o6jogs0iyp   busybox   replicated   1/1        busybox:latest

# docker service ps busybox
ID             NAME        IMAGE            NODE    DESIRED STATE   CURRENT STATE                ERROR   PORTS
mavy3blpmzvz   busybox.1   busybox:latest   test2   Running         Running about a minute ago
```
As shown, this service's task is running on test2 (192.168.30.129).
192.168.30.129
```
# docker ps
CONTAINER ID   IMAGE            COMMAND                  CREATED         STATUS         PORTS   NAMES
3dd48d9541ca   busybox:latest   "sh -c 'while true; …"   2 minutes ago   Up 2 minutes           busybox.1.mavy3blpmzvzka1ks1ebuz3s4
```
- Horizontal scaling:
The busybox service currently has only one task; scale it out to 5.
192.168.30.128
```
# docker service scale busybox=5
busybox scaled to 5
overall progress: 5 out of 5 tasks
1/5: running
2/5: running
3/5: running
4/5: running
5/5: running
verify: Service converged

# docker service ls
ID             NAME      MODE         REPLICAS   IMAGE            PORTS
y8o6jogs0iyp   busybox   replicated   5/5        busybox:latest

# docker service ps busybox
ID             NAME        IMAGE            NODE    DESIRED STATE   CURRENT STATE            ERROR   PORTS
mavy3blpmzvz   busybox.1   busybox:latest   test2   Running         Running 4 minutes ago
gxg5gt2j5a1v   busybox.2   busybox:latest   test1   Running         Running 20 seconds ago
okge105yuzb8   busybox.3   busybox:latest   test2   Running         Running 25 seconds ago
b86rr94bbotj   busybox.4   busybox:latest   test3   Running         Running 22 seconds ago
8zogu5kacnpw   busybox.5   busybox:latest   test1   Running         Running 20 seconds ago
```
As shown, the service's tasks are now spread across all three nodes of the cluster.
- Fault recovery:
When the container backing a task dies, swarm automatically starts a replacement container for that task on one of the nodes.
192.168.30.128
```
# docker rm -f 7d013a7eb685
7d013a7eb685

# docker service ps busybox
ID             NAME            IMAGE            NODE    DESIRED STATE   CURRENT STATE           ERROR                         PORTS
mavy3blpmzvz   busybox.1       busybox:latest   test2   Running         Running 11 minutes ago
jewllc9gywpa   busybox.2       busybox:latest   test3   Ready           Ready 3 seconds ago
gxg5gt2j5a1v    \_ busybox.2   busybox:latest   test1   Shutdown        Failed 3 seconds ago    "task: non-zero exit (137)"
okge105yuzb8   busybox.3       busybox:latest   test2   Running         Running 7 minutes ago
b86rr94bbotj   busybox.4       busybox:latest   test3   Running         Running 7 minutes ago
8zogu5kacnpw   busybox.5       busybox:latest   test1   Running         Running 7 minutes ago

# docker service ls
ID             NAME      MODE         REPLICAS   IMAGE            PORTS
y8o6jogs0iyp   busybox   replicated   5/5        busybox:latest
```
As shown, the container that was force-removed on test1 (192.168.30.128) has been restarted on test3 (192.168.30.130).
192.168.30.130
```
# docker ps
CONTAINER ID   IMAGE            COMMAND                  CREATED         STATUS              PORTS   NAMES
05efd68e06c5   busybox:latest   "sh -c 'while true; …"   2 minutes ago   Up About a minute           busybox.2.jewllc9gywpaude65rmc87wka
629fe2d2b396   busybox:latest   "sh -c 'while true; …"   9 minutes ago   Up 9 minutes                busybox.4.b86rr94bbotj1fhyk7owwt2tl
```
Fault recovery keeps our services stable and available.
- Remove the service:
192.168.30.128
```
# docker service ls
ID             NAME      MODE         REPLICAS   IMAGE            PORTS
y8o6jogs0iyp   busybox   replicated   5/5        busybox:latest

# docker service rm busybox
busybox

# docker service ls
ID   NAME   MODE   REPLICAS   IMAGE   PORTS

# docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
```
192.168.30.129
```
# docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
```
192.168.30.130
```
# docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
```
Once a service is removed, its task containers are stopped and deleted as well.
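The docker service command reference above also lists update and rollback, which this walkthrough does not use. A minimal sketch, assuming a replicated service named busybox like the one above (the image tag is an illustrative assumption):

```bash
# Roll out a new image version; swarm replaces the tasks in rolling fashion.
docker service update --image busybox:1.31 busybox

# Watch the old tasks shut down and the replacements start.
docker service ps busybox

# Revert the service to its previous configuration if the update misbehaves.
docker service rollback busybox
```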
Docker Service Example
To deploy a project on a swarm cluster, the first thing to create is an overlay network, because the services that make up the project need to communicate with each other over it.
Creating an overlay network in a swarm cluster no longer requires an external distributed store (such as etcd); the swarm cluster synchronizes the overlay network state on its own.
Below we deploy a WordPress project using Docker Service; the project consists of two services: wordpress and mysql.
- Create the overlay network:
192.168.30.128
```
# docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
5ee1b278fd34   bridge            bridge    local
faf258f504b0   docker_gwbridge   bridge    local
535808221d2e   host              host      local
5yetwtzg2b1x   ingress           overlay   swarm
2addad8d8857   none              null      local

# docker network create -d overlay test
exyl7ksbeavt00c5ot0k66s2w

# docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
5ee1b278fd34   bridge            bridge    local
faf258f504b0   docker_gwbridge   bridge    local
535808221d2e   host              host      local
5yetwtzg2b1x   ingress           overlay   swarm
2addad8d8857   none              null      local
exyl7ksbeavt   test              overlay   swarm
```
- Create the mysql service:
192.168.30.128
```
# docker service create --name mysql --network test -e MYSQL_ROOT_PASSWORD=123456789 -e MYSQL_DATABASE=wordpress --mount type=volume,source=mysql_data,destination=/var/lib/mysql mysql:5.7
3bjl5kse0letilvkx0kltnfm9
overall progress: 1 out of 1 tasks
1/1: running   [==================================================>]
verify: Service converged

# docker service ls
ID             NAME    MODE         REPLICAS   IMAGE       PORTS
3bjl5kse0let   mysql   replicated   1/1        mysql:5.7

# docker service ps mysql
ID             NAME      IMAGE       NODE    DESIRED STATE   CURRENT STATE                ERROR   PORTS
d8ofaptycyzs   mysql.1   mysql:5.7   test3   Running         Running about a minute ago
```
- Create the wordpress service:
192.168.30.128
```
# docker service create --name wordpress --network test -p 80:80 -e WORDPRESS_DB_PASSWORD=123456789 -e WORDPRESS_DB_HOST=mysql wordpress
x96xdiazi4iupgvwl5oza4sx3
overall progress: 1 out of 1 tasks
1/1: running   [==================================================>]
verify: Service converged

# docker service ls
ID             NAME        MODE         REPLICAS   IMAGE              PORTS
3bjl5kse0let   mysql       replicated   1/1        mysql:5.7
x96xdiazi4iu   wordpress   replicated   1/1        wordpress:latest   *:80->80/tcp

# docker service ps wordpress
ID             NAME          IMAGE              NODE    DESIRED STATE   CURRENT STATE            ERROR   PORTS
8b9g1upsll15   wordpress.1   wordpress:latest   test1   Running         Running 36 seconds ago
```
As shown, the mysql service's task container runs on test3 (192.168.30.130), while the wordpress service's task container runs on test1 (192.168.30.128).
- View the containers:
192.168.30.128
```
# docker ps
CONTAINER ID   IMAGE              COMMAND                  CREATED              STATUS              PORTS    NAMES
e189db13cbe3   wordpress:latest   "docker-entrypoint.s…"   About a minute ago   Up About a minute   80/tcp   wordpress.1.8b9g1upsll15m8crlhkebgc64
```
192.168.30.130
```
# docker ps
CONTAINER ID   IMAGE       COMMAND                  CREATED         STATUS         PORTS                 NAMES
6be947ccfe36   mysql:5.7   "docker-entrypoint.s…"   3 minutes ago   Up 3 minutes   3306/tcp, 33060/tcp   mysql.1.d8ofaptycyzs01ckt6k2zhxd7
```
- Access from a browser:
Open a browser and go to 192.168.30.128.
Fill in the setup information and log in; you can then take a look at the WordPress site.
Using docker service together with the overlay network, the WordPress project has been deployed on the swarm cluster successfully.
Routing Mesh
- Access from a browser:
Still in the browser, now visit 192.168.30.129 and 192.168.30.130: the WordPress site is reachable there as well, even though no wordpress container runs on those nodes.
Why is that? This is the routing mesh at work: if a service publishes a port, the service can be reached on that port through any node in the swarm.
- Example:
Create the whoami service:
```
# docker service create --name whoami --network test -p 8000:8000 jwilder/whoami
ckgqv2okq5wdscgv0pihk2hwr
overall progress: 1 out of 1 tasks
1/1: running   [==================================================>]
verify: Service converged

# docker service scale whoami=3
whoami scaled to 3
overall progress: 3 out of 3 tasks
1/3: running   [==================================================>]
2/3: running   [==================================================>]
3/3: running   [==================================================>]
verify: Service converged

# docker service ls
ID             NAME        MODE         REPLICAS   IMAGE                   PORTS
3bjl5kse0let   mysql       replicated   1/1        mysql:5.7
ckgqv2okq5wd   whoami      replicated   3/3        jwilder/whoami:latest   *:8000->8000/tcp
x96xdiazi4iu   wordpress   replicated   1/1        wordpress:latest        *:80->80/tcp

# docker service ps whoami
ID             NAME       IMAGE                   NODE    DESIRED STATE   CURRENT STATE                ERROR   PORTS
ugndxn4ojkw4   whoami.1   jwilder/whoami:latest   test2   Running         Running 2 minutes ago
iv3cu975hpr4   whoami.2   jwilder/whoami:latest   test1   Running         Running about a minute ago
qc5kp773iof7   whoami.3   jwilder/whoami:latest   test3   Running         Running about a minute ago
```
Create the busybox service:
```
# docker service create --name busybox --network test busybox sh -c "while true; do sleep 3600; done"
bxy6hzvrfoxy28yyagzvkfqmf
overall progress: 1 out of 1 tasks
1/1: running   [==================================================>]
verify: Service converged

# docker service ps busybox
ID             NAME        IMAGE            NODE    DESIRED STATE   CURRENT STATE            ERROR   PORTS
rewjvfn34qq9   busybox.1   busybox:latest   test2   Running         Running 23 seconds ago
```
192.168.30.129
```
# docker ps
CONTAINER ID   IMAGE                   COMMAND                  CREATED              STATUS              PORTS      NAMES
933d4e916cbd   busybox:latest          "sh -c 'while true; …"   About a minute ago   Up About a minute              busybox.1.rewjvfn34qq9tbrn2dty4e6vf
7b28da1c9491   jwilder/whoami:latest   "/app/http"              5 minutes ago        Up 5 minutes        8000/tcp   whoami.1.ugndxn4ojkw42u6y22rfdjlii

# docker exec -it busybox.1.rewjvfn34qq9tbrn2dty4e6vf sh
/ # ping whoami
PING whoami (10.0.0.17): 56 data bytes
64 bytes from 10.0.0.17: seq=0 ttl=64 time=0.192 ms
64 bytes from 10.0.0.17: seq=1 ttl=64 time=0.353 ms
64 bytes from 10.0.0.17: seq=2 ttl=64 time=0.144 ms
64 bytes from 10.0.0.17: seq=3 ttl=64 time=0.144 ms
^C
--- whoami ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.144/0.208/0.353 ms
```
We started three whoami task containers earlier, so why does pinging whoami from inside the busybox container always return just one IP?
```
/ # nslookup whoami
Server:    127.0.0.11
Address:   127.0.0.11:53

Non-authoritative answer:
*** Can't find whoami: No answer

/ # nslookup tasks.whoami 127.0.0.11
Server:    127.0.0.11
Address:   127.0.0.11:53

Non-authoritative answer:
Name:      tasks.whoami
Address:   10.0.0.18
Name:      tasks.whoami
Address:   10.0.0.21
Name:      tasks.whoami
Address:   10.0.0.20

*** Can't find tasks.whoami: No answer
```
This time we see three IPs: 10.0.0.18, 10.0.0.21, and 10.0.0.20. Entering each whoami container and checking its IP confirms that these three addresses are the real container IPs, while the 10.0.0.17 returned by ping whoami is a virtual IP (VIP).
For a given service the VIP normally does not change; what changes as the service is scaled are the task container IPs behind the VIP.
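The VIP that a service has been assigned on each attached network can be read with docker service inspect from a manager node; a short sketch using the whoami service above:

```bash
# Show the VIP(s) assigned to the whoami service, one per attached network.
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' whoami
```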
Pick any node and run:
```
# curl 127.0.0.1:8000
I'm ae9da507e9f7
# curl 127.0.0.1:8000
I'm 5e4782fd9dee
# curl 127.0.0.1:8000
I'm 7b28da1c9491
# curl 127.0.0.1:8000
I'm ae9da507e9f7
# curl 127.0.0.1:8000
I'm 5e4782fd9dee
# curl 127.0.0.1:8000
I'm 7b28da1c9491
```
The repeated curl calls return a different hostname each time, cycling through the tasks in round-robin fashion, which gives us load balancing.
Internally, the routing mesh is implemented on top of LVS (Linux Virtual Server), and the VIP is what the load balancing is built around.
- The two forms of the routing mesh:
  - Internal: container-to-container traffic goes over the overlay network (via the VIP)
  - Ingress: if a service publishes a port, the service is reachable on that port through any swarm node
- What the routing mesh provides:
  - load balancing for external access
  - the service port is exposed on every swarm node
  - internal load balancing via LVS
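For contrast, a port can also be published in host mode, which bypasses the ingress routing mesh: the service is then reachable only on the nodes that actually run one of its tasks. A hedged sketch; the service name and ports are illustrative assumptions:

```bash
# Publish the container's port 8000 directly on the host as 8001,
# bypassing the ingress routing mesh (no VIP load balancing across nodes).
docker service create --name whoami-host \
  --publish mode=host,published=8001,target=8000 \
  jwilder/whoami
```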