Overview
Here is what happened: I run a Docker Swarm cluster on Raspberry Pis. The manager's original IP was 192.168.0.113; after I moved house it became 192.168.11.113, and as you can guess, the worker nodes could no longer reach the manager. So I removed a node from the cluster and tried to rejoin it, but the join-token command at that point still looked like this:
╰─ docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0shhd0b7uwajhgymxgpp1nv5u17jvcup9vvmhnqkg77ds57e5h-57e7hvjaxaagxtxddz416q5z2 192.168.0.113:2377
At the time I simply changed the trailing IP in that command to 192.168.11.113 and rejoined the node. It did join, but for some reason the node kept connecting to 192.168.0.113, which produced these errors:
╰─ tail -f daemon.log
Jul 4 08:07:16 pi-slave dockerd[21221]: time="2018-07-04T08:07:16.316088897Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:16 pi-slave dockerd[21221]: time="2018-07-04T08:07:16.317349637Z" level=info msg="Failed to dial 192.168.0.113:2377: grpc: the connection is closing; please retry." module=grpc
Jul 4 08:07:19 pi-slave dockerd[21221]: time="2018-07-04T08:07:19.316151478Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.318750223Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.319398354Z" level=error msg="agent: session failed" backoff=8s error="rpc error: code = Unavailable desc = grpc: the connection is unavailable" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
Jul 4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.319955650Z" level=info msg="manager selected by agent for new session: {h6i2x0hals6za31uya16zlmli 192.168.0.113:2377}" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
Jul 4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.320147058Z" level=info msg="waiting 7.379239722s before registering session" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
Jul 4 08:07:25 pi-slave dockerd[21221]: time="2018-07-04T08:07:25.315780534Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:25 pi-slave dockerd[21221]: time="2018-07-04T08:07:25.318872383Z" level=info msg="Failed to dial 192.168.0.113:2377: grpc: the connection is closing; please retry." module=grpc
Jul 4 08:07:25 pi-slave dockerd[21221]: time="2018-07-04T08:07:25.318573109Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:28 pi-slave dockerd[21221]: time="2018-07-04T08:07:28.316269050Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.331695034Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
Jul 4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.332270925Z" level=error msg="agent: session failed" backoff=8s error="rpc error: code = Unavailable desc = grpc: the connection is unavailable" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
Jul 4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.332991816Z" level=info msg="manager selected by agent for new session: {h6i2x0hals6za31uya16zlmli 192.168.0.113:2377}" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
Jul 4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.333203172Z" level=info msg="waiting 1.698796755s before registering session" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
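The useful signal buried in this wall of retries is the address the agent keeps dialing. As a small, self-contained sketch, the stale manager address can be pulled out of a captured log line like so (the sample line is copied from the output above):

```shell
# Extract the manager address the agent is still dialing.
# (the sample line is copied from this post's daemon.log output)
log_line='msg="Failed to dial 192.168.0.113:2377: grpc: the connection is closing; please retry."'
stale_addr=$(printf '%s\n' "$log_line" | grep -oE '[0-9]+(\.[0-9]+){3}:2377' | head -n 1)
echo "node is still dialing: $stale_addr"
```

Here it confirms the node is stuck on 192.168.0.113:2377 even though it was joined with the new address.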
I then asked around in some QQ groups and on Facebook, but nobody could offer a definite fix. Regenerating the join token did not help either:
╭─root@pi-master /etc/default
╰─ docker swarm join-token --rotate worker
Successfully rotated worker join token.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0shhd0b7uwajhgymxgpp1nv5u17jvcup9vvmhnqkg77ds57e5h-34adsf2isqqqnn7gd5hnhumdh 192.168.0.113:2377
The command it printed still used 192.168.0.113. At that point I was out of options; since this was only a test environment, I settled for the brute-force fix of tearing the cluster down and rebuilding it.
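This behavior makes sense once you look at the token itself: a join token has the form SWMTKN-1-&lt;CA certificate digest&gt;-&lt;secret&gt;, and the manager address is not encoded in it at all; the CLI simply appends the advertise address stored in the swarm's state. Rotating replaces only the trailing secret, as comparing the two tokens printed above shows:

```shell
# Tokens copied verbatim from the join-token output above.
old_token='SWMTKN-1-0shhd0b7uwajhgymxgpp1nv5u17jvcup9vvmhnqkg77ds57e5h-57e7hvjaxaagxtxddz416q5z2'
new_token='SWMTKN-1-0shhd0b7uwajhgymxgpp1nv5u17jvcup9vvmhnqkg77ds57e5h-34adsf2isqqqnn7gd5hnhumdh'
# Strip the last "-" segment (the rotating secret); what remains is identical.
old_prefix=${old_token%-*}
new_prefix=${new_token%-*}
echo "$old_prefix"
echo "$new_prefix"
```

So rotating the token can never fix a stale advertise address; that address lives in the swarm's raft state, not in the token.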
Rebuilding the cluster
First I had both the workers and the manager leave the cluster. The workers first:
docker swarm leave --force
Then the manager:
docker swarm leave --force
Since nothing important was running on these machines, I could afford to do it this way.
Then re-initialize the cluster:
╭─root@pi-master /etc/default
╰─ docker swarm init
Swarm initialized: current node (wylbqh0q0p0yromn6hh44jrvx) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-0n8n2vbtnuksw78opuex5hhtgffzn36dbii7u1dp5rrul8z85p-3buc8f7h2n8st570hcx9lwthk 192.168.11.113:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
As you can see, the address above is now 192.168.11.113.
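If you do not want to rely on address auto-detection, the advertise address can be pinned explicitly at init time. A hedged sketch, shown as a dry run with echo (drop the echo on a real manager):

```shell
# Assumption: 192.168.11.113 is the manager's new LAN address.
new_addr=192.168.11.113
init_cmd="docker swarm init --advertise-addr ${new_addr}:2377"
echo "$init_cmd"
```

Pinning the address makes the printed join command deterministic instead of depending on whichever interface Docker detects first.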
Joining the node back into the cluster
docker swarm join --token SWMTKN-1-0n8n2vbtnuksw78opuex5hhtgffzn36dbii7u1dp5rrul8z85p-3buc8f7h2n8st570hcx9lwthk 192.168.11.113:2377
Finally, check the Docker logs:
╭─root@pi-slave /var/log
╰─ tail -f daemon.log
Jul 4 08:34:35 pi-slave dhcpcd[517]: docker_gwbridge: using IPv4LL address 169.254.160.155
Jul 4 08:34:35 pi-slave avahi-daemon[306]: Registering new address record for 169.254.160.155 on docker_gwbridge.IPv4.
Jul 4 08:34:35 pi-slave dhcpcd[517]: docker_gwbridge: adding route to 169.254.0.0/16
Jul 4 08:34:35 pi-slave dhcpcd[517]: veth06a7c80: using IPv4LL address 169.254.142.194
Jul 4 08:34:35 pi-slave avahi-daemon[306]: Joining mDNS multicast group on interface veth06a7c80.IPv4 with address 169.254.142.194.
Jul 4 08:34:35 pi-slave avahi-daemon[306]: New relevant interface veth06a7c80.IPv4 for mDNS.
Jul 4 08:34:35 pi-slave avahi-daemon[306]: Registering new address record for 169.254.142.194 on veth06a7c80.IPv4.
Jul 4 08:34:35 pi-slave dhcpcd[517]: veth06a7c80: adding route to 169.254.0.0/16
Jul 4 08:34:38 pi-slave dhcpcd[517]: veth06a7c80: no IPv6 Routers available
Jul 4 08:34:38 pi-slave dhcpcd[517]: docker_gwbridge: no IPv6 Routers available
OK, everything looks normal.
Note that doing it this way destroyed all of my containers; it was essentially a fresh install of the cluster, so it is definitely not the best solution. If anyone knows a proper way to handle this, please contact me and let me know.
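For what it is worth, Docker does document a recovery path that I did not try here: re-initializing a single-manager swarm from the existing state with --force-new-cluster, which is meant to preserve services rather than wipe them. Treat the following as an untested sketch (again a dry run with echo), not something verified on this cluster:

```shell
# Untested assumption: run on the old manager after its IP has changed,
# so the swarm state is kept but re-advertised at the new address.
recover_cmd="docker swarm init --force-new-cluster --advertise-addr 192.168.11.113:2377"
echo "$recover_cmd"
```

After that, workers would still need to leave and rejoin with a fresh join command, but the manager's services would survive.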
After that, containers can be created again:
docker network create --driver=overlay --subnet=192.168.12.1/24 visualizer
One thing I only noticed now: after you create an overlay network on the manager, the network will not show up on a worker node until you create a container on that network there (or do something else that involves that worker). Overlay networks are only extended to a node when a task on that node actually needs them.
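A quick way to observe this lazy behavior is to list overlay networks on both the manager and a worker. The sketch below only echoes the command to run on each side (note that the built-in ingress network still appears on workers; it is the custom visualizer network that stays hidden until a task uses it):

```shell
# Dry run: the same listing command, run on the manager and then on a worker.
list_cmd='docker network ls --filter driver=overlay'
echo "on manager: $list_cmd   # visualizer appears immediately"
echo "on worker:  $list_cmd   # visualizer appears only once a task here uses it"
```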
Creating the container
docker service create --name visualizer --replicas=3 --publish=8088:8080 --network=visualizer --mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock alexellis2/visualizer-arm
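Once the service is up, replica placement and the published port can be checked from the manager. A dry-run sketch of the usual follow-up commands, assuming the manager address from above (with Swarm's routing mesh, port 8088 answers on every node):

```shell
# Dry run: commands to confirm the three replicas and the published port.
ps_cmd='docker service ps visualizer'
curl_cmd='curl -s -o /dev/null -w "%{http_code}\n" http://192.168.11.113:8088/'
echo "$ps_cmd"
echo "$curl_cmd"
```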
Feel free to follow Bboysoul's blog at www.bboysoul.com
Have Fun