Fixing "Unable to start missing node group" when restarting MySQL Cluster after a network outage

2023-08-16 11:50:52

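As is well known, MySQL Cluster is also called NDB Cluster, and NDB stands for Network Database: the database nodes rely on the network to communicate with each other and to keep data partitions consistent. Today a failed switch in the server room caused all 4 data nodes of the cluster (2 replicas) to shut down. After the network was restored, restarting them produced the following error: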

"2016-11-03 16:37:40 [ndbd] INFO -- Unable to start missing node group! starting: 0000000000000002 (missing fs for: 0000000000000000)

2016-11-03 16:37:40 [ndbd] INFO -- QMGR (Line: 1872) 0x00000002

2016-11-03 16:37:40 [ndbd] INFO -- Error handler shutting down system

2016-11-03 16:37:40 [ndbd] INFO -- Error handler shutdown completed - exiting

2016-11-03 16:37:41 [ndbd] ALERT -- Node 1: Forced node shutdown completed. Occured during startphase 1. Caused by error 2353: 'Insufficent nodes for system restart(Restart error). Temporary error, restart node'."

A web search suggested the problem may be related to this reported bug: https://bugs.mysql.com/bug.php?id=22316. The explanation given there:

system restart fails as you dont start all 4 nodes fast enough...

With default setting you have 30s for allowing nodes to get in contact with each other.

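The actual cause: the data nodes were started too far apart in time, so too few data nodes were present for the system restart and the cluster could not come up.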
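To make the failure condition concrete, here is a minimal Python sketch of the check implied by the log message: a system restart needs at least one data node from every node group to report in before the wait expires. The node IDs, the split of four nodes into two groups, and the function name are illustrative assumptions, not NDB internals; the 30 s value is the default mentioned in the bug report comment.

```python
# Simplified model only; real NDB startup logic tracks more state than this.
WAIT_SECONDS = 30  # the "30s" default mentioned in the bug report comment

# Assumed layout: 4 data nodes, 2 replicas -> 2 node groups (IDs are hypothetical).
NODE_GROUPS = {0: {1, 2}, 1: {3, 4}}


def missing_node_groups(start_offsets, node_groups=NODE_GROUPS, wait=WAIT_SECONDS):
    """Return IDs of node groups with no node started inside the wait window.

    start_offsets maps node ID -> seconds after the first node was started.
    """
    present = {node for node, offset in start_offsets.items() if offset <= wait}
    return sorted(g for g, members in node_groups.items() if not members & present)


# Nodes started by hand, one terminal at a time: group 1 arrives too late.
print(missing_node_groups({1: 0, 2: 20, 3: 45, 4: 70}))  # [1]
# All four started together: no group is missing.
print(missing_node_groups({1: 0, 2: 1, 3: 1, 4: 2}))  # []
```

In this model, any non-empty result corresponds to the "missing node group" situation in the log, where the waiting node gives up and shuts down.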

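Finally, I used Xshell's "To all sessions" feature to send the ndbmtd command to all four nodes at once, starting them simultaneously, and the cluster returned to normal operation.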
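If Xshell is not available, the same simultaneous start can be scripted. The following is a rough sketch, not what was actually run here: it assumes passwordless ssh to each data node host, that `ndbmtd` is on the remote PATH and picks up its connect string from the local configuration, and the host names are hypothetical placeholders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical host names; replace with the real data node hosts.
DATA_NODE_HOSTS = ["ndb1", "ndb2", "ndb3", "ndb4"]


def build_command(host, daemon="ndbmtd"):
    """Build the ssh argv that starts the data node daemon on one host."""
    return ["ssh", host, daemon]


def run_ssh(argv):
    """Run one ssh command and return its exit code."""
    return subprocess.run(argv, check=False).returncode


def start_all(hosts, runner=run_ssh):
    """Start the daemon on every host concurrently; return {host: exit_code}."""
    # One worker per host so that no start waits behind another.
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        codes = pool.map(runner, [build_command(h) for h in hosts])
        return dict(zip(hosts, codes))
```

Calling `start_all(DATA_NODE_HOSTS)` issues all four ssh commands at nearly the same moment, which keeps the start times well inside the 30 s window. The `runner` parameter exists only so the function can be exercised without touching real hosts.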

--EOF--
