MySQL Group Replication (MGR) is a high-availability and high-scalability solution officially released by MySQL in December 2016. MGR is a group replication technology built on native replication and the Paxos protocol, and is delivered as a plugin. It can run in multi-primary or single-primary mode: in single-primary mode a primary is elected automatically and all updates are executed on that primary, while in multi-primary mode every server can process updates concurrently. Below we set up an MGR cluster in single-primary mode.
Environment
| Node  | IP              | Port | MySQL Version |
| ----- | --------------- | ---- | ------------- |
| Node2 | 192.168.150.133 | 3306 | 5.7.25        |
| Node1 | 192.168.150.134 | 3306 | 5.7.25        |
| Node3 | 192.168.150.135 | 3306 | 5.7.25        |
Modify the configuration file
gtid_mode=on                                  # enable GTID-based transactions
enforce_gtid_consistency=on                   # enforce GTID consistency
master_info_repository = TABLE                # store master info metadata in a system table
relay_log_info_repository = TABLE             # store relay log metadata in a system table
binlog_checksum = NONE                        # disable binary log checksums
log_slave_updates = ON                        # replicas also log replicated changes to their own binary log
binlog-format=row                             # log in row format
# Collect write sets and record them in the binary log as the server collects them.
# A write set is based on each row's primary key and uniquely identifies the changed
# row; it is used to detect conflicts.
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = '3773a86e-0def-11eb-a478-000c29a841a1'   # group name (must be a valid UUID)
loose-group_replication_start_on_boot = off   # do not start group replication at boot, to avoid bootstrapping a second group with the same name on every restart
loose-group_replication_bootstrap_group = off # same reason as above
loose-group_replication_local_address = "192.168.150.134:33061"               # this host's own IP and group communication port
loose-group_replication_group_seeds = "192.168.150.134:33061,192.168.150.133:33061,192.168.150.135:33061"
loose-group_replication_single_primary_mode = true                            # enable single-primary mode
loose-group_replication_enforce_update_everywhere_checks = false              # update-everywhere checks are only needed in multi-primary mode, so keep them off here
Create the replication user
Run the following on every node. It is recommended to disable binary logging for the session first, so the grants are not written to the binlog:
mysql> set sql_log_bin=0;
mysql> grant replication slave on *.* to repl@'%' identified by 'repl';
mysql> set sql_log_bin=1;
Install the group replication plugin
mysql> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
# check that the plugin is installed
mysql> show plugins;
+-------------------+--------+-------------------+----------------------+-------------+
| Name              | Status | Type              | Library              | License     |
+-------------------+--------+-------------------+----------------------+-------------+
| group_replication | ACTIVE | GROUP REPLICATION | group_replication.so | PROPRIETARY |
+-------------------+--------+-------------------+----------------------+-------------+
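One optional sanity check beyond the original steps: once the plugin is loaded, the loose-prefixed options from the configuration file should take effect as real system variables, so you can confirm the values before starting replication:

mysql> SHOW VARIABLES LIKE 'group_replication%';

In particular, group_replication_group_name, group_replication_local_address and group_replication_group_seeds should match what was configured above.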
Start the MGR cluster
Build the group replication cluster
Configure the recovery channel credentials. When a node needs to recover state from another member, it uses the credentials of the group_replication_recovery replication channel. Distributed recovery is the first step a server performs when joining a group; if these credentials are not set correctly, the server cannot run the recovery process to synchronize with the other members and will ultimately fail to join the group.
Run on every node:
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='repl' FOR CHANNEL 'group_replication_recovery';
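Because master_info_repository = TABLE was set earlier, the channel credentials are persisted in the mysql.slave_master_info system table. As an optional check, not part of the original steps, you can confirm the channel row exists:

mysql> SELECT Channel_name, User_name FROM mysql.slave_master_info WHERE Channel_name = 'group_replication_recovery';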
On the primary host (node2), run:
mysql> SET GLOBAL group_replication_bootstrap_group = ON;
mysql> START GROUP_REPLICATION;
mysql> SET GLOBAL group_replication_bootstrap_group = OFF;
Check the cluster state:
select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | a7828aa6-b5c3-11ea-8308-000c29a841a1 | node2       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
Add the remaining nodes
To add node1, run on node1:
mysql> START GROUP_REPLICATION;
To add node3, run on node3:
mysql> START GROUP_REPLICATION;
Check the cluster state:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 3118d7cb-67ca-11e9-9baa-000c29d7c489 | node3       | 3306        | ONLINE       |
| group_replication_applier | 7e6667ab-903a-11ea-ae4a-000c29d7c488 | node1       | 3306        | ONLINE       |
| group_replication_applier | a7828aa6-b5c3-11ea-8308-000c29a841a1 | node2       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
Verification
Run the following commands on the primary, node2, to verify replication:
mysql> create database test;
Query OK, 1 row affected (0.33 sec)
mysql> use test
Database changed
mysql> create table t(id int primary key auto_increment,name varchar(30),addr varchar(30));
Query OK, 0 rows affected (0.07 sec)

Then check that the database and table have replicated to the other two nodes.
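As an extra verification beyond the original steps, you can insert a row on the primary and read it back on a secondary (the sample values here are arbitrary):

-- on node2 (the primary)
mysql> insert into t(name,addr) values ('tom','beijing');

-- on node1 or node3
mysql> select * from test.t;

A write attempted on node1 or node3 should instead fail with a read-only error, since secondaries in single-primary mode run with super_read_only enabled.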
Simulate a primary failure
Stop the primary:
service mysqld stop
Check the group members:
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 3118d7cb-67ca-11e9-9baa-000c29d7c489 | node3       | 3306        | ONLINE       |
| group_replication_applier | 7e6667ab-903a-11ea-ae4a-000c29d7c488 | node1       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
Check which server has become the new primary. In MySQL 8.0 the replication_group_members table itself shows which node is the primary (via its MEMBER_ROLE column); on 5.7, query global_status instead:
mysql> select * from performance_schema.global_status where variable_name='group_replication_primary_member';
+-----------------------------------+--------------------------------------+
| VARIABLE_NAME                     | VARIABLE_VALUE                       |
+-----------------------------------+--------------------------------------+
| group_replication_primary_member  | 3118d7cb-67ca-11e9-9baa-000c29d7c489 |
+-----------------------------------+--------------------------------------+
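The returned UUID matches node3's MEMBER_ID in the member list above, so node3 is the new primary. On 5.7 a small join (a convenience query, not from the original text) resolves the UUID to a hostname directly:

mysql> SELECT m.MEMBER_HOST, m.MEMBER_PORT
    ->   FROM performance_schema.replication_group_members m
    ->   JOIN performance_schema.global_status s
    ->     ON s.VARIABLE_VALUE = m.MEMBER_ID
    ->  WHERE s.VARIABLE_NAME = 'group_replication_primary_member';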
Troubleshooting
1. node3 reported the following error when joining the group: [ERROR] Slave I/O for channel '': error connecting to master 'repl@192.168.3.117:3306' - retry-time: 60 retries: 16, Error_code: 2003
Cause: this node had previously served as a replica in a different cluster.
Fix: run stop slave; reset slave; to clear the old cluster's replication information. (If the stale master settings survive, reset slave all also discards the stored connection parameters.)
2. A secondary node already contained data, so the nodes were inconsistent when the cluster was built.
Fix: first wipe the secondary's data, then back up the data from the primary and restore it on the secondary. Either mysqldump or Xtrabackup can be used for the backup and restore; a sketch follows.
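A minimal mysqldump-based sketch of that procedure, assuming root access and an illustrative dump file name. --set-gtid-purged=ON writes the primary's executed GTID set into the dump so the restored node's GTID state stays consistent:

# on the primary (node2): take a consistent full backup
mysqldump -uroot -p --all-databases --single-transaction \
  --triggers --routines --events --set-gtid-purged=ON > full_backup.sql

# on the secondary: GTID_PURGED can only be set while the local GTID state is empty
mysql -uroot -p -e "RESET MASTER;"
mysql -uroot -p < full_backup.sql

After the restore, re-run the CHANGE MASTER ... FOR CHANNEL 'group_replication_recovery' step on that node and then START GROUP_REPLICATION.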