1.前言
之前主要讲解了关于MGR的一些工作原理以及限制,那么今天这里主要操作MGR
2.环境准备
192.168.31.201:3307
192.168.31.201:3308
192.168.31.201:3309
说明:这里主要搭建的是单机多实例单主模式的MGR环境
3.配置文件参数(必须要有的)
1 [mysqld] 2 datadir=/data 3 socket=/data/mysql.sock 4 5 server-id=100 # 必须 6 gtid_mode=on # 必须 7 enforce_gtid_consistency=on # 必须 8 log-bin=/data/master-bin # 必须 9 binlog_format=row # 必须 10 binlog_checksum=none # 必须 11 master_info_repository=TABLE # 必须 12 relay_log_info_repository=TABLE # 必须 13 relay_log=/data/relay-log # 必须,如果不给,将采用默认值 14 log_slave_updates=ON # 必须 15 sync-binlog=1 # 建议 16 log-error=/data/error.log 17 pid-file=/data/mysqld.pid 18 19 transaction_write_set_extraction=XXHASH64 # 必须 20 loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" # 必须 21 loose-group_replication_start_on_boot=off # 建议设置为OFF 22 loose-group_replication_member_weigth = 40 # 非必需,mysql 5.7.20才开始支持该选项 23 loose-group_replication_local_address="192.168.31.201:33071" # 必须,下一行也必须 24 loose-group_replication_group_seeds="192.168.31.201:33071,192.168.201.33081,192.168.31.201:33091" ##这里填写组成复制组的所有ip地址和端口哈,注意这里的端口号和Mysql的端口号是不一样的
说明:
想要使用组复制,要求还是挺多的。分析一下上面的配置选项:
- (1).因为组复制基于GTID,所以必须开启gtid_mode和enforce_gtid_consistency。
- (2).组复制必须开启二进制日志,且必须设置为行格式的二进制日志,这样才能从日志记录中收集信息且保证数据一致性。所以设置log_bin和binlog_format。
- (3).由于MySQL对复制事件校验的设计缺陷,组复制不能对他们校验,所以设置binlog_checksum=none。
- (4).组复制要将master和relay log的元数据写入到mysql.slave_master_info和mysql.slave_relay_log_info中。
- (5).组中的每个节点都保留了完整的数据副本,它是share-nothing的模式。所以所有节点上都必须开启log_slave_updates,这样新节点随便选哪个作为donor都可以进行异步复制。
- (6).sync_binlog是为了保证每次事务提交都立刻将binlog刷盘,保证出现故障也不丢失日志。
- (7).最后的6行是组复制插件的配置。以
loose_
开头表示即使启动组复制插件,MySQL也继续正常允许下去。这个前缀是可选的。 - (8).倒数第6行表示写集合以
XXHASH64
的算法进行hash。所谓写集,是对事务中所修改的行进行的唯一标识,在后续检测并发事务之间是否修改同一行冲突时使用。它基于主键生成,所以使用组复制,表中必须要有主键。 - (9).倒数第5行表示这个复制组的名称。它必须是一个有效的UUID值。嫌可以直接和上面一样全写字母a。在Linux下,可以使用
uuidgen
工具来生成UUID值。
[root@xuexi ~]# uuidgen 09c38ef2-7d81-463e-bdb4-9459b2c0e49b
- (10).倒数第4行表示组复制功能不随MySQL实例启动而启动。虽然,可以将组复制插件和启动组复制功能的选项写在配置文件里,但强烈建议不要如此,而是每次手动去配置。
- (11).倒数第3行表示该节点在组中的权重为40。权重越高,自动选举为primary节点的优先级就越高。
- (12).倒数第2行表示本机上用于组内各节点之间通信的地址和端口。
- (13).最后一行,设置本组的种子节点。种子节点的意义在前文已经解释过了。
4.部署
Master上操作:
#首先创建复制用户,并授予replication slave权限 SET SQL_LOG_BIN=0; CREATE USER rpl_user@'%'; GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%' IDENTIFIED BY 'rpl_pass'; FLUSH PRIVILEGES; SET SQL_LOG_BIN=1; #创建一个复制通道channel CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery'; #安装group_replication的插件plugin INSTALL PLUGIN group_replication SONAME 'group_replication.so'; # 设置group_replication_bootstrap_group为ON是为了标示以后加入集群的服务器以这台服务器为基准,以后加入的就不需要设置 SET GLOBAL group_replication_bootstrap_group=ON; #开启group_replication
set global slave_preserve_commit_order=1; ##这里需要注意,如果当开启组复制的时候报错,那么需要在开启之前执行该行命令
start group_replication;
SET GLOBAL group_replication_bootstrap_group=OFF;
#查看MGR的状态 select *from performance_schema.replication_group_members;
Slave上操作
#首先创建复制用户,并授予replication slave权限 SET SQL_LOG_BIN=0; CREATE USER rpl_user@'%'; GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%' IDENTIFIED BY 'rpl_pass'; FLUSH PRIVILEGES; SET SQL_LOG_BIN=1; #创建一个复制通道channel CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='rpl_pass' FOR CHANNEL 'group_replication_recovery'; #安装group_replication的插件plugin INSTALL PLUGIN group_replication SONAME 'group_replication.so'; # 这里不再需要开启group_replication_bootstrap_group,由于复制组已经被创建了,只需要将第二个节点添加进去即可 set global group_replication_allow_local_disjoint_gtids_join=on; #开启group_replication start group_replication; SET GLOBAL group_replication_bootstrap_group=OFF; #查看MGR的状态 select *from performance_schema.replication_group_members;
特别需要注意的是,Master配置中,需要将参数group_replication_bootstrap_group设置为on,设置group_replication_bootstrap_group为ON是为了标示以后加入集群的服务器以这台服务器为基准,以后加入的就不需要设置,而Slave中需要将group_replication_allow_local_disjoint_gtids_join设置为on,允许当前服务器加入该组,即使该组中没有事务。如果不添加这个参数,日志中将会给出下面的提示:Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option',这一点是Master和Slave搭建时候的重要区别。
5.性能测试
5.1查看状态
root@localhost 17:53: [(none)]> select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+----------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+----------------+-------------+--------------+ | group_replication_applier | 469bb8bb-04bf-11ec-9900-000c29395ab1 | 192.168.31.201 | 3309 | ONLINE | | group_replication_applier | 5a412d15-04bc-11ec-95d2-000c29395ab1 | 192.168.31.201 | 3307 | ONLINE | | group_replication_applier | 5fba9da1-04bd-11ec-a1d9-000c29395ab1 | 192.168.31.201 | 3308 | ONLINE | +---------------------------+--------------------------------------+----------------+-------------+--------------+
5.2 切换测试
查看切换之前的主节点
root@localhost 18:32: [(none)]> select variable_name,member_id,member_host,member_port from performance_schema.global_status a,performance_schema.replication_group_members b where a.variable_value=b.member_id; +----------------------------------+--------------------------------------+----------------+-------------+ | variable_name | member_id | member_host | member_port | +----------------------------------+--------------------------------------+----------------+-------------+ | group_replication_primary_member | 5a412d15-04bc-11ec-95d2-000c29395ab1 | 192.168.31.201 | 3307 |
停止主节点3307,可以看到其状态是offline
root@localhost 18:33: [(none)]> stop group_replication; Query OK, 0 rows affected (9.39 sec) root@localhost 18:33: [(none)]> root@localhost 18:33: [(none)]> root@localhost 18:33: [(none)]> select *from performance_schema.replication_group_members; +---------------------------+--------------------------------------+----------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+----------------+-------------+--------------+ | group_replication_applier | 5a412d15-04bc-11ec-95d2-000c29395ab1 | 192.168.31.201 | 3307 | OFFLINE | +---------------------------+--------------------------------------+----------------+-------------+--------------+
在从节点上查看当前group的状态:
root@localhost 18:34: [(none)]> select *from performance_schema.replication_group_members; +---------------------------+--------------------------------------+----------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+----------------+-------------+--------------+ | group_replication_applier | 469bb8bb-04bf-11ec-9900-000c29395ab1 | 192.168.31.201 | 3309 | ONLINE | | group_replication_applier | 5fba9da1-04bd-11ec-a1d9-000c29395ab1 | 192.168.31.201 | 3308 | ONLINE | +---------------------------+--------------------------------------+----------------+-------------+--------------+
查看当前group的主节点,我们可以看到此时主节点发生了变化,3309升级成为主节点:
root@localhost 18:37: [(none)]> select variable_name,member_id,member_host,member_port from performance_schema.global_status a,performance_schema.replication_group_members b where a.variable_value=b.member_id; +----------------------------------+--------------------------------------+----------------+-------------+ | variable_name | member_id | member_host | member_port | +----------------------------------+--------------------------------------+----------------+-------------+ | group_replication_primary_member | 469bb8bb-04bf-11ec-9900-000c29395ab1 | 192.168.31.201 | 3309 | +----------------------------------+--------------------------------------+----------------+-------------+
恢复3307,重新加入到group_replication中,此时查看group_replication的状态
root@localhost 18:39: [(none)]> start group_replication; Query OK, 0 rows affected (36.03 sec) root@localhost 18:40: [(none)]> select *from performance_schema.replication_group_members; +---------------------------+--------------------------------------+----------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+----------------+-------------+--------------+ | group_replication_applier | 469bb8bb-04bf-11ec-9900-000c29395ab1 | 192.168.31.201 | 3309 | ONLINE | | group_replication_applier | 5a412d15-04bc-11ec-95d2-000c29395ab1 | 192.168.31.201 | 3307 | ONLINE | | group_replication_applier | 5fba9da1-04bd-11ec-a1d9-000c29395ab1 | 192.168.31.201 | 3308 | ONLINE | +---------------------------+--------------------------------------+----------------+-------------+--------------+
在查看group_replication中,主节点状态
root@localhost 18:41: [(none)]> select variable_name,member_id,member_host,member_port from performance_schema.global_status a,performance_schema.replication_group_members b where a.variable_value=b.member_id; +----------------------------------+--------------------------------------+----------------+-------------+ | variable_name | member_id | member_host | member_port | +----------------------------------+--------------------------------------+----------------+-------------+ | group_replication_primary_member | 469bb8bb-04bf-11ec-9900-000c29395ab1 | 192.168.31.201 | 3309 | +----------------------------------+--------------------------------------+----------------+-------------+
结论:当MGR中的主节点宕机时,会重新选择新的master,当旧的master恢复加入后,新master不会发生改变。
6.节点选举的问题
下面的切换测试中,我的主节点时3309,按照下面的权重值进行设置,当我们停掉3309的时候,系统会选举3308来作为新的master,因为它的权重值比3307要重,所以我们可以通过设置某个节点的权重来指定我们想要选举的新master,如下:
节点3307 mysql--root@localhost:(none) 18:47:24>>set global group_replication_member_weight=45; Query OK, 0 rows affected (0.00 sec) 节点3308 mysql--root@localhost:(none) 18:35:18>>set global group_replication_member_weight=50; Query OK, 0 rows affected (0.00 sec) 节点3309 mysql--root@localhost:(none) 18:24:46>>set global group_replication_member_weight=40; Query OK, 0 rows affected (0.00 sec)
注意:这里的3307是我们第一次搭建时的基准节点,因此当搭建完MGR后,它一般就是我们的主节点,而当主节点宕机后,系统会根据3308和3309的权重来进行选举新的主节点,那个权重值大就会选举那个节点作为主节点
root@localhost 18:52: [(none)]> select variable_name,member_id,member_host,member_port from performance_schema.global_status a,performance_schema.replication_group_members b where a.variable_value=b.member_id; +----------------------------------+--------------------------------------+----------------+-------------+ | variable_name | member_id | member_host | member_port | +----------------------------------+--------------------------------------+----------------+-------------+ | group_replication_primary_member | 5fba9da1-04bd-11ec-a1d9-000c29395ab1 | 192.168.31.201 | 3308 | +----------------------------------+--------------------------------------+----------------+-------------+ 1 row in set (0.00 sec) root@localhost 18:52: [(none)]> select *from performance_schema.replication_group_members; +---------------------------+--------------------------------------+----------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+----------------+-------------+--------------+ | group_replication_applier | 5a412d15-04bc-11ec-95d2-000c29395ab1 | 192.168.31.201 | 3307 | ONLINE | | group_replication_applier | 5fba9da1-04bd-11ec-a1d9-000c29395ab1 | 192.168.31.201 | 3308 | ONLINE | +---------------------------+--------------------------------------+----------------+-------------+--------------+
7.自增列测试
这个是主3307上的自增信息
root@localhost 23:14: [liulin]> show variables like '%auto_incr%'; +--------------------------------------------+-------+ | Variable_name | Value | +--------------------------------------------+-------+ | auto_increment_increment | 1 | | auto_increment_offset | 1 | | group_replication_auto_increment_increment | 7 | +--------------------------------------------+-------+
经过我的测试,这里应该主要看的是参数:auto_increment_increment 表示的是自增步长的配置信息,auto_increment_increment,在GROUP中范围在1-9(因为一个GROUP最多只能有9个组成员),GROUP中安装的时候,默认为7;
如果我们想要修改该自增步长时,需要我们先把复制给停掉,然后才能修改参数
8.节点的接入
启动另外一个服务,端口为3310,目录结构和group当中的节点保持一致,配置文件名称为my.cnf,需要注意的是s4中需要在loose-group_replication_local_address和loose-group_replication_group_seeds参数处添加该节点的信息。s4的配置文件如下: 该配置文件只是参考用的,实际中要修改成你自己的路径和数值
[mysqld] # server configuration datadir=/data/data_mgr/s4 basedir=/usr/local/mysql-5.7.25-linux-glibc2.12-x86_64 port=24804 socket=/data/data_mgr/s4/s4.sock server_id=24804 gtid_mode=ON enforce_gtid_consistency=ON master_info_repository=TABLE relay_log_info_repository=TABLE binlog_checksum=NONE log_slave_updates=ON log_bin=binlog binlog_format=ROW transaction_write_set_extraction=XXHASH64 loose-group_replication_group_name="1bb1b861-f776-11e6-be42-782bcb377193" ##这里要和其它三个节点的配置文件一样 loose-group_replication_start_on_boot=off loose-group_replication_local_address= "127.0.0.1:33101" loose-group_replication_group_seeds= "127.0.0.1:33071,127.0.0.1:33081,127.0.0.1:33091,127.0.0.1:33101" loose-group_replication_bootstrap_group= off
先对数据库服务进行初始化,后续启动服务,安装插件,设置相关参数:主要看步骤
#先对数据库服务进行初始化: /usr/local/mysql-5.7.25-linux-glibc2.12-x86_64/bin/mysqld --defaults-file=/data/data_mgr/s4/s4.cnf --basedir=/usr/local/mysql-5.7.25-linux-glibc2.12-x86_64 --datadir=/data/data_mgr/s4 --initialize-insecure & #然后启动数据库服务: /bin/sh /usr/local/mysql-5.7.25-linux-glibc2.12-x86_64/bin/mysqld_safe --defaults-file=/data/data_mgr/s4/s4.cnf #安装必要的插件: INSTALL PLUGIN group_replication SONAME 'group_replication.so'; #设置相关的group_ set global group_replication_allow_local_disjoint_gtids_join=on;
该实例启动好了之后,需要在group的其他三个节点上面设置全局变量,group_replication_group_seeds的值,使得group中所有的成员参数如下:
主要是通过:set global group_replication_group_seeds='192.168.31.201:33071,192.168.31.201:33081,192.168.31.201:33091,192.168.31.201:33101'; 进行设置
mysql--root@localhost:mgr 19:57:19>>select @@group_replication_group_seeds; +-----------------------------------------------------------------+ | @@group_replication_group_seeds | +-----------------------------------------------------------------+ | 127.0.0.1:33071,127.0.0.1:33081,127.0.0.1:33091,127.0.0.1:33101 | +-----------------------------------------------------------------+
然后将这个节点按照上述slave操作操作一遍即可,注意:向这样操作的话,它会去向主节点拉取binlog日志
root@localhost 00:57: [liulin]> select*from performance_schema.replication_group_members; +---------------------------+--------------------------------------+----------------+-------------+--------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | +---------------------------+--------------------------------------+----------------+-------------+--------------+ | group_replication_applier | 469bb8bb-04bf-11ec-9900-000c29395ab1 | 192.168.31.201 | 3309 | ONLINE | | group_replication_applier | 5a412d15-04bc-11ec-95d2-000c29395ab1 | 192.168.31.201 | 3307 | ONLINE | | group_replication_applier | 5fba9da1-04bd-11ec-a1d9-000c29395ab1 | 192.168.31.201 | 3308 | ONLINE | | group_replication_applier | da0af5a2-04f1-11ec-99ea-000c29395ab1 | 192.168.31.201 | 3310 | ONLINE | +---------------------------+--------------------------------------+----------------+-------------+--------------+