Mysql8官方分布式数据库MGR最佳实践,竟遇大坑

MGR简介
MGR是mysql Group Replication简称,中文名称是Mysql组复制,MGR是MySQL官方于2016年12月推出的一个全新的高可用与高扩展的解决方案,提供了高可用、高扩展、高可靠的MySQL集群服务,目前只支持MYSQL5.7和mysql8.0版本。
Mysql8官方分布式数据库MGR最佳实践,竟遇大坑
MGR优点

  1. 高一致性:基于原生复制和paxos协议的组复制技术。

  2. 高容错性:有自动检测机制,当出现宕机后,会自动剔除问题节点,在2N+1个节点集群中,集群 只要N +1个节点还存活着,数据库就能稳定的对外提供服务。

  3. 高扩展性:可以在线增加和移除节点。

  4. 高灵活性:可以在单主模式和多主模式*切换。

MGR缺点
MGR技术比较新,稳定性方面还需要业界验证。

Mysql8分布式数据库MGR最佳实践
环境准备
3台虚拟机搭建3节点的MGR集群,Mysql数据库用的8.0.21版本。

Mysql8官方分布式数据库MGR最佳实践,竟遇大坑
创建目录


mkdir -p /data/mysql/mysql8/{data,log,conf,run}/3309
mkdir -p /data/mysql/mysql8/log/3309/{redo,undo,relay}
chown -R mysql:mysql /data/mysql/mysql8/mysql8
chmod 750 /data/mysql/mysql8/{data,log,conf,run}

配置参数


[mysqld]
##basic settings###
server-id=1
port = 3309
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-name-resolve
max_connections=1000
max_user_connections=800
max_allowed_packet=512M
max_connect_errors=100000
datadir = /data/mysql/mysql8/data/3309
socket = /data/mysql/mysql8/run/3309/mysql.sock
pid_file = /data/mysql/mysql8/run/3309/mysql.pid
transaction_isolation = READ-COMMITTED
lower_case_table_names=1
default_time_zone =+8:00
open_files_limit=65535
log_timestamps=system
wait_timeout=900
interactive_timeout=900

##innodb setting##
innodb_buffer_pool_size = 256M
innodb_buffer_pool_instances = 1
innodb_io_capacity=2000
innodb_flush_method=O_DIRECT

innodb_log_group_home_dir = /data/mysql/mysql8/log/3309/redo
innodb_log_file_size = 128M
innodb_log_files_in_group=4
innodb_log_buffer_size = 32M

innodb_undo_directory = /data/mysql/mysql8/log/3309/undo
innodb_undo_tablespaces = 4
innodb_undo_log_truncate=1
innodb_max_undo_log_size=1G

innodb_flush_neighbors=0
innodb_flush_log_at_trx_commit = 1
innodb_print_all_deadlocks = 1
innodb_online_alter_log_max_size=128M

innodb_lock_wait_timeout=10
innodb_file_per_table=ON
innodb_doublewrite=ON

##log settings##
log-error = /data/mysql/mysql8/log/3309/error.log
log-bin = /data/mysql/mysql8/log/3309/mysql_bin.log
slow_query_log = 1
slow_query_log_file = /data/mysql/mysql8/log/3309/mysql_slow_query.log
long_query_time = 10

##replication settings##
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1000
relay_log_recovery = 1
relay-log=/data/mysql/mysql8/log/3309/relay/mysql-relay-bin

#binlog
log_bin=/data/mysql/mysql8/log/3309/binlog
expire_logs_days=10
max_binlog_cache_size=1024M
sync_binlog=1

##MGR settings
binlog_checksum = NONE
log_slave_updates = ON
binlog_format=row
#transaction_write_set_extraction ='XXHASH64'
#loose-group_replication_group_name = '38f34157-cbe8-4623-a7bd-054cc5c2de0b'
#loose-group_replication_start_on_boot = off
#loose-group_replication_local_address = '192.168.112.131:10061'
#loose-group_replication_group_seeds ='192.168.112.131:10061,192.168.112.132:10061,192.168.112.135:10061'
#loose-group_replication_bootstrap_group = off
#loose-group_replication_ip_whitelist = '192.168.112.131/24,192.168.112.132/24,192.168.112.135/24'

[client]
port = 3309
socket = /data/mysql/mysql8/run/3309/mysql.sock

在这里需要注意的是地方,由于MGR的插件,mysql默认是没有安装的,所以在这里关于MGR的配置参数,都注释掉了,等数据库实例启动之后,再开启。

每个节点的server-id需要设置成不一样。

初始化数据库


/data/mysql-8.0.21/bin/mysqld --defaults-file=/data/mysql/mysql8/conf/3309/my.cnf --initialize --basedir=/data/mysql-8.0.21/ --datadir=/data/mysql/mysql8/data/3309 --user=mysql --initialize-insecure --ssl --explicit_defaults_for_timestamp

如果要启用SSL安全连接,执行如下命令
/data/mysql-8.0.21/bin/mysql_ssl_rsa_setup --basedir=/data/mysql-8.0.21 --datadir=/data/mysql/mysql8/data/3309

启停mysql实例

/data/mysql-8.0.21/bin/mysqld_safe --defaults-file=/data/mysql/mysql8/conf/3309/my.cnf &

/data/mysql-8.0.21/bin/mysqladmin -uroot --socket=/data/mysql/mysql8/run/3309/mysql.sock shutdown &

登录实例


/data/mysql-8.0.21/bin/mysql -uroot --socket=/data/mysql/mysql8/run/3309/mysql.sock

MGR插件安装


[root@localhost] 14:08:44 [(none)]>install plugin group_replication soname 'group_replication.so';
Query OK, 0 rows affected (0.20 sec)

[root@localhost] 14:09:06 [(none)]>show plugins;
+---------------------------------+----------+--------------------+----------------------+---------+
| Name                            | Status   | Type               | Library              | License |
+---------------------------------+----------+--------------------+----------------------+---------+
| binlog                          | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| mysql_native_password           | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
| sha256_password                 | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
| caching_sha2_password           | ACTIVE   | AUTHENTICATION     | NULL                 | GPL     |
| sha2_cache_cleaner              | ACTIVE   | AUDIT              | NULL                 | GPL     |
| CSV                             | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| MEMORY                          | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| InnoDB                          | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| INNODB_TRX                      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP                      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP_RESET                | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMPMEM                   | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMPMEM_RESET             | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP_PER_INDEX            | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CMP_PER_INDEX_RESET      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_BUFFER_PAGE              | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_BUFFER_PAGE_LRU          | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_BUFFER_POOL_STATS        | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_TEMP_TABLE_INFO          | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_METRICS                  | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_DEFAULT_STOPWORD      | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_DELETED               | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_BEING_DELETED         | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_CONFIG                | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_INDEX_CACHE           | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_FT_INDEX_TABLE           | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_TABLES                   | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_TABLESTATS               | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_INDEXES                  | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_TABLESPACES              | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_COLUMNS                  | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_VIRTUAL                  | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_CACHED_INDEXES           | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| INNODB_SESSION_TEMP_TABLESPACES | ACTIVE   | INFORMATION SCHEMA | NULL                 | GPL     |
| MyISAM                          | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| MRG_MYISAM                      | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| PERFORMANCE_SCHEMA              | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| TempTable                       | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| ARCHIVE                         | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| BLACKHOLE                       | ACTIVE   | STORAGE ENGINE     | NULL                 | GPL     |
| FEDERATED                       | DISABLED | STORAGE ENGINE     | NULL                 | GPL     |
| ngram                           | ACTIVE   | FTPARSER           | NULL                 | GPL     |
| mysqlx_cache_cleaner            | ACTIVE   | AUDIT              | NULL                 | GPL     |
| mysqlx                          | ACTIVE   | DAEMON             | NULL                 | GPL     |
| group_replication               | ACTIVE   | GROUP REPLICATION  | group_replication.so | GPL     |
+---------------------------------+----------+--------------------+----------------------+---------+
45 rows in set (0.00 sec)

安装好插件之后,将MGR的参数注释去掉,重启mysql实例。

设置MGR单主模式
Mysql8官方分布式数据库MGR最佳实践,竟遇大坑

在所有节点上执行以下命令,创建复制用户


set sql_log_bin=0;
create user 'repl'@'%' identified with mysql_native_password by 'repl';
grant replication slave,replication client on *.* to 'repl'@'%';
flush privileges;
set sql_log_bin=1;

在192.168.112.131服务上设置主节点


set global group_replication_single_primary_mode=on;
set global group_replication_bootstrap_group=ON;
CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='repl' FOR CHANNEL 'group_replication_recovery';
start group_replication;
set global group_replication_bootstrap_group=OFF;

设置好之后,可以查看MGR复制成员状态


[root@localhost] 14:44:39 [(none)]>select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | c1a60f79-f719-11ea-97db-000c29cc2388 | mysql       |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
+---------------------------+--------------------------------------+----
---------+-------------+--------------+-------------+----------------+

设置好主之后,就开始设置剩下2个从节点了,执行命令都是一样的。


CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='repl' FOR CHANNEL 'group_replication_recovery';
start group_replication;

到这里,遇到了大坑,来看看


[root@localhost] 14:44:40 [(none)]>select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 188f118c-f71b-11ea-899b-000c29387845 | mysql       |        3309 | RECOVERING   | SECONDARY   | 8.0.21         |
| group_replication_applier | 577b5cdb-f71a-11ea-9f03-000c29231183 | mysql       |        3309 | RECOVERING   | SECONDARY   | 8.0.21         |
| group_replication_applier | c1a60f79-f719-11ea-97db-000c29cc2388 | mysql       |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

2个从节点的状态一直是RECOVERING状态,正常时ONLINE状态的。这个状态持续了有10几分钟,看mysql的错误日志看到以下错误信息


2020-09-15T14:24:57.555493+08:00 19 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'group_replication_recovery': error connecting to master 'repl@mysql:3309' - retry-time: 60 retries: 1 message: Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection. Error_code: MY-002061
2020-09-15T14:24:57.557460+08:00 18 [ERROR] [MY-011582] [Repl] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'
2020-09-15T14:24:57.557541+08:00 18 [ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'
2020-09-15T14:25:57.284156+08:00 0 [System] [MY-011504] [Repl] Plugin group_replication reported: 'Group membership changed: This member has left the group.'

重点内容在error connecting to master 'repl@mysql:3309',无法连接到master,即网络不通,在这里MGR竟然用的主机名来和集群成员进行通信,关键是我的主机名都是一样的,因为虚拟级时克隆出来的。

在这里有2个解决方案
方案一
修改主机名称,并且在MGR每个集群节点上的/etc/hosts中,添加主机和IP的对应关系

方案二
让MGR集群节点用IP地址进行通信,可以通过配置以下2个参数


report_host=192.168.112.135
report_port=3309

我用的是方案二,在my.cnf参数文件中分别添加以下内容


在192.168.112.131服务器上
report_host=192.168.112.131
report_port=3309

在192.168.112.132服务器上
report_host=192.168.112.132
report_port=3309

在192.168.112.135服务器上
report_host=192.168.112.135
report_port=3309

添加完成之后,就重启Mysql实例,重新启动集群。

设置主节点

set global group_replication_bootstrap_group=ON;
start group_replication;
set global group_replication_bootstrap_group=OFF;

设置从节点

start group_replication;

查看集群状态


[root@localhost] 15:02:23 [(none)]>select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST     | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 188f118c-f71b-11ea-899b-000c29387845 | 192.168.112.135 |        3309 | ONLINE       | SECONDARY   | 8.0.21         |
| group_replication_applier | 577b5cdb-f71a-11ea-9f03-000c29231183 | 192.168.112.132 |        3309 | ONLINE       | SECONDARY   | 8.0.21         |
| group_replication_applier | c1a60f79-f719-11ea-97db-000c29cc2388 | 192.168.112.131 |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

Mysql8官方分布式数据库MGR最佳实践,竟遇大坑

这次集群节点状态都是online了。

单主切换到多主模式
在所有节点执行

stop group_replication;
set global group_replication_single_primary_mode=OFF;
set global group_replication_enforce_update_everywhere_checks=ON;

选择任意节点执行


SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

剩余节点执行


START GROUP_REPLICATION;

查看集群状态

[root@localhost] 17:45:57 [db1]>select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST     | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 188f118c-f71b-11ea-899b-000c29387845 | 192.168.112.135 |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
| group_replication_applier | 577b5cdb-f71a-11ea-9f03-000c29231183 | 192.168.112.132 |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
| group_replication_applier | c1a60f79-f719-11ea-97db-000c29cc2388 | 192.168.112.131 |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.01 sec)

多主切换到多单模式
在所有节点执行

stop group_replication;
set global group_replication_enforce_update_everywhere_checks=off;

选择主节点数据库执行

set global group_replication_single_primary_mode=on;
set global group_replication_bootstrap_group=ON;
start group_replication;
set global group_replication_bootstrap_group=OFF;

剩余节点执行


set global group_replication_single_primary_mode=on;
START GROUP_REPLICATION;

查看集群状态

[root@localhost] 17:51:12 [db1]>select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST     | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 188f118c-f71b-11ea-899b-000c29387845 | 192.168.112.135 |        3309 | ONLINE       | SECONDARY   | 8.0.21         |
| group_replication_applier | 577b5cdb-f71a-11ea-9f03-000c29231183 | 192.168.112.132 |        3309 | ONLINE       | SECONDARY   | 8.0.21         |
| group_replication_applier | c1a60f79-f719-11ea-97db-000c29cc2388 | 192.168.112.131 |        3309 | ONLINE       | PRIMARY     | 8.0.21         |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)
上一篇:php token的生成


下一篇:MGR用哪个版本?5.7 vs 8.0