前面一篇博客我介绍了如何通过Orachestrator+脚本实现自动导换。
这一章,我来演示一下如何通过Orchestrator+Proxysql 实现自动导换+应用透明读写分离
总体架构
可选架构1
实际生产系统所用架构应该类似于上图:
1)应用APP,通过VIP连接ProxySQL
2)ProxySQL为了实现高可用,可部署2个。通过VIP管理,即同一时候只有一个ProxySQL是活动并接受外界请求。
当proxysql出现故障时,通过keepyalive实现导换,业务切换到另一个Porxysql。
3)MySQL Master和Slave组成高可用MySQL集群
4)Orchestrator集群负责监控MySQL服务器的导换
可选架构2
如上架构与架构1的区别是,每个应用连接自有的Proxysql。如应用/proxysql crash,则相应的proxysql+app整体不可用。
演示步骤
简化架构
为了演示简单一些,在这里我简化了很多组件。结构图如下
测试机器
如上,为了测试我用了5台虚拟机
host01 192.168.56.103
host02 192.168.56.104
host03 192.168.56.105
arch 192.168.56.130
proxysql 192.168.56.131
ProxySQL安装
ProxySQL安装点 这里.
MySQL复制拓扑图
MySQL创建用户权限
testuser*是作为proxysql读写操作的。
monitor用户是proxysql用来监控MySQL活动状态的。
create user testuser_w@‘%‘ identified by ‘Testpass1.‘; grant insert,update,delete on *.* to ‘testuser_w‘@‘%‘; create user testuser_r@‘%‘ identified by ‘Testpass1.‘; grant select on *.* to ‘testuser_r‘@‘%‘; create user testuser_rw@‘%‘ identified by ‘Testpass1.‘; grant select,insert,update,delete on *.* to ‘testuser_rw‘@‘%‘; create user ‘monitor‘@‘%‘ identified by ‘monitor‘; grant all privileges on *.* to ‘monitor‘@‘%‘;
proxysql启动并插入规则
[root@proxysql-server proxysql]# service proxysql.service start
Redirecting to /bin/systemctl start proxysql.service.service
proxysql启动后,会开启两个端口。6032默认是用来管理,6033是默认用来连接mysql
[root@proxysql-server proxysql]# netstat -nlp|grep proxysql
tcp 0 0 0.0.0.0:6032 0.0.0.0:* LISTEN 17840/proxysql
tcp 0 0 0.0.0.0:6033 0.0.0.0:* LISTEN 17840/proxysql
登录proxysql管理端口配置数据
mysql -uadmin -padmin -h 127.0.0.1 -P 6032
插入如下配置。如下是指定600 hostgroup为只读,601端口为写入。(需要注意的是我这里的配置是master也做为读取结点)
其中scheduler的插入数据是做何用的,后面再讲述。
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES (‘192.168.56.103‘,601,3306,1000,60); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES (‘192.168.56.104‘,601,3306,1000,60); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES (‘192.168.56.105‘,601,3306,1000,60); INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES (‘192.168.56.103‘,600,3306,1000,0); INSERT INTO mysql_replication_hostgroups(writer_hostgroup,reader_hostgroup) values(600,601); LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK; insert into mysql_query_rules (username,destination_hostgroup,active) values(‘testuser_w‘,600,1); insert into mysql_query_rules (username,destination_hostgroup,active) values(‘testuser_r‘,601,1); insert into mysql_query_rules (username,destination_hostgroup,active,retries,match_digest) values(‘testuser_rw‘,601,1,3,‘^SELECT‘); LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK; insert into mysql_users (username,password,active,default_hostgroup,transaction_persistent) values (‘testuser_w‘,‘Testpass1.‘,1,600,1); insert into mysql_users (username,password,active,default_hostgroup,transaction_persistent) values (‘testuser_r‘,‘Testpass1.‘,1,601,1); insert into mysql_users (username,password,active,default_hostgroup,transaction_persistent) values (‘testuser_rw‘,‘Testpass1.‘,1,600,1); LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK; INSERT INTO scheduler (id,interval_ms,filename,arg1) values (10,2000,"/var/lib/proxysql/server_monitor.pl","-u=admin -p=admin -h=127.0.0.1 -G=601 -P=6032 --debug=0 --log=/var/lib/proxysql/server_check"); LOAD SCHEDULER TO RUNTIME;SAVE SCHEDULER TO DISK;
插入成功后,查看服务器状态
如下可以看到,所以的服务器的状态都显示为ONLINE。
mysql> select * from mysql_servers; +--------------+----------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | +--------------+----------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | 601 | 192.168.56.104 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | | 601 | 192.168.56.105 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | | 600 | 192.168.56.103 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | | 601 | 192.168.56.103 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | +--------------+----------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
停止host01,查看导换后的变化
通过service mysqld stop,或是启动firewall,shutdwon mysql都可以。
host01无法连接后,mysql的拓扑结构变成了如下
查看proxysql状态数据的变化
如下,可以看到104的hostgroup变成了600(写入端口),host01(192.168.56.103)的状态变为了SHUNNED
mysql> select * from mysql_servers; +--------------+----------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | +--------------+----------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | 601 | 192.168.56.103 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | | 601 | 192.168.56.105 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | | 600 | 192.168.56.104 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | | 601 | 192.168.56.104 | 3306 | 0 | ONLINE | 1000 | 0 | 1000 | 60 | 0 | 0 | | +--------------+----------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ mysql> select * from stats_mysql_connection_pool where hostgroup between 600 and 601 order by hostgroup,srv_host desc; +-----------+----------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us | +-----------+----------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+ | 600 | 192.168.56.104 | 3306 | ONLINE | 0 | 1 | 1 | 242 | 1 | 18 | 0 | 306 | 198 | 1282 | | 601 | 192.168.56.105 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1 | 127 | 0 | 2159 | 1397 | 550 | | 601 | 192.168.56.104 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1 | 207 | 0 | 3519 | 2277 | 1282 | | 601 | 192.168.56.103 | 3306 | SHUNNED | 0 | 0 | 0 | 28 | 1 | 0 | 0 | 0 | 0 | 389 | +-----------+----------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+ 4 rows in set (0.01 sec)
通过mysql客户端,测试当前行为
如下,写一个简单的循环语句,测试mysql的行为。如下,可以发现写入操作都只会发送到新的host02机器。
只读操作,会发送到host02和host03机器。host01已经从读取列表内排除。
[root@proxysql-server proxysql]# while true; do mysql -u testuser_w -pTestpass1. -h 127.0.0.1 -P 6033 -e "select @@hostname"; sleep 1; done mysql: [Warning] Using a password on the command line interface can be insecure. +------------+ | @@hostname | +------------+ | host02 | +------------+ mysql: [Warning] Using a password on the command line interface can be insecure. +------------+ | @@hostname | +------------+ | host02 | +------------+ ^C [root@proxysql-server proxysql]# while true; do mysql -u testuser_r -pTestpass1. -h 127.0.0.1 -P 6033 -e "select @@hostname"; sleep 1; done mysql: [Warning] Using a password on the command line interface can be insecure. +------------+ | @@hostname | +------------+ | host03 | +------------+ mysql: [Warning] Using a password on the command line interface can be insecure. +------------+ | @@hostname | +------------+ | host02 | +------------+
当host01重新启动后,会发生什么?
细心的读者,可能会发生,mysql_servers列表内,并没有排除host01。那如果host01启动后,会发生什么呢?
结果就是只读操作会继续发送到host01机器。
因为实际上host01机器在导换后,并不属于复制集群了。
如果只读操作继续发送到host01,肯定会读到旧的数据。造成应用出现问题。
那我们如何避免出现此问题呢?方法就是通过上面的脚本来把host01排除出复制集群。
后续的修复需要DBA来干预,如加入复制集群,等复制同步完成后,再开放host01可以作为读结点。