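MySQL 5.6.37. Someone accidentally dropped tables on the master in a test environment, which broke master-slave replication. The replication status on the slave showed the following: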
Last_Errno: 1837
Last_Error: Worker 3 failed executing transaction '' at master log mysql-bin.013343, end_log_pos 289330740; Error 'When @@SESSION.GTID_NEXT is set to a GTID, you must explicitly set it to a different value after a COMMIT or ROLLBACK. Please check GTID_NEXT variable manual page for detailed explanation. Current @@SESSION.GTID_NEXT is '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902'.' on query. Default database: 'DBNAME'. Query: 'DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2,TABLENAME_3,TABLENAME_4,TABLENAME_5,TABLENAME_6...
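How could dropping tables break replication? After asking around, I learned that the tables had been dropped by mistake through a tool and that the operation was stopped immediately.
Possible causes of this kind of failure: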
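1. create table table_name as select * from table_name; is split into two transactions, a CREATE TABLE and an INSERT. When it reaches the slave, the slave executes the CREATE TABLE, finds no GTID for the INSERT, and reports an error.
2. The MyISAM storage engine. MyISAM supports the INSERT DELAYED syntax. INSERT DELAYED writes asynchronously, that is, it returns success to the client as soon as it is issued. Internally, MySQL merges the INSERTs from multiple threads and executes them together, but generates only one GTID. When this reaches the slave, because the table is MyISAM, the slave can likewise execute only the first SQL statement, and reports an error.
3. On the master, a transaction on InnoDB produces only one GTID. MyISAM does not support transactions, so after the first statement of the transaction has executed, the second statement has no GTID, and an error is reported.
4. Temporary tables.
5. A bug.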
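As a rough illustration of causes 1, 2, and 4, the sketch below flags statements of those shapes in SQL text. The pattern names and the `classify_statement` helper are hypothetical, and the regular expressions are simplified approximations, not a real SQL parser.

```python
import re

# Hypothetical pattern table for illustration only; not part of any MySQL tool.
RISKY_PATTERNS = {
    # CREATE TABLE ... [AS] SELECT ...
    "create_table_select": re.compile(
        r"^\s*CREATE\s+TABLE\s+.+?\s+(AS\s+)?SELECT\b", re.I | re.S
    ),
    # INSERT DELAYED ...
    "insert_delayed": re.compile(r"^\s*INSERT\s+DELAYED\b", re.I),
    # CREATE/DROP TEMPORARY TABLE ...
    "temporary_table": re.compile(
        r"^\s*(CREATE|DROP)\s+TEMPORARY\s+TABLE\b", re.I
    ),
}


def classify_statement(sql):
    """Return the names of all risky patterns that the statement matches."""
    return [name for name, pattern in RISKY_PATTERNS.items() if pattern.search(sql)]


if __name__ == "__main__":
    for stmt in (
        "create table t2 as select * from t1",
        "insert delayed into t values (1)",
        "DROP TABLE IF EXISTS a, b",
    ):
        print(stmt, "->", classify_statement(stmt))
```

A plain DROP TABLE IF EXISTS matches none of these patterns, which fits the observation that the statement in the error message is not one of the usual suspects.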
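This incident:
1. The affected schema has no MyISAM tables: select table_schema,table_name,engine from information_schema.tables where engine != 'innodb' and table_schema = 'DBNAME';
2. enforce_gtid_consistency is ON on both master and slave, so a CREATE TABLE ... SELECT statement cannot execute successfully. This incident does not involve CREATE TABLE ... SELECT anyway, so that cause is ruled out.
3. The storage engines on master and slave are the same.
4. There are no temporary tables.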
The audit log on the master shows that drop schema dbname; was executed:
20200707 12:55:20 '/* ApplicationName=DataGrip 2020.1.4 */ drop schema DBNAME'
Master binlog:
# at 289328506
#200707 12:55:08 server id 100 end_log_pos 289328554 CRC32 0x1401e82a GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902'/*!*/;
# at 289328554
#200707 12:55:08 server id 100 end_log_pos 289329642 CRC32 0x0728afdf Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708/*!*/;
SET @@session.sql_mode=270532608/*!*/;
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33/*!*/;
DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2,TABLENAME_3,TABLENAME_4,....../*!*/;
# at 289329642
#200707 12:55:08 server id 100 end_log_pos 289330740 CRC32 0x0d0122e4 Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708/*!*/;
DROP TABLE IF EXISTS TABLENAME_5,TABLENAME_6,TABLENAME_7,TABLENAME_8,....../*!*/;
# at 289330740
#200707 12:55:08 server id 100 end_log_pos 289331832 CRC32 0xd8409afa Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708/*!*/;
DROP TABLE IF EXISTS TABLENAME_9,TABLENAME_10,TABLENAME_11,TABLENAME_12,....../*!*/;
# at 289331832
#200707 12:55:08 server id 100 end_log_pos 289332298 CRC32 0xa6657cc5 Query thread_id=2388454119 exec_time=12 error_code=0
SET TIMESTAMP=1594097708/*!*/;
DROP TABLE IF EXISTS TABLENAME_13,TABLENAME_14,TABLENAME_15,TABLENAME_16,....../*!*/;
# at 289332298
#200707 12:55:20 server id 100 end_log_pos 289332346 CRC32 0x0cc19e83 GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462903'/*!*/;
Slave binlog:
# at 19856236
#200707 12:55:08 server id 100 end_log_pos 19856284 CRC32 0x5e42595e GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902'/*!*/;
DROP TABLE IF EXISTS TABLENAME_1,TABLENAME_2,TABLENAME_3,TABLENAME_4,......
# at 19857398
#200707 12:55:23 server id 100 end_log_pos 19857446 CRC32 0x81998bd5 GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463065'/*!*/;
# at 19857446
#200707 12:55:23 server id 100 end_log_pos 19857509 CRC32 0x0916a5e2 Query thread_id=2388456043 exec_time=0 error_code=0
SET TIMESTAMP=1594097723/*!*/;
SET @@session.sql_mode=524288/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/;
BEGIN
/*!*/;
# at 19857509
#200707 12:55:23 server id 100 end_log_pos 19859435 CRC32 0x2eec3842 Rows_query
# insert into talename....
......
......
#200707 12:55:23 server id 100 end_log_pos 19860287 CRC32 0xc983ec89 Xid = 5405680966
COMMIT/*!*/;
# at 19860287
#200707 12:55:08 server id 100 end_log_pos 19861411 CRC32 0x20f2bc78 Query thread_id=2388454119 exec_time=1669 error_code=0
SET TIMESTAMP=1594097708/*!*/;
SET @@session.sql_mode=270532608/*!*/;
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33/*!*/;
DROP TABLE IF EXISTS TABLENAME_5,TABLENAME_6,TABLENAME_7,TABLENAME_8,....../*!*/;
# at 19861411
#200707 12:55:08 server id 100 end_log_pos 19862529 CRC32 0x35204dfe Query thread_id=2388454119 exec_time=1672 error_code=0
SET TIMESTAMP=1594097708/*!*/;
DROP TABLE IF EXISTS TABLENAME_9,TABLENAME_10,TABLENAME_11,TABLENAME_12,....../*!*/;
# at 19862529
#200707 12:55:23 server id 100 end_log_pos 19862577 CRC32 0x48b02b33 GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463066'/*!*/;
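One quick check on a dump like this is whether the GTID sequence numbers in consecutive GTID_NEXT lines are contiguous. The sketch below is a minimal illustration that assumes the mysqlbinlog text has already been saved to a string. The helper names `gtid_sequence` and `find_gaps` are hypothetical, and `SLAVE_DUMP` contains only the three GTID_NEXT lines from the slave dump above.

```python
import re

# Matches lines such as: SET @@SESSION.GTID_NEXT= '<uuid>:<number>'/*!*/;
GTID_NEXT_RE = re.compile(r"SET @@SESSION\.GTID_NEXT=\s*'([0-9a-f-]{36}):(\d+)'")


def gtid_sequence(binlog_text):
    """Return (server_uuid, sequence_number) pairs in the order they appear."""
    return [(uuid, int(num)) for uuid, num in GTID_NEXT_RE.findall(binlog_text)]


def find_gaps(sequence):
    """Return (first_missing, last_missing, count) for each jump between neighbours."""
    gaps = []
    for (uuid_a, num_a), (uuid_b, num_b) in zip(sequence, sequence[1:]):
        if uuid_a == uuid_b and num_b - num_a > 1:
            gaps.append(
                (f"{uuid_a}:{num_a + 1}", f"{uuid_a}:{num_b - 1}", num_b - num_a - 1)
            )
    return gaps


# Only the GTID_NEXT lines from the slave dump; everything else is omitted.
SLAVE_DUMP = """
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902'/*!*/;
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463065'/*!*/;
SET @@SESSION.GTID_NEXT= '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463066'/*!*/;
"""

if __name__ == "__main__":
    for first, last, count in find_gaps(gtid_sequence(SLAVE_DUMP)):
        print(f"missing {count} GTIDs: {first} .. {last}")
```

A jump in the order of a binlog excerpt only points to where to look next. It does not by itself prove that the skipped transactions were never applied.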
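1. Because the database uses GTID replication, each GTID must correspond to exactly one transaction. On the slave, "drop schema dbname;" was split into multiple DROP TABLE statements.
2. Comparing the GTIDs on master and slave shows that the slave is missing the GTIDs from '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462903' through '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463064'.
3. Under GTID '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902', the slave executed one DROP TABLE IF EXISTS statement, then went straight to GTID '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618463065', executed the insert into statement, and committed.
4. After the COMMIT, a different @@SESSION.GTID_NEXT should have been set, but it was not. Instead, another DROP TABLE IF EXISTS statement was executed. The transaction with GTID '07fd3067-b250-11e7-a2f0-1866da9e4b15:2618462902' was split abnormally, which is why replication reported the error.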
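The rule in points 3 and 4 can be written down as a small check over a simplified event stream. This is only a sketch: the `(kind, payload)` tuples are a hypothetical model of the dump, not the real binlog format, and the GTIDs are shortened to their sequence numbers for readability.

```python
def find_statements_after_commit(events):
    """Flag query events that follow a COMMIT/Xid with no new GTID in between.

    events is a list of (kind, payload) tuples, where kind is one of
    "gtid", "query", "rows", or "xid" (a simplified, hypothetical model).
    """
    violations = []
    current_gtid = None
    committed = False
    for index, (kind, payload) in enumerate(events):
        if kind == "gtid":
            # A new GTID starts a new transaction and resets the state.
            current_gtid = payload
            committed = False
        elif kind == "xid":
            # An Xid event marks the commit of the current transaction.
            committed = True
        elif kind == "query":
            if payload.strip().upper() == "COMMIT":
                committed = True
            elif committed:
                # A statement after the commit, still under the old GTID.
                violations.append((index, current_gtid, payload))
    return violations


# Simplified event order taken from the master dump.
MASTER_EVENTS = [
    ("gtid", "2618462902"),
    ("query", "DROP TABLE IF EXISTS TABLENAME_1,..."),
    ("query", "DROP TABLE IF EXISTS TABLENAME_5,..."),
    ("query", "DROP TABLE IF EXISTS TABLENAME_9,..."),
    ("query", "DROP TABLE IF EXISTS TABLENAME_13,..."),
    ("gtid", "2618462903"),
]

# Simplified event order taken from the slave dump.
SLAVE_EVENTS = [
    ("gtid", "2618462902"),
    ("query", "DROP TABLE IF EXISTS TABLENAME_1,..."),
    ("gtid", "2618463065"),
    ("query", "BEGIN"),
    ("rows", "insert into ..."),
    ("xid", "COMMIT"),
    ("query", "DROP TABLE IF EXISTS TABLENAME_5,..."),
    ("query", "DROP TABLE IF EXISTS TABLENAME_9,..."),
    ("gtid", "2618463066"),
]

if __name__ == "__main__":
    print("master:", find_statements_after_commit(MASTER_EVENTS))
    print("slave:", find_statements_after_commit(SLAVE_EVENTS))
```

In this model the master stream is clean, while the slave stream flags the two DROP TABLE statements that follow the commit of 2618463065 without a new GTID.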