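In a master-slave MySQL setup, when the host loses power or crashes abnormally for some other reason, the slave is prone to errors when reading the binlog. The most common symptom looks like this: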
show slave status \G;
Master_Log_File: mysql-bin.000029
Read_Master_Log_Pos: 3154083
Relay_Log_File: relay-bin.000478
Relay_Log_Pos: 633
Relay_Master_Log_File: mysql-bin.000027
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 234663436
According to the error message, the slave's SQL thread evidently cannot parse the relay log.
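To spot this condition from a script, the relevant fields can be read out of the `show slave status \G` text. The following is only a minimal sketch: the function names `slave_field` and `is_relay_log_failure` are hypothetical, and the status text is read from stdin so the helpers can be tried without a live server. It checks Last_Errno, the field shown in the output above.

```shell
# Hypothetical helper: print the value of one field from
# `show slave status \G` text supplied on stdin.
slave_field() {
    # Compare the whole first column ("Name:") so that Master_Log_File
    # does not also match Relay_Master_Log_File, as a plain grep would.
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Hypothetical helper: succeed when the status text on stdin shows the
# SQL thread stopped with error 1594.
is_relay_log_failure() {
    local status
    status=$(cat)
    [ "$(printf '%s\n' "$status" | slave_field Slave_SQL_Running)" = "No" ] &&
    [ "$(printf '%s\n' "$status" | slave_field Last_Errno)" = "1594" ]
}
```

Against a real instance this could be fed with something like `mysql -uroot -h127.0.0.1 -P<port> -e "show slave status \G" | is_relay_log_failure`, where the connection options are whatever your instance needs.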
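Before fixing the problem, it helps to understand a few fields:
From the way MySQL replication works, the slave's SQL thread in principle lags behind the I/O thread, so in the show slave status output Relay_Master_Log_File and Master_Log_File may not name the same file.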
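As a small illustration of that lag, the numeric suffixes of the two file names can be compared. This is a sketch under the assumption that both names follow the usual `basename.NNNNNN` pattern; the function name `binlog_file_lag` is hypothetical.

```shell
# Hypothetical helper: print how many binlog files the SQL thread is
# behind the I/O thread.
# $1 = Master_Log_File, $2 = Relay_Master_Log_File
binlog_file_lag() {
    # ${1##*.} keeps only the text after the last dot, e.g. 000029.
    # awk reads the zero-padded suffixes as decimal numbers, whereas
    # shell arithmetic would treat a leading 0 as octal.
    awk -v io="${1##*.}" -v sql="${2##*.}" 'BEGIN { print io - sql }'
}

# Values taken from the status output above.
binlog_file_lag mysql-bin.000029 mysql-bin.000027   # prints 2
```

A result of 0 only means both threads are on the same file; the positions Read_Master_Log_Pos and Exec_Master_Log_Pos can still differ.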
Master_Log_File
The name of the master binary log file from which the I/O thread is currently reading.
That is, the master binlog file that the slave's I/O thread is currently fetching.
Relay_Master_Log_File
The name of the master binary log file containing the most recent event executed by the SQL thread.
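That is, the master binlog file whose events the slave's SQL thread executed most recently. (This file may lag behind the binlog file the I/O thread is currently reading.)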
Read_Master_Log_Pos
The position in the current master binary log file up to which the I/O thread has read.
Exec_Master_Log_Pos
The position in the current master binary log file to which the SQL thread has read and executed, marking the start of the next transaction or event to be processed. You can use this value with the CHANGE MASTER TO statement's MASTER_LOG_POS option when starting a new slave from an existing slave, so that the new slave reads from this point. The coordinates given by (Relay_Master_Log_File, Exec_Master_Log_Pos) in the master's binary log correspond to the coordinates given by (Relay_Log_File, Relay_Log_Pos) in the relay log.
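That is, the position in the master binlog up to which the slave's SQL thread has read and executed, which marks where the next transaction or event to be processed begins.
You can use this value when setting up a new slave from an existing one: the new slave passes it as the master_log_pos option of its change master to statement. The coordinates (Relay_Master_Log_File, Exec_Master_Log_Pos) in the master's binary log correspond to the coordinates (Relay_Log_File, Relay_Log_Pos) in the slave's relay log.
This is also why error 1594 can be repaired: pointing the slave back at (Relay_Master_Log_File, Exec_Master_Log_Pos) with change master to makes it discard the damaged relay log and fetch the events again from the master, starting at the last executed position. The following script automates that.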
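The statement that such a repair needs can be sketched as follows. This is an illustration rather than the author's exact method: the function name `build_change_master` is hypothetical, and it only sets the log coordinates, relying on CHANGE MASTER TO keeping the options that are not specified (host, user, password).

```shell
# Hypothetical helper: print a CHANGE MASTER TO statement that restarts
# replication from the SQL thread's last executed coordinates.
# $1 = Relay_Master_Log_File, $2 = Exec_Master_Log_Pos
build_change_master() {
    local file="$1" pos="$2"
    # Refuse an empty or non-numeric position (for example when the
    # status query failed) instead of emitting a broken statement.
    case $pos in
        ''|*[!0-9]*) echo "invalid position: $pos" >&2; return 1 ;;
    esac
    printf "CHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$file" "$pos"
}

# Values taken from the status output above.
build_change_master mysql-bin.000027 234663436
```

The statement must be run between `stop slave` and `start slave`. Note that the coordinates are the SQL thread's (Relay_Master_Log_File, Exec_Master_Log_Pos), not the I/O thread's (Master_Log_File, Read_Master_Log_Pos); using the latter would skip every event that was fetched but never executed.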
#!/bin/bash
# created by yangyi
# Usage: pass the slave port (or a quoted, space-separated list of ports) as $1.
[ -z "$1" ] && exit 0 || PORT=$1
# Add -p<password> here if the local root account requires one.
MYSQL="mysql -uroot -h127.0.0.1"
# Print the value of field $2 from the "show slave status \G" text in $1.
get_field()
{
    echo "$1" | awk -v key="$2:" '$1 == key {print $2; exit}'
}
repair_1594()
{
    local portlist=$1
    for my_port in $portlist
    do
        # Query the status once and reuse it for every field.
        status=$(${MYSQL} -P${my_port} -Ae "show slave status \G" 2>/dev/null)
        Last_SQL_Errno=$(get_field "$status" Last_SQL_Errno)
        echo ${Last_SQL_Errno}
        # Leave the instance alone unless it reports error 1594.
        [ "${Last_SQL_Errno}" = "1594" ] || continue
        Master_Host=$(get_field "$status" Master_Host)
        Master_Port=$(get_field "$status" Master_Port)
        Relay_Master_Log_File=$(get_field "$status" Relay_Master_Log_File)
        Exec_Master_Log_Pos=$(get_field "$status" Exec_Master_Log_Pos)
        # Restart from the last position the SQL thread executed.
        sql="change master to master_host='${Master_Host}',master_port=${Master_Port},master_user='rep',master_password='yangyi@rac1',master_log_file='${Relay_Master_Log_File}',master_log_pos=${Exec_Master_Log_Pos}"
        # Everything inside -e must be SQL; the pause is the shell sleep below.
        ${MYSQL} -P${my_port} -e "stop slave; ${sql}; start slave;"
        sleep 1
        is_OK=$(${MYSQL} -P${my_port} -e "show slave status \G" | grep Seconds_Behind_Master | awk '{print $2}')
        # Seconds_Behind_Master is NULL while replication is broken, and bash
        # evaluates [[ NULL -ge 0 ]] as true, so require digits instead.
        if [[ ${is_OK} =~ ^[0-9]+$ ]]
        then
            echo "instance : $my_port is recovered !!!!"
        else
            echo "instance : $my_port is not OK, PLEASE CHECK MANUALLY !!!!"
        fi
    done
}
repair_1594 "$PORT"
exit 0
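The final check in the script relies on Seconds_Behind_Master alone. A slightly stricter check, sketched below under the assumption that the three values have already been extracted from show slave status, also requires both replication threads to be running. The function name `replication_healthy` is hypothetical. In bash, a numeric test such as `[[ $lag -ge 0 ]]` can succeed when `$lag` is NULL, because the word is evaluated as an unset variable with value 0, so the sketch matches digits explicitly.

```shell
# Hypothetical helper: succeed only when both replication threads are
# running and the lag is a real number.
# $1 = Slave_IO_Running, $2 = Slave_SQL_Running, $3 = Seconds_Behind_Master
replication_healthy() {
    local io="$1" sql="$2" lag="$3"
    [ "$io" = "Yes" ] && [ "$sql" = "Yes" ] || return 1
    case $lag in
        ''|*[!0-9]*) return 1 ;;   # NULL or empty: not replicating
    esac
    return 0
}

# Example with the values from the failing status output above.
if replication_healthy Yes No NULL; then
    echo "replication is running"
else
    echo "replication still needs attention"
fi
```

A large but numeric Seconds_Behind_Master still counts as healthy here; the slave is simply catching up on the events it fetched again.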