MySQL suddenly crashes on the test server

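I. Background

It was another sunny Friday, and an otherwise uneventful day was suddenly interrupted by a burst of noise. Testers and developers started complaining that the test server was down and unreachable, so we went looking for the problem. The service logs showed that connections to the MySQL service were timing out.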

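II. Investigation

We logged in to the server that hosts MySQL and found that MySQL had gone down. Running the restart command produced the following output: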

[root@mysql etc]# service mysqld restart
Redirecting to /bin/systemctl restart mysqld.service
Job for mysqld.service failed because the control process exited with error code. See "systemctl status mysqld.service" and "journalctl -xe" for details.

[root@mysql etc]# systemctl status mysqld.service
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Fri 2021-09-10 16:25:29 CST; 7s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 111951 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=1/FAILURE)
  Process: 111928 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 7119 (code=exited, status=2)

Sep 10 16:25:29  systemd[1]: Failed to start MySQL Server.
Sep 10 16:25:29  systemd[1]: Unit mysqld.service entered failed state.
Sep 10 16:25:29  systemd[1]: mysqld.service failed.
Sep 10 16:25:29  systemd[1]: mysqld.service holdoff time over, scheduling restart.
Sep 10 16:25:29  systemd[1]: Stopped MySQL Server.
Sep 10 16:25:29  systemd[1]: start request repeated too quickly for mysqld.service
Sep 10 16:25:29  systemd[1]: Failed to start MySQL Server.
Sep 10 16:25:29  systemd[1]: Unit mysqld.service entered failed state.
Sep 10 16:25:29  systemd[1]: mysqld.service failed.
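
The status output above reports Result: start-limit, which means systemd gave up after several failed starts in quick succession, so the real cause has to come from MySQL's own log. As a minimal sketch, assuming the output format shown above, a check script could detect that state as follows. The function names are made up for illustration and are not part of systemd or MySQL.

```python
import re

# Matches the "Active:" line of `systemctl status` output, for example:
#   Active: failed (Result: start-limit) since Fri 2021-09-10 16:25:29 CST; 7s ago
ACTIVE_RE = re.compile(
    r"^\s*Active:\s+(?P<state>\S+)\s+\((?P<detail>[^)]*)\)", re.MULTILINE
)

SAMPLE_STATUS = """\
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Fri 2021-09-10 16:25:29 CST; 7s ago
"""


def parse_active_line(status_text):
    """Return (state, detail) from systemctl status text, or None if absent."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group("state"), match.group("detail")


def hit_start_limit(status_text):
    """True when the unit failed because systemd stopped retrying."""
    return parse_active_line(status_text) == ("failed", "Result: start-limit")


if __name__ == "__main__":
    print(parse_active_line(SAMPLE_STATUS))
    print(hit_start_limit(SAMPLE_STATUS))
```

When this returns True, restarting again will not help until the error in the MySQL log is dealt with.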

We then checked the MySQL error log and found the following:

2021-09-10 15:03:25 0x7f497e8c5700  InnoDB: Assertion failure in thread 139953632466688 in file ut0ut.cc line 942
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
07:03:25 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=438
max_threads=1000
thread_count=405
connection_count=405
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 405574 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f4970035690
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f497e8c4e30 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xf07d6b]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x7b97f1]
/lib64/libpthread.so.0(+0xf5d0)[0x7f49ad4495d0]
/lib64/libc.so.6(gsignal+0x37)[0x7f49abe33207]
/lib64/libc.so.6(abort+0x148)[0x7f49abe348f8]
/usr/sbin/mysqld[0x789b14]
/usr/sbin/mysqld(_ZN2ib5fatalD1Ev+0xfd)[0x10dea2d]
/usr/sbin/mysqld(_Z18os_file_flush_funci+0x220)[0xfc1c60]
/usr/sbin/mysqld(_Z9fil_flushm+0x28a)[0x11a0d2a]
/usr/sbin/mysqld(_Z15log_write_up_tomb+0xa0e)[0xf9e6ae]
/usr/sbin/mysqld(_Z29trx_commit_complete_for_mysqlP5trx_t+0x76)[0x10c74b6]
/usr/sbin/mysqld[0xf48a0d]
/usr/sbin/mysqld(_Z13ha_commit_lowP3THDbb+0x8c)[0x80667c]
/usr/sbin/mysqld(_ZN12TC_LOG_DUMMY6commitEP3THDb+0x14)[0xd79264]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDbb+0x131)[0x806f21]
/usr/sbin/mysqld(_Z17trans_commit_stmtP3THD+0x32)[0xd7afc2]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x39f2)[0xcd0ec2]
/usr/sbin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3dd)[0xcd3dad]
/usr/sbin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0xaba)[0xcd494a]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x19f)[0xcd63df]
/usr/sbin/mysqld(handle_connection+0x290)[0xd98320]
/usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0x1280284]
/lib64/libpthread.so.0(+0x7dd5)[0x7f49ad441dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f49abefaead]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f4970114880): INSERT INTO healthy_week_main_energy  ( id, entity_id, entity_name, sub_entity_id, fat_standard, fat_actual, protein_standard, protein_actual, protein_prop_standard, animal_protein_prop_actual, legume_protein_prop_actual, dark_vegetable_standard, dark_vegetable_actual, sync_time )  VALUES  ( '79adad9dd1c2b92533c42b2e64cc01f5', 'C538173C9C7B46F188EDE6B29A555FF1', '                        ', 'default', '25-30', '0.00', '19', '50', '50', '0.00', '0.00', '50', '41.50', '2021-09-10 14:59:43.65' )
Connection ID (thread ID): 124715
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
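
The crash report above is long, but only a few lines carry the signature: the InnoDB assertion location, the signal number (6 is SIGABRT), and the named mysqld stack frames. Here the frames show the abort coming from os_file_flush_func while the log was being written during a commit. The following is a rough sketch, assuming the log layout shown above, of pulling those fields out with regular expressions. The function name summarize_crash and the dictionary keys are illustrative choices only.

```python
import re

ASSERT_RE = re.compile(
    r"InnoDB: Assertion failure in thread \d+ in file (\S+) line (\d+)"
)
SIGNAL_RE = re.compile(r"mysqld got signal (\d+)")
# Frames from the mysqld binary that carry a symbol name, for example:
#   /usr/sbin/mysqld(_Z9fil_flushm+0x28a)[0x11a0d2a]
# Frames without a symbol, such as /usr/sbin/mysqld[0x789b14], are skipped.
FRAME_RE = re.compile(
    r"^\S*mysqld\(([A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-f]+\)", re.MULTILINE
)

SAMPLE_LOG = """\
2021-09-10 15:03:25 0x7f497e8c5700  InnoDB: Assertion failure in thread 139953632466688 in file ut0ut.cc line 942
07:03:25 UTC - mysqld got signal 6 ;
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xf07d6b]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x7b97f1]
/lib64/libc.so.6(abort+0x148)[0x7f49abe348f8]
/usr/sbin/mysqld[0x789b14]
/usr/sbin/mysqld(_ZN2ib5fatalD1Ev+0xfd)[0x10dea2d]
/usr/sbin/mysqld(_Z18os_file_flush_funci+0x220)[0xfc1c60]
/usr/sbin/mysqld(_Z9fil_flushm+0x28a)[0x11a0d2a]
"""


def summarize_crash(log_text):
    """Collect the assertion location, signal number and mysqld frames."""
    assertion = ASSERT_RE.search(log_text)
    signal = SIGNAL_RE.search(log_text)
    return {
        "assert_file": assertion.group(1) if assertion else None,
        "assert_line": int(assertion.group(2)) if assertion else None,
        "signal": int(signal.group(1)) if signal else None,
        "frames": FRAME_RE.findall(log_text),  # mangled C++ names, top first
    }


if __name__ == "__main__":
    for key, value in summarize_crash(SAMPLE_LOG).items():
        print(key, value)
```

The frame names are still mangled C++ symbols; a tool such as c++filt can turn them into readable signatures.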

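Time for a quick search on Baidu.

Article: http://www.linuxeye.com/database/2830.html (recovering an InnoDB database after MySQL loses power unexpectedly)

After analyzing the log, we concluded that the database could not restart because the ibdata1 file was corrupted, so crash recovery failed on every startup.
We therefore needed to skip the recovery steps by editing my.cnf and adding the following under the [mysqld] section: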

innodb_force_recovery = 6
innodb_purge_threads = 0

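innodb_force_recovery can be set to a value from 1 to 6; each larger value includes the effects of all the smaller ones.
1. (SRV_FORCE_IGNORE_CORRUPT): ignore corrupt pages that are detected.
2. (SRV_FORCE_NO_BACKGROUND): prevent the master thread and any purge threads from running, which avoids a crash that would otherwise occur during a purge operation.
3. (SRV_FORCE_NO_TRX_UNDO): do not run transaction rollbacks after crash recovery.
4. (SRV_FORCE_NO_IBUF_MERGE): do not perform insert buffer merge operations.
5. (SRV_FORCE_NO_UNDO_LOG_SCAN): do not look at the undo logs on startup; InnoDB treats uncommitted transactions as committed.
6. (SRV_FORCE_NO_LOG_REDO): do not perform the redo log roll-forward.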
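
To make the cumulative behaviour of the levels concrete, here is a small sketch that models them as an ordered enumeration. The constant names follow the MySQL manual; the class and helper names are invented for this example. The read-only threshold of 4 reflects the MySQL 5.7 manual's statement that values of 4 or greater place InnoDB in read-only mode, so treat it as an assumption for other versions.

```python
from enum import IntEnum


class ForceRecovery(IntEnum):
    """innodb_force_recovery levels, named as in the MySQL manual."""
    SRV_FORCE_IGNORE_CORRUPT = 1
    SRV_FORCE_NO_BACKGROUND = 2
    SRV_FORCE_NO_TRX_UNDO = 3
    SRV_FORCE_NO_IBUF_MERGE = 4
    SRV_FORCE_NO_UNDO_LOG_SCAN = 5
    SRV_FORCE_NO_LOG_REDO = 6


def active_behaviours(level):
    """Names of every behaviour in effect at `level` (levels are cumulative)."""
    return [mode.name for mode in ForceRecovery if mode <= level]


def is_read_only(level):
    """Assumption from the 5.7 manual: level 4 or greater means read-only."""
    return level >= ForceRecovery.SRV_FORCE_NO_IBUF_MERGE


if __name__ == "__main__":
    for level in range(1, 7):
        print(level, is_read_only(level), active_behaviours(level))
```

Seen this way, the value 6 used here switches on every behaviour in the list at once, which is why it is usually worth trying the lowest level that lets the server start.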

So we added the settings above to my.cnf and restarted, and MySQL came back up.
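
The my.cnf edit described above can also be scripted. The following is a minimal sketch, assuming a simple option file with a single [mysqld] header; it works on the file's text rather than on the file itself, and the function name add_mysqld_settings is hypothetical.

```python
def add_mysqld_settings(cnf_text, settings):
    """Return cnf_text with `key = value` lines added under [mysqld].

    Limitation of this sketch: it does not look for existing lines with
    the same key further down the section.
    """
    lines = cnf_text.splitlines()
    new_lines = [f"{key} = {value}" for key, value in settings.items()]
    for index, line in enumerate(lines):
        if line.strip().lower() == "[mysqld]":
            # Insert directly after the section header.
            lines[index + 1:index + 1] = new_lines
            break
    else:
        # No [mysqld] section found, so append one.
        lines += ["[mysqld]"] + new_lines
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = "[client]\nport = 3306\n[mysqld]\ndatadir = /var/lib/mysql\n"
    recovery = {"innodb_force_recovery": 6, "innodb_purge_threads": 0}
    print(add_mysqld_settings(original, recovery), end="")
```

These settings are meant to be temporary: the MySQL manual describes forced recovery as a way to start the server so that tables can be dumped, after which the option is removed again.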

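III. Summary

Later we asked a colleague, who confirmed that a host in the server room had just been restarted. The power loss corrupted InnoDB's files, so we need to be more careful around power cuts in future. Fortunately, MySQL has its own recovery mechanism.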
