Background:
A customer reported that SQL Server went down at around 3 a.m., and the cause of the outage needs to be investigated.
1> The SQL Server error log (newest entries first) is as follows:
09/29/2019 23:38:00,Logon,Unknown,Login failed for user 'oasa'. Reason: Failed to open the explicitly specified database 'dbname'. [CLIENT: xx.x.xx.xxx]
09/29/2019 23:38:00,Logon,Unknown,Error: 18456, Severity: 14, State: 38.
09/29/2019 23:37:59,Server,Unknown,This instance of SQL Server last reported using a process ID of 1956 at 2019/9/29 23:36:38 (local) 2019/9/29 15:36:38 (UTC). This is an informational message only; no user action is required.
09/29/2019 23:37:18,Server,Unknown,Using conventional memory in the memory manager.
09/29/2019 23:37:18,Server,Unknown,Detected 65535 MB of RAM. This is an informational message; no user action is required.
09/29/2019 23:37:17,Server,Unknown,Microsoft SQL Server 2012 - 11.0.5058.0 (X64) May 14 2014 18:34:29 Copyright (c) Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)
09/29/2019 23:36:39,spid13s,Unknown,The SQL Server Network Interface library could not deregister the Service Principal Name (SPN) [ MSSQLSvc/KWG-KG-OADB01.kwgproperty.com:1433 ] for the SQL Server service. Error: 0xffffffff, state: 63. Administrator should deregister this SPN manually to avoid client authentication errors.
09/29/2019 23:36:39,spid13s,Unknown,The SQL Server Network Interface library could not deregister the Service Principal Name (SPN) [ MSSQLSvc/KWG-KG-OADB01.kwgproperty.com ] for the SQL Server service. Error: 0xffffffff, state: 63. Administrator should deregister this SPN manually to avoid client authentication errors.
09/29/2019 23:36:39,spid7s,Unknown,SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
09/29/2019 23:36:39,spid7s,Unknown,SQL Server shutdown has been initiated
09/29/2019 23:36:38,spid7s,Unknown,.NET Framework runtime has been stopped.
09/29/2019 23:36:38,spid21s,Unknown,The current event was not reported to the Windows Events log. Operating system error = (null). You may need to clear the Windows Events log if it is full.
09/29/2019 23:36:38,spid21s,Unknown,Error: 17054, Severity: 16, State: 1.
09/29/2019 23:36:38,spid21s,Unknown,Service Broker manager has shut down.
09/29/2019 23:36:38,Server,Unknown,SQL Server is terminating because of a system shutdown. This is an informational message only. No user action is required.    <-- key error
09/29/2019 04:00:13,spid414,Unknown,AppDomain 2 (mssqlsystemresource.dbo[runtime].1) created.
09/29/2019 03:09:56,Server,Unknown,Software Usage Metrics is enabled.
09/29/2019 03:09:29,spid7s,Unknown,Launched startup procedure 'sp_MSrepl_startup'.
09/29/2019 03:09:29,spid109,Unknown,Using 'xplog70.dll' version '2011.110.5058' to execute extended store

Key error message: SQL Server is terminating because of a system shutdown
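When the exported log is long, the shutdown-related entries can be picked out with a script rather than by eye. The following is a minimal Python sketch, assuming the log has been exported as comma-separated text with the four columns seen above (date, source, severity, message). The function name `find_shutdown_entries` and the marker list are illustrative assumptions, not part of any SQL Server tooling.

```python
# Message fragments that indicate SQL Server was asked to stop.
# Both strings appear in the log above; treating them as "the" markers is an assumption.
SHUTDOWN_MARKERS = (
    "SQL Server is terminating because of a system shutdown",
    "SQL Server shutdown has been initiated",
)


def find_shutdown_entries(log_text):
    """Return (timestamp, source, message) tuples for entries that mention a shutdown."""
    hits = []
    for line in log_text.splitlines():
        # Split on the first three commas only; the message itself may contain commas.
        parts = line.split(",", 3)
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        timestamp, source, _severity, message = (p.strip() for p in parts)
        if any(marker in message for marker in SHUTDOWN_MARKERS):
            hits.append((timestamp, source, message))
    return hits


# A three-line excerpt of the log above, used as sample input.
sample = """\
09/29/2019 23:36:39,spid7s,Unknown,SQL Server shutdown has been initiated
09/29/2019 23:36:38,spid21s,Unknown,Service Broker manager has shut down.
09/29/2019 23:36:38,Server,Unknown,SQL Server is terminating because of a system shutdown. This is an informational message only. No user action is required.
"""

if __name__ == "__main__":
    for entry in find_shutdown_entries(sample):
        print(entry)
```

The timestamps of the matching entries give the window to look at in the operating system's own logs.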
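2> The log shows that SQL Server stopped because the operating system itself was shutting down, so the next step is to check the Windows Server System event log.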
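The suspicion was that the shutdown was related to the server's automatic update check. After automatic update checking was disabled, the database service ran normally for more than ten days with no alerts, so it is reasonably certain that the outage was caused by the server's automatic updates.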
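The System log check in step 2 can also be done from the command line. Below is a minimal Python sketch that builds a `wevtutil` query for the newest shutdown-related events in the System log. The shortlist of event IDs (1074, 6006, 6008) and the helper names `build_query` and `build_command` are assumptions for illustration; the original investigation does not say which events were inspected.

```python
import subprocess
import sys

# Event IDs often inspected after an unexpected restart (an assumed shortlist):
#   1074 = a process or user initiated a shutdown/restart
#   6006 = the Event Log service was stopped (clean shutdown)
#   6008 = the previous system shutdown was unexpected
SHUTDOWN_EVENT_IDS = (1074, 6006, 6008)


def build_query(event_ids):
    """Build the event-log XPath filter that matches any of the given event IDs."""
    id_filter = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    return f"*[System[({id_filter})]]"


def build_command(event_ids=SHUTDOWN_EVENT_IDS, max_events=20):
    """Return the wevtutil argument list that prints the newest matching System events."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{build_query(event_ids)}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # reverse direction: newest events first
        "/f:text",           # human-readable text instead of XML
    ]


# Only run the query on Windows, where wevtutil is available.
if __name__ == "__main__" and sys.platform == "win32":
    result = subprocess.run(
        build_command(), capture_output=True, text=True, errors="replace"
    )
    print(result.stdout)
```

If automatic updates did trigger the restart, one would expect an event 1074 near the shutdown time whose process name and reason text point at the update components. That is an expectation to verify against the actual events, not something the original log excerpt shows.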