(Check out my Pluralsight online training course: SQL Server: Detecting and Correcting Database Corruption.)
This is a question that I was sent over email that I thought would make a good post, and I’m waiting to do a user group presentation in Australia at 2am, so this will keep me busy :-)
Imagine you come to work in the morning and notice that some new rows have been entered into the msdb.dbo.suspect_pages table during the night. Usually the first thing you’re going to do is run DBCC CHECKDB, but if your database is a few TB, that could be several hours before you know where the problem is, and which table may have lost data. You want to find out which table is involved as soon as possible so you can explore your disaster recovery options.
Another scenario is troubleshooting a poorly performing query workload, running my script to look at the currently waiting threads using sys.dm_os_waiting_tasks, seeing a lot of PAGELATCH_EX waits and needing to figure out which table is involved from the page ID in the resource_description column in the sys.dm_os_waiting_tasks output.
Going back to the first scenario, getting the data from the suspect_pages table is easy:
1
2
|
SELECT * FROM [msdb] . [dbo] . [suspect_pages] ;
GO |
database_id file_id page_id event_type error_count last_update_date ----------- ----------- -------------------- ----------- ----------- ----------------------- 6 1 295 2 2 2014-09-25 01:18:22.910 |
Finding the table name requires first using DBCC PAGE. The syntax for DBCC PAGE is:
dbcc page ( {'dbname' | dbid}, filenum, pagenum [, printopt={0|1|2|3} ]) |
You can just use print option 0, as that just displays the page’s header. You also must enable trace flag 3604 to get any output from DBCC PAGE – it’s perfectly safe. So taking the values from our suspect_pages output, that gives us:
1
2
3
4
|
DBCC TRACEON ( 3604 ) ;
DBCC PAGE ( 6 , 1 , 295 , 0 ) ;
DBCC TRACEOFF ( 3604 ) ;
GO |
PAGE: (1:295) BUFFER: BUF @0x00000004FD8C7980 bpage = 0x00000004A2D14000 bhash = 0x0000000000000000 bpageno = (1:295) bdbid = 6 breferences = 0 bcputicks = 0 bsampleCount = 0 bUse1 = 55116 bstat = 0x809 blog = 0x15ab215a bnext = 0x0000000000000000 PAGE HEADER: Page @0x00000004A2D14000 m_pageId = (1:295) m_headerVersion = 17 m_type = 17 m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x8200 m_objId (AllocUnitId.idObj) = 84 m_indexId (AllocUnitId.idInd) = 256 Metadata: AllocUnitId = 72057594043432960 Metadata: PartitionId = 72057594039042048 Metadata: IndexId = 0 Metadata: ObjectId = 245575913 m_prevPage = (0:0) m_nextPage = (0:0) pminlen = 8008 m_slotCnt = 1 m_freeCnt = 83 m_freeData = 8107 m_reservedCnt = 0 m_lsn = (35:200:9) m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0 m_tornBits = 1093512791 DB Frag ID = 1 Allocation Status GAM (1:2) = ALLOCATED SGAM (1:3) = ALLOCATED PFS (1:1) = 0x64 MIXED_EXT ALLOCATED 100_PCT_FULL DIFF (1:6) = CHANGED ML (1:7) = NOT MIN_LOGGED |
We’re interested in the output beginning Metadata:. Those fields are not stored on the page itself. When I rewrote DBCC PAGE for SQL Server 2005, I added the Metadata:output to make it easier to find the object and index ID that the page is part of (as these used to be the m_objId and m_indexId fields in SQL Server 7.0 and 2000).
The Metadata: ObjectId field is what we want. If you see it is 99, then stop as that means the damaged page is part of the allocation system and not part of a table and you’ll need to wait for DBCC CHECKDB to complete to know the extent of the damage.
If you see the ObjectId is 0, that means there was no metadata found. This could be because:
- The table that the page was part of has been deleted since the page corruption was logged
- The system catalogs are corrupt in some way
- The page is corrupt and so incorrect values were used to look up the metadata
In any case, you’ll need to wait for DBCC CHECKDB to complete to know the extent of the damage.
If the ObjectId is not 0 or 99, we can plug it into the OBJECT_NAME function to give us the name of the table:
1
2
|
SELECT OBJECT_NAME ( 245575913 ) ;
GO |
-------------------------------------------------------------------------------------------------------------------------------- NULL |
If you get the result above, then there are two possibilities:
- You are in the wrong database context
- The metadata for the database is corrupt, so wait for DBCC CHECKDB to complete
It’s most likely #1 that’s the problem, at least in my experience with helping people out. You can get the database name by taking the database_id in the suspect_pagesoutput and plugging it into the DB_NAME function. Go into the correct database context and try again.
1
2
3
4
|
USE [company] ;
GO SELECT OBJECT_NAME ( 245575913 ) ;
GO |
-------------------------------------------------------------------------------------------------------------------------------- CustomerNames |
So there you go – hope this helps!
PS1 Kenneth Fisher commented with some neat code that will do the job on 2012 and 2014 using the new sys.dm_db_database_page_allocations DMF – check it out. That will work as long as there aren’t any metadata or allocation bitmap corruptions.
PS2 Wayne Sheffield reminded me over email and in a comment that DBCC PAGE doesn’t need 3604 if you use the WITH TABLERESULTS option and programmatically crack the DBCC PAGE results.
Related Posts
- Corruption: Last resorts that people try first…
- EMERGENCY-mode repair: the very, very last resort
- Disaster recovery 101: fixing metadata corruption without a backup
- Using the Dedicated Admin Connection to fix Msg 8992: corrupt system tables
- Creating, detaching, re-attaching, and fixing a SUSPECT database
命令语法信息
So what’s the syntax?
dbcc traceon(3604)
dbcc page ( {‘dbname’ | dbid}, filenum, pagenum [, printopt={0|1|2|3} ])
The filenum and pagenum parameters are taken from the page IDs that come from various system tables and appear in DBCC or other system error messages. A page ID of, say, (1:354) has filenum = 1 and pagenum = 354.
The printopt parameter has the following meanings:
- 0 – print just the page header
- 1 – page header plus per-row hex dumps and a dump of the page slot array (unless its a page that doesn’t have one, like allocation bitmaps)
- 2 – page header plus whole page hex dump
- 3 – page header plus detailed per-row interpretation