How can a corrupted partition in TempDB result in DBCC CHECKDB reporting no problems?



One of our SQL Servers recently reported the following error:

DATE/TIME:  2/25/2013 9:15:14 PM

DESCRIPTION:    No catalog entry found for partition ID 9079262474267394048
     in database 2. The metadata is inconsistent. Run DBCC CHECKDB to check for
     a metadata corruption.

Less than 15 minutes later, I connected to the server and ran:

SELECT name
FROM sys.databases
WHERE database_id = 2;

which returned 'tempdb'. I then ran:

DBCC CHECKDB ('tempdb') WITH NO_INFOMSGS, TABLERESULTS;

which returned no results, indicating no problems with the affected database.
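As a follow-up check, one could look for the partition ID from the error message in tempdb's catalog. This is only a sketch, assuming the ID quoted in the error is the one of interest; an empty result would suggest the object it belonged to no longer exists.

```sql
-- Partition ID copied from the error message above.
DECLARE @partition_id bigint = 9079262474267394048;

-- Look for a matching catalog row in tempdb.
SELECT p.partition_id, p.object_id, p.index_id, p.rows
FROM tempdb.sys.partitions AS p
WHERE p.partition_id = @partition_id;
```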

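How could corruption in a database result in the error message above, yet DBCC CHECKDB not report a problem? I assumed that a failed page checksum calculation would mark the page as suspect, leaving any object that references the page unusable, but I must be wrong.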
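Whether checksum failures can be detected at all depends on the database's page verification setting. A minimal check, assuming database ID 2 as in the error message:

```sql
-- page_verify_option_desc is NONE, TORN_PAGE_DETECTION, or CHECKSUM.
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE database_id = 2;
```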

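Once a page has been marked "suspect", how can it be marked as not suspect, fixed, or reusable, such that DBCC CHECKDB does not report any problem with the page in question?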
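SQL Server records pages that hit 823/824 errors in msdb.dbo.suspect_pages, so that table is one place to look for evidence. The query below is a sketch, again assuming database ID 2. In the documented event_type values, 1 to 3 indicate errors (823 or unspecified 824, bad checksum, torn page), while 4, 5, and 7 indicate a page that was restored, repaired by DBCC, or deallocated by DBCC.

```sql
-- List any pages recorded as suspect for tempdb (database_id = 2).
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages
WHERE database_id = 2
ORDER BY last_update_date DESC;
```

If this returns no rows, no page in that database is currently recorded as suspect.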


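Edit: 2013-02-27 13:24

Just for fun, I attempted to recreate the corruption in TempDB, on the assumption that a #temp table was the culprit.

However, since the SINGLE_USER option cannot be set on TempDB, I cannot use DBCC WRITEPAGE to corrupt a page, and therefore cannot force corruption in TempDB.

Instead of DBCC WRITEPAGE, one could take a database offline and use a hex editor to modify random bytes in the database file. Of course, that does not work on TempDB either, since the database engine cannot run with TempDB offline.

If you stop the instance, TempDB is automatically recreated at the next startup, so that won't work either.

If anyone can think of a way to reproduce this corruption, I am willing to do further research.

To test the hypothesis that DROP TABLE cannot fix a corrupted page, I created a test database, used the following script to corrupt a page, and then attempted to drop the affected table. The result was that the table could not be dropped; I had to RESTORE DATABASE Testdb PAGE = ''... in order to recover the affected page. I assume that had I changed some other part of the page in question, the page might have been corrected by DROP TABLE or TRUNCATE TABLE.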

/* ********************************************* */
/* ********************************************* */
/* DO NOT USE THIS CODE ON A PRODUCTION SYSTEM!! */
/* ********************************************* */
/* ********************************************* */
USE Master;
GO
ALTER DATABASE test SET RECOVERY FULL;
BACKUP DATABASE Test 
    TO DISK = 'Test_db.bak'
    WITH FORMAT
        , INIT
        , NAME = 'Test Database backup'
        , SKIP
        , NOREWIND
        , NOUNLOAD
        , COMPRESSION
        , STATS = 1;
BACKUP LOG Test
    TO DISK = 'Test_log.bak'
    WITH FORMAT
        , INIT
        , NAME = 'Test Log backup'
        , SKIP
        , NOREWIND
        , NOUNLOAD
        , COMPRESSION
        , STATS = 1;
GO
ALTER DATABASE test SET SINGLE_USER;
GO
USE Test;
GO
IF EXISTS (SELECT name FROM sys.key_constraints WHERE name = 'PK_temp') 
    ALTER TABLE temp DROP CONSTRAINT PK_temp;
IF EXISTS (SELECT name FROM sys.default_constraints 
    WHERE name = 'DF_temp_testdata') 
    ALTER TABLE temp DROP CONSTRAINT DF_temp_testdata;
IF EXISTS (SELECT name FROM sys.tables WHERE name = 'temp') 
DROP TABLE temp;
GO
CREATE TABLE temp
(
    tempID INT NOT NULL CONSTRAINT PK_temp PRIMARY KEY CLUSTERED IDENTITY(1,1)
    , testdata uniqueidentifier CONSTRAINT DF_temp_testdata DEFAULT (NEWID())
);
GO

/* insert 10 rows into temp */
INSERT INTO temp default values;
GO 10 

/* get some necessary parameters */
DECLARE @partitionID bigint;
DECLARE @dbid smallint;
DECLARE @tblid int;
DECLARE @indexid int;
DECLARE @pageid bigint;
DECLARE @offset INT;
DECLARE @fileid INT;

SELECT @dbid = db_id('Test')
    , @tblid = t.object_id
    , @partitionID = p.partition_id
    , @indexid = i.index_id
FROM sys.tables t
    INNER JOIN sys.partitions p ON t.object_id = p.object_id
    INNER JOIN sys.indexes i on t.object_id = i.object_id
WHERE t.name = 'temp';

SELECT TOP(1) @fileid = file_id 
FROM sys.database_files;

SELECT TOP(1) @pageid = allocated_page_page_id 
FROM sys.dm_db_database_page_allocations(@dbid, @tblid, null, @partitionID, 'LIMITED')
WHERE allocation_unit_type = 1;

/* get a random offset into the 8KB page */
SET @offset = FLOOR(rand() * 8192);
SELECT @offset;

/* 0x74 below is the letter 't' */
DBCC WRITEPAGE (@dbid, @fileid, @pageid, @offset, 1, 0x74, 1);


SELECT * FROM temp;

Msg 824, Level 24, State 2, Line 36
SQL Server detected a logical consistency-based I/O error: incorrect checksum
 (expected: 0x298b2ce9; actual: 0x2ecb2ce9). It occurred during a read of page 
 (1:1054) in database ID 7 at offset 0x0000000083c000 in file 'C:\SQLServer
 \MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf'.  Additional messages in the SQL 
 Server error log or system event log may provide more detail. This is a
 severe error condition that threatens database integrity and must be
 corrected immediately. Complete a full database consistency check
 (DBCC CHECKDB). This error can be caused by many factors; for more
 information, see SQL Server Books Online.

At this point you have been disconnected from the database engine, so reconnect to continue.

USE Test;
DBCC CHECKDB WITH NO_INFOMSGS, TABLERESULTS;

The corruption is reported here.

DROP TABLE temp;

Msg 824, Level 24, State 2, Line 36
SQL Server detected a logical consistency-based I/O error: incorrect checksum
 (expected: 0x298b2ce9; actual: 0x2ecb2ce9). It occurred during a read of page 
 (1:1054) in database ID 7 at offset 0x0000000083c000 in file 'C:\SQLServer
 \MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf'.  Additional messages in the SQL 
 Server error log or system event log may provide more detail. This is a
 severe error condition that threatens database integrity and must be
 corrected immediately. Complete a full database consistency check
 (DBCC CHECKDB). This error can be caused by many factors; for more
 information, see SQL Server Books Online.

The corruption is reported here, and DROP TABLE fails.

/* assuming ENTERPRISE or DEVELOPER edition of SQL Server,
    I can use PAGE='' to restore a single page from backup */
USE Master;
RESTORE DATABASE Test PAGE = '1:1054' FROM DISK = 'Test_db.bak'; 
BACKUP LOG Test TO DISK = 'Test_log_1.bak';

RESTORE LOG Test FROM DISK = 'Test_log.bak';
RESTORE LOG Test FROM DISK = 'Test_log_1.bak';
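For comparison, the page-restore sequence shown in the SQL Server documentation restores the page and the existing log backups WITH NORECOVERY, then takes a new log backup and restores it WITH RECOVERY. The sketch below applies that order to the names used above; the file name Test_log_tail.bak is hypothetical.

```sql
USE master;
-- Restore the damaged page and leave it ready for log restores.
RESTORE DATABASE Test PAGE = '1:1054'
    FROM DISK = 'Test_db.bak'
    WITH NORECOVERY;
-- Apply the log backup taken before the page was damaged.
RESTORE LOG Test FROM DISK = 'Test_log.bak' WITH NORECOVERY;
-- Back up the log written since then (hypothetical file name).
BACKUP LOG Test TO DISK = 'Test_log_tail.bak';
-- Apply that backup and bring the restored page online.
RESTORE LOG Test FROM DISK = 'Test_log_tail.bak' WITH RECOVERY;
```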

Edit #2, adding the requested @@VERSION info.

SELECT @@VERSION;

Returns:

Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64) 
    Oct 19 2012 13:38:57 
    Copyright (c) Microsoft Corporation
    Enterprise Evaluation Edition (64-bit) on Windows NT 6.2 <X64> 
        (Build 9200: )

I am aware this is the Evaluation Edition; we have keys for the Enterprise Edition and will be doing an edition upgrade shortly.


2
FYI, -T3609 will preserve tempdb at startup (undocumented, but known).
Remus Rusanu

Answers:


3

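This is a known issue with a fix:

FIX: "No catalog entry found for partition ID in database" error when you use SQL Server 2012

Assume that you query the tempdb.sys.allocation_units table in Microsoft SQL Server 2012. When you use the NOLOCK hint in the query, or when the query runs under the READ UNCOMMITTED transaction isolation level, you receive the following intermittent 608 error message:

Error: 608 Severity: 16 State: 1
No catalog entry found for partition ID in database. The metadata is inconsistent. Run DBCC CHECKDB to check for a metadata corruption

Note: The DBCC CHECKDB command does not show any signs of database corruption.

Fixed in:

Your version (11.0.3000.0) is SQL Server 2012 SP1 RTM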
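For illustration, the query pattern the KB article describes might look like the first statement below. This is only a sketch of the pattern (a dirty read of tempdb's allocation metadata), not a guaranteed reproduction, since the error is described as intermittent. The second statement reports the build, to compare against the builds that contain the fix.

```sql
-- Dirty read of tempdb allocation metadata, the pattern named in the KB article.
SELECT au.allocation_unit_id, au.type_desc, au.container_id, au.total_pages
FROM tempdb.sys.allocation_units AS au WITH (NOLOCK);

-- Build number and service-pack level of the running instance.
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('ProductLevel')   AS product_level;
```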


7

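Running CHECKDB against tempdb is not the same as running it against a user database.

MSDN:

Running DBCC CHECKDB against tempdb does not perform any allocation or catalog checks and must acquire shared table locks to perform table checks. This is because, for performance reasons, database snapshots are not available on tempdb. This means that the required transactional consistency cannot be obtained.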
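For comparison, on a user database the checks that make up DBCC CHECKDB can also be run individually. The sketch below assumes the Test database and temp table from the question's script, used here purely for illustration.

```sql
USE Test;
-- Allocation consistency (not performed by CHECKDB on tempdb).
DBCC CHECKALLOC WITH NO_INFOMSGS;
-- Catalog (metadata) consistency (not performed by CHECKDB on tempdb).
DBCC CHECKCATALOG WITH NO_INFOMSGS;
-- Structural integrity of a single table.
DBCC CHECKTABLE ('temp') WITH NO_INFOMSGS;
```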


6

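Yes, but in particular, catalog errors cannot be checked in TempDB. If possible, the SQL Server should be recycled. Per MSDN:

"Running DBCC CHECKCATALOG against tempdb does not perform any checks. This is because, for performance reasons, database snapshots are not available on tempdb. This means that the required transactional consistency cannot be obtained. Recycle the server to resolve any tempdb metadata issues."

MSDN article here: http://msdn.microsoft.com/en-us/library/ms186720.aspx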
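Because tempdb is recreated each time the instance starts, its creation date shows when it was last rebuilt. A minimal check, useful for confirming that the current tempdb is newer than the error:

```sql
-- create_date of tempdb reflects the most recent instance startup.
SELECT name, create_date
FROM sys.databases
WHERE name = 'tempdb';
```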

Licensed under cc by-sa 3.0 with attribution required.