One of our SQL Servers recently reported the following error:
DATE/TIME: 2/25/2013 9:15:14 PM
DESCRIPTION: No catalog entry found for partition ID 9079262474267394048
in database 2. The metadata is inconsistent. Run DBCC CHECKDB to check for
a metadata corruption.
Less than 15 minutes later, I connected to the server and ran:
SELECT name
FROM sys.databases
WHERE database_id = 2;
which returned 'tempdb'. I then ran:
DBCC CHECKDB ('tempdb') WITH NO_INFOMSGS, TABLERESULTS;
which returned no results, indicating no problems with the affected database.
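How can corruption in a database produce the error message above while DBCC CHECKDB reports no problem? I assumed that when a page checksum calculation fails, the page is marked suspect and any object referencing that page becomes inaccessible, but I must be wrong.
Once a page is marked "suspect", how can it be marked "not suspect", "fixed", or "reusable", or otherwise reach a state where DBCC CHECKDB reports no problem with the page in question?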
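To make the "suspect" question concrete, here is a minimal sketch of two checks that could be run on the instance. It assumes the error was recorded as an 823/824-class page error; the metadata error quoted above may not be logged this way, so an empty result from the second query would not be surprising. The table aliases are mine.

```sql
/* Which page-verify option does tempdb use? (CHECKSUM, TORN_PAGE_DETECTION or NONE) */
SELECT d.name
    , d.page_verify_option_desc
FROM sys.databases AS d
WHERE d.database_id = 2;

/* Pages SQL Server has recorded as suspect for database 2.
   event_type 1-3 are errors (823/824, bad checksum, torn page);
   4 = restored, 5 = repaired by DBCC, 7 = deallocated by DBCC. */
SELECT sp.database_id
    , sp.file_id
    , sp.page_id
    , sp.event_type
    , sp.error_count
    , sp.last_update_date
FROM msdb.dbo.suspect_pages AS sp
WHERE sp.database_id = 2
ORDER BY sp.last_update_date DESC;
```

A row whose event_type has moved to 4, 5 or 7 would be one way a previously suspect page ends up no longer being reported.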
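Edit: 2013-02-27 13:24
Just for fun, I tried to recreate the corruption in TempDB, on the assumption that a #temp table was the culprit.
However, since the SINGLE_USER option cannot be set on TempDB, DBCC WRITEPAGE cannot be used to corrupt a page, so I am unable to force corruption in TempDB.
Instead of using DBCC WRITEPAGE, one could take a database offline and use a hex editor to modify random bytes in the db file. Of course, that does not work on TempDB either, since the database engine cannot run with TempDB offline.
If you stop the instance, TempDB is automatically recreated at the next startup, so that won't do the trick either.
If anyone can think of a way to recreate this corruption, I'd be willing to do further research.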
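To test the hypothesis that DROP TABLE cannot fix a corrupted page, I created a test database, used the following script to corrupt a page, and then attempted to drop the affected table. The result was that the table could not be dropped; I had to RESTORE DATABASE Testdb PAGE = ''... in order to recover the affected page. I assume that if I had changed some other part of the page in question, the page might have been corrected with DROP TABLE or perhaps TRUNCATE TABLE.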
/* ********************************************* */
/* ********************************************* */
/* DO NOT USE THIS CODE ON A PRODUCTION SYSTEM!! */
/* ********************************************* */
/* ********************************************* */
USE Master;
GO
ALTER DATABASE test SET RECOVERY FULL;
BACKUP DATABASE Test
TO DISK = 'Test_db.bak'
WITH FORMAT
, INIT
, NAME = 'Test Database backup'
, SKIP
, NOREWIND
, NOUNLOAD
, COMPRESSION
, STATS = 1;
BACKUP LOG Test
TO DISK = 'Test_log.bak'
WITH FORMAT
, INIT
, NAME = 'Test Log backup'
, SKIP
, NOREWIND
, NOUNLOAD
, COMPRESSION
, STATS = 1;
GO
ALTER DATABASE test SET SINGLE_USER;
GO
USE Test;
GO
IF EXISTS (SELECT name FROM sys.key_constraints WHERE name = 'PK_temp')
ALTER TABLE temp DROP CONSTRAINT PK_temp;
IF EXISTS (SELECT name FROM sys.default_constraints
WHERE name = 'DF_temp_testdata')
ALTER TABLE temp DROP CONSTRAINT DF_temp_testdata;
IF EXISTS (SELECT name FROM sys.tables WHERE name = 'temp')
DROP TABLE temp;
GO
CREATE TABLE temp
(
tempID INT NOT NULL CONSTRAINT PK_temp PRIMARY KEY CLUSTERED IDENTITY(1,1)
, testdata uniqueidentifier CONSTRAINT DF_temp_testdata DEFAULT (NEWID())
);
GO
/* insert 10 rows into temp */
INSERT INTO temp default values;
GO 10
/* get some necessary parameters */
DECLARE @partitionID bigint;
DECLARE @dbid smallint;
DECLARE @tblid int;
DECLARE @indexid int;
DECLARE @pageid bigint;
DECLARE @offset INT;
DECLARE @fileid INT;
SELECT @dbid = db_id('Test')
, @tblid = t.object_id
, @partitionID = p.partition_id
, @indexid = i.index_id
FROM sys.tables t
INNER JOIN sys.partitions p ON t.object_id = p.object_id
INNER JOIN sys.indexes i ON t.object_id = i.object_id AND p.index_id = i.index_id
WHERE t.name = 'temp';
SELECT TOP(1) @fileid = file_id
FROM sys.database_files;
SELECT TOP(1) @pageid = allocated_page_page_id
FROM sys.dm_db_database_page_allocations(@dbid, @tblid, null, @partitionID, 'LIMITED')
WHERE allocation_unit_type = 1;
/* get a random offset into the 8KB page */
SET @offset = FLOOR(rand() * 8192);
SELECT @offset;
/* 0x74 below is the letter 't' */
DBCC WRITEPAGE (@dbid, @fileid, @pageid, @offset, 1, 0x74, 1);
SELECT * FROM temp;
Msg 824, Level 24, State 2, Line 36
SQL Server detected a logical consistency-based I/O error: incorrect checksum
(expected: 0x298b2ce9; actual: 0x2ecb2ce9). It occurred during a read of page
(1:1054) in database ID 7 at offset 0x0000000083c000 in file 'C:\SQLServer
\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf'. Additional messages in the SQL
Server error log or system event log may provide more detail. This is a
severe error condition that threatens database integrity and must be
corrected immediately. Complete a full database consistency check
(DBCC CHECKDB). This error can be caused by many factors; for more
information, see SQL Server Books Online.
At this point you have been disconnected from the database engine, so reconnect to continue.
USE Test;
DBCC CHECKDB WITH NO_INFOMSGS, TABLERESULTS;
Corruption is reported here.
DROP TABLE temp;
Msg 824, Level 24, State 2, Line 36
SQL Server detected a logical consistency-based I/O error: incorrect checksum
(expected: 0x298b2ce9; actual: 0x2ecb2ce9). It occurred during a read of page
(1:1054) in database ID 7 at offset 0x0000000083c000 in file 'C:\SQLServer
\MSSQL11.MSSQLSERVER\MSSQL\DATA\Test.mdf'. Additional messages in the SQL
Server error log or system event log may provide more detail. This is a
severe error condition that threatens database integrity and must be
corrected immediately. Complete a full database consistency check
(DBCC CHECKDB). This error can be caused by many factors; for more
information, see SQL Server Books Online.
Corruption is reported here, and DROP TABLE fails.
/* assuming ENTERPRISE or DEVELOPER edition of SQL Server,
I can use PAGE='' to restore a single page from backup */
USE Master;
RESTORE DATABASE Test PAGE = '1:1054' FROM DISK = 'Test_db.bak';
BACKUP LOG Test TO DISK = 'Test_log_1.bak';
RESTORE LOG Test FROM DISK = 'Test_log.bak';
RESTORE LOG Test FROM DISK = 'Test_log_1.bak';
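As a follow-up sketch (not part of the test I actually ran), the restored page could be re-checked and the database returned to normal access, since the script above leaves it in SINGLE_USER. This assumes the restore sequence completed and the database is online.

```sql
USE master;
GO
/* no rows returned means CHECKDB found nothing to report */
DBCC CHECKDB ('Test') WITH NO_INFOMSGS, TABLERESULTS;
GO
/* undo the SINGLE_USER setting applied before DBCC WRITEPAGE */
ALTER DATABASE Test SET MULTI_USER;
GO
```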
Edit #2, to add the requested @@VERSION info.
SELECT @@VERSION;
Returns:
Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64)
Oct 19 2012 13:38:57
Copyright (c) Microsoft Corporation
Enterprise Evaluation Edition (64-bit) on Windows NT 6.2 <X64>
(Build 9200: )
I know this is the Evaluation Edition; we have keys for the Enterprise edition and will be doing an edition upgrade shortly.
-T 3609 will preserve tempdb at startup (undocumented, but known).