在连续八周无错误之后,过去几天我们三度遇到了这个奇怪的错误,我很沮丧。
这是错误消息:
Executing the query "EXEC dbo.MergeTransactions" failed with the following error: "Cannot insert duplicate key row in object 'sales.Transactions' with unique index 'NCI_Transactions_ClientID_TransactionDate'. The duplicate key value is (1001, 2018-12-14 19:16:29.00, 304050920).".
我们拥有的索引不是唯一的。如果您注意到,错误消息中的重复键值甚至不会与索引对齐。奇怪的是,如果我重新运行proc,它将成功。
这是我发现的最新链接,但有我的问题,但没有解决方案。
关于我的场景的几件事:
- proc正在更新TransactionID(主键的一部分)-我认为这是导致错误的原因,但不知道为什么?我们将删除该逻辑。
- 在表上启用了更改跟踪
- 进行事务读取未提交
每个表有45个字段,我主要列出索引中使用的字段。我在更新语句中(不必要)更新TransactionID(集群键)。奇怪的是,直到上周,我们几个月都没有遇到任何问题。而且这只是通过SSIS偶尔发生的。
表
USE [DB]
GO
/****** Object: Table [sales].[Transactions] Script Date: 5/29/2019 1:37:49 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND type in (N'U'))
BEGIN
CREATE TABLE [sales].[Transactions]
(
[TransactionID] [bigint] NOT NULL,
[ClientID] [int] NOT NULL,
[TransactionDate] [datetime2](2) NOT NULL,
/* snip*/
[BusinessUserID] [varchar](150) NOT NULL,
[BusinessTransactionID] [varchar](150) NOT NULL,
[InsertDate] [datetime2](2) NOT NULL,
[UpdateDate] [datetime2](2) NOT NULL,
CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED
(
[TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [DB_Data]
) ON [DB_Data]
END
GO
USE [DB]
IF NOT EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[sales].[Transactions]') AND name = N'NCI_Transactions_ClientID_TransactionDate')
begin
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
[ClientID] ASC,
[TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE) ON [DB_Data]
END
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_Units]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_Units] DEFAULT ((0)) FOR [Units]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_ISOCurrencyCode]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_ISOCurrencyCode] DEFAULT ('USD') FOR [ISOCurrencyCode]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_InsertDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_InsertDate] DEFAULT (sysdatetime()) FOR [InsertDate]
END
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[sales].[DF_Transactions_UpdateDate]') AND type = 'D')
BEGIN
ALTER TABLE [sales].[Transactions] ADD CONSTRAINT [DF_Transactions_UpdateDate] DEFAULT (sysdatetime()) FOR [UpdateDate]
END
GO
临时表
same columns as the mgdata. including the relevant fields. Also has a non-unique clustered index
(
[BusinessTransactionID] [varchar](150) NULL,
[BusinessUserID] [varchar](150) NULL,
[PostalCode] [varchar](25) NULL,
[TransactionDate] [datetime2](2) NULL,
[Units] [int] NOT NULL,
[StartDate] [datetime2](2) NULL,
[EndDate] [datetime2](2) NULL,
[TransactionID] [bigint] NULL,
[ClientID] [int] NULL,
)
CREATE CLUSTERED INDEX ##workingTransactionsMG_idx ON #workingTransactions (TransactionID)
It is populated in batches (500k rows at a time), something like this
IF OBJECT_ID(N'tempdb.dbo.#workingTransactions') IS NOT NULL DROP TABLE #workingTransactions;
select fields
into #workingTransactions
from import.Transactions
where importrowid between two number ranges -- pseudocode
首要的关键
CONSTRAINT [PK_Transactions_TransactionID] PRIMARY KEY CLUSTERED
(
[TransactionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION=PAGE) ON [Data]
) ON [Data]
非聚集索引
CREATE NONCLUSTERED INDEX [NCI_Transactions_ClientID_TransactionDate] ON [sales].[Transactions]
(
[ClientID] ASC,
[TransactionDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = PAGE)
样本更新声明
-- updates every field
update t
set
t.transactionid = s.transactionid,
t.[CityCode]=s.[CityCode],
t.TransactionDate=s.[TransactionDate],
t.[ClientID]=s.[ClientID],
t.[PackageMonths] = s.[PackageMonths],
t.UpdateDate = @UpdateDate
FROM #workingTransactions s
JOIN [DB].[sales].[Transactions] t
ON s.[TransactionID] = t.[TransactionID]
WHERE CAST(HASHBYTES('SHA2_256 ',CONCAT( S.[BusinessTransactionID],'|',S.[BusinessUserID],'|', etc)
<> CAST(HASHBYTES('SHA2_256 ',CONCAT( T.[BusinessTransactionID],'|',T.[BusinessUserID],'|', etc)
我的问题是,到底发生了什么?解决方案是什么?供参考,上面的链接提到了这一点:
在这一点上,我有一些理论:
- 该错误与内存压力或较大的并行更新计划有关,但我预计会出现其他类型的错误,到目前为止,我无法将这些孤立的零星错误的时间框架与资源不足相关联。
- UPDATE语句或数据中的错误导致对主键的实际重复违反,但是导致一些晦涩的SQL Server错误导致错误消息引用了错误的索引名称。
- 读取未提交隔离导致的脏读取导致对双插入的大型并行更新。但是ETL开发人员声称使用了默认的读提交,并且很难确切确定进程在运行时实际使用的隔离级别。
我怀疑如果我将执行计划作为一种变通办法来调整,也许是MAXDOP(1)提示或使用会话跟踪标志来禁用假脱机操作,那么错误将消失,但是尚不清楚这将如何影响性能。
版
Microsoft SQL Server 2017(RTM-CU13)(KB4466404)-14.0.3048.4(X64)2018年11月30日Windows Server 2016 Standard 10.0(Build 14393)版权所有(C)2017 Microsoft Corporation Enterprise Edition(64位) :)