搜寻,然后您应扫描…在分区表上


22

我已经在Itzik Ben-Gan的 PCMag中阅读了这些文章:

搜寻并您应扫描第一部分:当优化程序未优化
搜寻时,您应扫描第二部分:升序键

我目前所有分区表都遇到“最大分组”问题。我们使用Itzik Ben-Gan提供的技巧来获取max(ID),但有时它无法运行:

DECLARE @MaxIDPartitionTable BIGINT
SELECT  @MaxIDPartitionTable = ISNULL(MAX(IDPartitionedTable), 0)
FROM    ( SELECT    *
          FROM      ( SELECT    partition_number PartitionNumber
                      FROM      sys.partitions
                      WHERE     object_id = OBJECT_ID('fct.MyTable')
                                AND index_id = 1
                    ) T1
                    CROSS APPLY ( SELECT    ISNULL(MAX(UpdatedID), 0) AS IDPartitionedTable
                                  FROM      fct.MyTable s
                                  WHERE     $PARTITION.PF_MyTable(s.PCTimeStamp) = PartitionNumber
                                            AND UpdatedID <= @IDColumnThresholdValue
                                ) AS o
        ) AS T2;
SELECT @MaxIDPartitionTable 

我有这个计划

在此处输入图片说明

但是45分钟后,请看一下阅读内容

reads          writes   physical_reads
12,949,127        2       12,992,610

我摆脱了sp_whoisactive

通常,它运行非常快,但不是今天。

编辑:具有分区的表结构:

CREATE PARTITION FUNCTION [MonthlySmallDateTime](SmallDateTime) AS RANGE RIGHT FOR VALUES (N'2000-01-01T00:00:00.000', N'2000-02-01T00:00:00.000' /* and many more */)
go
CREATE PARTITION SCHEME PS_FctContractualAvailability AS PARTITION [MonthlySmallDateTime] TO ([Standard], [Standard])
GO
CREATE TABLE fct.MyTable(
    MyTableID BIGINT IDENTITY(1,1),
    [DT1TurbineID] INT NOT NULL,
    [PCTimeStamp] SMALLDATETIME NOT NULL,
    Filler CHAR(100) NOT NULL DEFAULT 'N/A',
    UpdatedID BIGINT NULL,
    UpdatedDate DATETIME NULL
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED 
(
    [DT1TurbineID] ASC,
    [PCTimeStamp] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = ROW) ON [PS_FctContractualAvailability]([PCTimeStamp])
) ON [PS_FctContractualAvailability]([PCTimeStamp])

GO

CREATE UNIQUE NONCLUSTERED INDEX [IX_UpdatedID_PCTimeStamp] ON [fct].MyTable
(
    [UpdatedID] ASC,
    [PCTimeStamp] ASC
)
INCLUDE (   [UpdatedDate]) 
WHERE ([UpdatedID] IS NOT NULL)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, DATA_COMPRESSION = ROW) ON [PS_FctContractualAvailability]([PCTimeStamp])
GO

Answers:


28

基本的问题是,索引搜索后没有Top运算符。这是一种优化,通常在搜索以正确的顺序返回行以进行MIN\MAX聚合时引入。

此优化利用了以下事实:最小/最大行是升序或降序的第一行。优化器也可能无法将这种优化应用于分区表。我忘了。

无论如何,关键是没有这种转换,执行计划最终将处理每个S.UpdatedID <= @IDColumnThresholdValue分区合格的每一行,而不是每个分区所需的一行。

您尚未在问题中提供表,索引或分区定义,因此我无法更具体地说明。您应该检查索引是否支持这种转换。同样,您也可以将表示MAXTOP (1) ... ORDER BY UpdatedID DESC

如果这导致排序(包括TopN Sort),则您知道索引没有帮助。例如:

SELECT
    @MaxIDPartitionTable = ISNULL(MAX(T2.IDPartitionedTable), 0)
FROM    
( 
    SELECT
        O.IDPartitionedTable
    FROM      
    ( 
        SELECT
            P.partition_number AS PartitionNumber
        FROM sys.partitions AS P
        WHERE 
            P.[object_id] = OBJECT_ID(N'fct.MyTable', N'U')
            AND P.index_id = 1
    ) AS T1
    CROSS APPLY 
    (    
        SELECT TOP (1) 
            S.UpdatedID AS IDPartitionedTable
        FROM fct.MyTable AS S
        WHERE
            $PARTITION.PF_MyTable(S.PCTimeStamp) = T1.PartitionNumber
            AND S.UpdatedID <= @IDColumnThresholdValue
        ORDER BY
            S.UpdatedID DESC
    ) AS O
) AS T2;

这应该产生的计划形状是:

所需平面形状

注意索引搜索下面的顶部。这将处理限制为每个分区一行。

或者,使用临时表保存分区号:

CREATE TABLE #Partitions
(
    partition_number integer PRIMARY KEY CLUSTERED
);

INSERT #Partitions
    (partition_number)
SELECT
    P.partition_number AS PartitionNumber
FROM sys.partitions AS P
WHERE 
    P.[object_id] = OBJECT_ID(N'fct.MyTable', N'U')
    AND P.index_id = 1;

SELECT
    @MaxIDPartitionTable = ISNULL(MAX(T2.UpdatedID), 0)
FROM #Partitions AS P
CROSS APPLY 
(
    SELECT TOP (1) 
        S.UpdatedID
    FROM fct.MyTable AS S
    WHERE
        $PARTITION.PF_MyTable(S.PCTimeStamp) = P.partition_number
        AND S.UpdatedID <= @IDColumnThresholdValue
    ORDER BY
        S.UpdatedID DESC
) AS T2;

DROP TABLE #Partitions;

注意事项:在查询中访问系统表可防止并行性。如果这很重要,请考虑在临时表中具体化分区号,然后再APPLY从中实现。并行处理通常对这种模式(正确的索引编制)没有帮助,但是我不提它就更不用说了。

旁注2:有一个活动的Connect项,要求对MIN\MAX聚合和分区对象的Top 内置支持。

By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.