Consider the following two functions:
ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C)
ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C)
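As far as I understand, they produce exactly the same result. In other words, the order in which the columns are listed in the PARTITION BY clause doesn't matter.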
If there is an index on (A,B,C), I expected the optimizer to use this index in both variants.
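But, surprisingly, the optimizer decided to do an extra explicit Sort in the second variant.
I have seen this on SQL Server 2008 Standard and SQL Server 2014 Express.
Here is the full script that I used to reproduce it.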
Tried it on
Microsoft SQL Server 2014 - 12.0.2000.8 (X64) Feb 20 2014 20:04:26 Copyright (c) Microsoft Corporation Express Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)
and
Microsoft SQL Server 2014 (SP1-CU7) (KB3162659) - 12.0.4459.0 (X64) May 27 2016 15:33:17 Copyright (c) Microsoft Corporation Express Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)
with both the old and the new cardinality estimator, by using OPTION (QUERYTRACEON 9481) and OPTION (QUERYTRACEON 2312).
Set up the table, index, and sample data
CREATE TABLE [dbo].[T](
[ID] [int] IDENTITY(1,1) NOT NULL,
[A] [int] NOT NULL,
[B] [int] NOT NULL,
[C] [int] NOT NULL,
CONSTRAINT [PK_T] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_ABC] ON [dbo].[T]
(
[A] ASC,
[B] ASC,
[C] ASC
)WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON)
GO
INSERT INTO [dbo].[T] ([A],[B],[C]) VALUES
(10, 20, 30),
(10, 21, 31),
(10, 21, 32),
(10, 21, 33),
(11, 20, 34),
(11, 21, 35),
(11, 21, 36),
(12, 20, 37),
(12, 21, 38),
(13, 21, 39);
Queries
SELECT -- AB
ID,A,B,C
,ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C) AS rnAB
FROM T
ORDER BY C
OPTION(RECOMPILE);
SELECT -- BA
ID,A,B,C
,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
FROM T
ORDER BY C
OPTION(RECOMPILE);
SELECT -- both
ID,A,B,C
,ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C) AS rnAB
,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
FROM T
ORDER BY C
OPTION(RECOMPILE);
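For anyone repeating the test under both cardinality estimators, the trace flag hints can be combined with RECOMPILE in a single OPTION clause. The following is only a sketch of how the second query might be run each way. It assumes SQL Server 2014 or later (where both estimators exist) and a login that is allowed to use QUERYTRACEON, which normally requires sysadmin membership.

```sql
-- Legacy cardinality estimator (trace flag 9481)
SELECT
    ID, A, B, C
    ,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
FROM dbo.T
ORDER BY C
OPTION (RECOMPILE, QUERYTRACEON 9481);

-- New cardinality estimator (trace flag 2312)
SELECT
    ID, A, B, C
    ,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
FROM dbo.T
ORDER BY C
OPTION (RECOMPILE, QUERYTRACEON 2312);
```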
Execution plans
PARTITION BY A,B
PARTITION BY B,A
Both
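As you can see, the second plan has an extra Sort. It orders by B,A,C. The optimizer, apparently, is not smart enough to realise that PARTITION BY B,A is the same as PARTITION BY A,B, so it re-sorts the data.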
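The claim that the two partitionings give the same numbering can be sanity-checked on the sample data. This is a minimal sketch, not a proof: it relies on C being unique within each (A,B) group, as it is in the sample rows. With ties in C, ROW_NUMBER is not deterministic and the two columns could legitimately differ. The alias X and the column name MismatchedRows are made up for the illustration.

```sql
-- Count rows where the two numberings disagree.
-- For the sample data (C is unique) this should return 0.
SELECT COUNT(*) AS MismatchedRows
FROM
(
    SELECT
        ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C) AS rnAB
        ,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
    FROM dbo.T
) AS X
WHERE X.rnAB <> X.rnBA;
```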
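Interestingly, the third query has both variants of ROW_NUMBER in it and there is no extra Sort! The plan is the same as for the first query. (The Sequence Project has an extra expression in its Output List for the extra column, but there is no extra Sort.) So, in this more complicated case the optimizer appears to be smart enough to realise that PARTITION BY B,A is the same as PARTITION BY A,B.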
In the first and third queries the Index Scan operator has the property Ordered: True; in the second query it is False.
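To compare the Ordered property without opening the graphical plans, the actual plans can be returned as XML. A sketch, assuming the table from the setup script: SET STATISTICS XML ON makes each statement return its actual plan as an extra result set, and the Ordered attribute can then be looked up on the IndexScan element in that XML.

```sql
SET STATISTICS XML ON;
GO

-- First variant: look for the Sort operator and the Ordered attribute in the returned plan
SELECT
    ID, A, B, C
    ,ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C) AS rnAB
FROM dbo.T
ORDER BY C
OPTION (RECOMPILE);

-- Second variant: compare its plan XML with the one above
SELECT
    ID, A, B, C
    ,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
FROM dbo.T
ORDER BY C
OPTION (RECOMPILE);
GO

SET STATISTICS XML OFF;
GO
```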
Even more interestingly, if I rewrite the third query like this (swapping the two columns):
SELECT -- both
ID,A,B,C
,ROW_NUMBER() OVER (PARTITION BY B,A ORDER BY C) AS rnBA
,ROW_NUMBER() OVER (PARTITION BY A,B ORDER BY C) AS rnAB
FROM T
ORDER BY C
OPTION(RECOMPILE);
then the extra Sort appears again!
Could somebody shed some light on this? What is going on in the optimizer here?