如何避免在WHERE子句中使用变量


16

给定一个(简化的)存储过程,例如:

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

如果Sale表很大,则SELECT可能要花很长时间才能执行,这显然是因为优化器由于局部变量而无法优化。我们测试了SELECT使用变量运行零件,然后对日期进行了硬编码,执行时间从〜9分钟缩短至〜1秒。

我们有许多存储过程可根据“固定”日期范围(周,月,8周等)进行查询,因此输入参数仅为@endDate,而@startDate是在过程内部计算的。

问题是,避免在WHERE子句中使用变量以免损害优化程序的最佳实践是什么?

下面显示了我们提出的可能性。这些最佳实践中的任何一种还是其他方法?

使用包装程序将变量转换为参数。

参数不会像局部变量那样影响优化器。

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
   DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
   EXECUTE DateRangeProc @startDate, @endDate
END

CREATE PROCEDURE DateRangeProc(@startDate DATE, @endDate DATE)
AS
BEGIN
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

使用参数化的动态SQL。

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  DECLARE @sql NVARCHAR(4000) = N'
    SELECT
      -- Stuff
    FROM Sale
    WHERE SaleDate BETWEEN @startDate AND @endDate
  '
  DECLARE @param NVARCHAR(4000) = N'@startDate DATE, @endDate DATE'
  EXECUTE sp_executesql @sql, @param, @startDate = @startDate, @endDate = @endDate
END

使用“硬编码”动态SQL。

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  DECLARE @sql NVARCHAR(4000) = N'
    SELECT
      -- Stuff
    FROM Sale
    WHERE SaleDate BETWEEN @startDate AND @endDate
  '
  SET @sql = REPLACE(@sql, '@startDate', CONVERT(NCHAR(10), @startDate, 126))
  SET @sql = REPLACE(@sql, '@endDate', CONVERT(NCHAR(10), @endDate, 126))
  EXECUTE sp_executesql @sql
END

DATEADD()直接使用该功能。

我并不热衷于此,因为在WHERE中调用函数也会影响性能。

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN DATEADD(DAY, -6, @endDate) AND @endDate
END

使用可选参数。

我不确定分配参数是否会遇到与分配变量相同的问题,因此这可能不是一个选择。我不是很喜欢这种解决方案,但是为了完整起见,将其包括在内。

CREATE PROCEDURE WeeklyProc(@endDate DATE, @startDate DATE = NULL)
AS
BEGIN
  SET @startDate = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

-更新-

感谢您的建议和评论。阅读它们之后,我对各种方法进行了一些时序测试。我在这里添加结果作为参考。

运行1没有计划。运行2紧随运行1之后,并具有完全相同的参数,因此它将使用运行1中的计划。

NoProc时间用于在存储过程之外的SSMS中手动运行SELECT查询。

TestProc1-7是来自原始问题的查询。

TestProcA-B基于Mikael Eriksson的建议。数据库中的列是DATE,因此我尝试将参数作为DATETIME传递,并使用隐式强制转换(testProcA)和显式强制转换(testProcB)运行。

TestProcC-D基于肯尼斯·费舍尔Kenneth Fisher)的建议。我们已经将日期查询表用于其他用途,但是对于每个期间范围,我们都没有一个带有特定列的表。我尝试的变体仍使用BETWEEN,但在较小的查找表上执行此操作,然后加入较大的表。我将进一步调查我们是否可以使用特定的查找表,尽管我们的周期是固定的,但有很多不同的周期。

    销售表中的总行数:136,424,366

                       运行1(毫秒)运行2(毫秒)
    步骤CPU占用时间CPU占用时间注释
    NoProc常数6567 62199 2870 719手动查询常数
    NoProc变量9314 62424 3993 998手动查询变量
    testProc1 6801 62919 2871 736硬编码范围
    testProc2 8955 63190 3915 979参数和变量范围
    testProc3 8985 63152 3932 987参数范围的包装程序
    testProc4 9142 63939 3931 977参数化动态SQL
    testProc5 7269 62933 2933 728硬编码动态SQL
    testProc6 9266 63421 3915 984在DATE使用DATEADD
    testProc7 2044 13950 1092 1087虚拟参数
    testProcA 12120 61493 5491 1875在DATETIME上使用DATEADD而不使用CAST
    testProcB 8612 61949 3932 978在DATETIME和CAST上使用DATEADD
    testProcC 8861 61651 3917 993使用查询表,首先出售
    testProcD 8625 61740 3994 1031使用查询表,售完

这是测试代码。

------ SETUP ------

IF OBJECT_ID(N'testDimDate', N'U') IS NOT NULL DROP TABLE testDimDate
IF OBJECT_ID(N'testProc1', N'P') IS NOT NULL DROP PROCEDURE testProc1
IF OBJECT_ID(N'testProc2', N'P') IS NOT NULL DROP PROCEDURE testProc2
IF OBJECT_ID(N'testProc3', N'P') IS NOT NULL DROP PROCEDURE testProc3
IF OBJECT_ID(N'testProc3a', N'P') IS NOT NULL DROP PROCEDURE testProc3a
IF OBJECT_ID(N'testProc4', N'P') IS NOT NULL DROP PROCEDURE testProc4
IF OBJECT_ID(N'testProc5', N'P') IS NOT NULL DROP PROCEDURE testProc5
IF OBJECT_ID(N'testProc6', N'P') IS NOT NULL DROP PROCEDURE testProc6
IF OBJECT_ID(N'testProc7', N'P') IS NOT NULL DROP PROCEDURE testProc7
IF OBJECT_ID(N'testProcA', N'P') IS NOT NULL DROP PROCEDURE testProcA
IF OBJECT_ID(N'testProcB', N'P') IS NOT NULL DROP PROCEDURE testProcB
IF OBJECT_ID(N'testProcC', N'P') IS NOT NULL DROP PROCEDURE testProcC
IF OBJECT_ID(N'testProcD', N'P') IS NOT NULL DROP PROCEDURE testProcD
GO

CREATE TABLE testDimDate
(
   DateKey DATE NOT NULL,
   CONSTRAINT PK_DimDate_DateKey UNIQUE NONCLUSTERED (DateKey ASC)
)
GO

DECLARE @dateTimeStart DATETIME = '2000-01-01'
DECLARE @dateTimeEnd DATETIME = '2100-01-01'
;WITH CTE AS
(
   --Anchor member defined
   SELECT @dateTimeStart FullDate
   UNION ALL
   --Recursive member defined referencing CTE
   SELECT FullDate + 1 FROM CTE WHERE FullDate + 1 <= @dateTimeEnd
)
SELECT
   CAST(FullDate AS DATE) AS DateKey
INTO #DimDate
FROM CTE
OPTION (MAXRECURSION 0)

INSERT INTO testDimDate (DateKey)
SELECT DateKey FROM #DimDate ORDER BY DateKey ASC

DROP TABLE #DimDate
GO

-- Hard coded date range.
CREATE PROCEDURE testProc1 AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN '2012-12-09' AND '2012-12-10'
END
GO

-- Parameter and variable date range.
CREATE PROCEDURE testProc2(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
END
GO

-- Parameter date range.
CREATE PROCEDURE testProc3a(@startDate DATE, @endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
END
GO

-- Wrapper procedure.
CREATE PROCEDURE testProc3(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   EXEC testProc3a @startDate, @endDate
END
GO

-- Parameterized dynamic SQL.
CREATE PROCEDURE testProc4(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   DECLARE @sql NVARCHAR(4000) = N'SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate'
   DECLARE @param NVARCHAR(4000) = N'@startDate DATE, @endDate DATE'
   EXEC sp_executesql @sql, @param, @startDate = @startDate, @endDate = @endDate
END
GO

-- Hard coded dynamic SQL.
CREATE PROCEDURE testProc5(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   DECLARE @sql NVARCHAR(4000) = N'SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN ''@startDate'' AND ''@endDate'''
   SET @sql = REPLACE(@sql, '@startDate', CONVERT(NCHAR(10), @startDate, 126))
   SET @sql = REPLACE(@sql, '@endDate', CONVERT(NCHAR(10), @endDate, 126))
   EXEC sp_executesql @sql
END
GO

-- Explicitly use DATEADD on a DATE.
CREATE PROCEDURE testProc6(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN DATEADD(DAY, -1, @endDate) AND @endDate
END
GO

-- Dummy parameter.
CREATE PROCEDURE testProc7(@endDate DATE, @startDate DATE = NULL) AS
BEGIN
   SET NOCOUNT ON
   SET @startDate = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
END
GO

-- Explicitly use DATEADD on a DATETIME with implicit CAST for comparison with SaleDate.
-- Based on the answer from Mikael Eriksson.
CREATE PROCEDURE testProcA(@endDateTime DATETIME) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN DATEADD(DAY, -1, @endDateTime) AND @endDateTime
END
GO

-- Explicitly use DATEADD on a DATETIME but CAST to DATE for comparison with SaleDate.
-- Based on the answer from Mikael Eriksson.
CREATE PROCEDURE testProcB(@endDateTime DATETIME) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN CAST(DATEADD(DAY, -1, @endDateTime) AS DATE) AND CAST(@endDateTime AS DATE)
END
GO

-- Use a date lookup table, Sale first.
-- Based on the answer from Kenneth Fisher.
CREATE PROCEDURE testProcC(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM Sale J INNER JOIN testDimDate D ON D.DateKey = J.SaleDate WHERE D.DateKey BETWEEN @startDate AND @endDate
END
GO

-- Use a date lookup table, Sale last.
-- Based on the answer from Kenneth Fisher.
CREATE PROCEDURE testProcD(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM testDimDate D INNER JOIN Sale J ON J.SaleDate = D.DateKey WHERE D.DateKey BETWEEN @startDate AND @endDate
END
GO

------ TEST ------

SET STATISTICS TIME OFF

DECLARE @endDate DATE = '2012-12-10'
DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)

DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

RAISERROR('Run 1: NoProc with constants', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN '2012-12-09' AND '2012-12-10'
SET STATISTICS TIME OFF

RAISERROR('Run 2: NoProc with constants', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN '2012-12-09' AND '2012-12-10'
SET STATISTICS TIME OFF

DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

RAISERROR('Run 1: NoProc with variables', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
SET STATISTICS TIME OFF

RAISERROR('Run 2: NoProc with variables', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
SET STATISTICS TIME OFF

DECLARE @sql NVARCHAR(4000)

DECLARE _cursor CURSOR LOCAL FAST_FORWARD FOR
   SELECT
      procedures.name,
      procedures.object_id
   FROM sys.procedures
   WHERE procedures.name LIKE 'testProc_'
   ORDER BY procedures.name ASC

OPEN _cursor

DECLARE @name SYSNAME
DECLARE @object_id INT

FETCH NEXT FROM _cursor INTO @name, @object_id
WHILE @@FETCH_STATUS = 0
BEGIN
   SET @sql = CASE (SELECT COUNT(*) FROM sys.parameters WHERE object_id = @object_id)
      WHEN 0 THEN @name
      WHEN 1 THEN @name + ' ''@endDate'''
      WHEN 2 THEN @name + ' ''@startDate'', ''@endDate'''
   END

   SET @sql = REPLACE(@sql, '@name', @name)
   SET @sql = REPLACE(@sql, '@startDate', CONVERT(NVARCHAR(10), @startDate, 126))
   SET @sql = REPLACE(@sql, '@endDate', CONVERT(NVARCHAR(10), @endDate, 126))

   DBCC FREEPROCCACHE WITH NO_INFOMSGS
   DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

   RAISERROR('Run 1: %s', 0, 0, @sql) WITH NOWAIT
   SET STATISTICS TIME ON
   EXEC sp_executesql @sql
   SET STATISTICS TIME OFF

   RAISERROR('Run 2: %s', 0, 0, @sql) WITH NOWAIT
   SET STATISTICS TIME ON
   EXEC sp_executesql @sql
   SET STATISTICS TIME OFF

   FETCH NEXT FROM _cursor INTO @name, @object_id
END

CLOSE _cursor
DEALLOCATE _cursor

Answers:


9

参数嗅探几乎一直都是您的朋友,您应该编写查询以便可以使用它。参数嗅探可使用编译查询时可用的参数值帮助您为您制定计划。参数嗅探的阴暗面是当编译查询时使用的值对于以后的查询不是最佳的。

存储过程中的查询是在执行存储过程时而不是在执行查询时编译的,因此SQL Server必须在此处处理的值...

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

是的已知值@endDate和的未知值@startDate。这样一来,SQL Server便可以根据筛选器根据@startDate统计信息得出的结果,猜测为筛选器返回的行的30%@endDate。如果您的大型表有很多行,可以为您提供扫描操作,那么您将从搜索中受益最多。

您的包装过程解决方案确保SQL Server在DateRangeProc编译时可以看到这些值,以便它可以为@endDate和使用已知值@startDate

您的两个动态查询都导致相同的结果,这些值在编译时已知。

默认值为空的那个有点特殊。在编译时已知的SQL Server的值是一个已知的价值@endDatenull@startDatenull在之间使用a 将给您0行,但是在这种情况下,SQL Server总是猜测为1。在这种情况下,这可能是一件好事,但如果您以较大的日期间隔调用存储过程,而扫描本来是最好的选择,那么它可能最终会进行大量查找。

我将“直接使用DATEADD()函数”留在此答案的末尾,因为它是我要使用的那个,并且它也有一些奇怪之处。

首先,SQL Server在where子句中使用该函数时不会多次调用该函数。DATEADD被视为运行时常量

而且我认为DATEADD在编译查询时会对它进行评估,这样您就可以很好地估计返回的行数。但这不是这种情况。
无论您做什么DATEADD(在SQL Server 2012上测试),SQL Server 都会根据参数中的值进行估算,因此,在您的情况下,估算值将为上注册的行数@endDate。我不知道为什么这么做,但与数据类型的使用有关DATEDATETIME在存储过程和表中移至,估计将是准确的,这意味着DATEADD在编译时将for视为DATETIMEnot DATE

因此,总结这个相当冗长的答案,我将推荐包装程序解决方案。它将始终允许SQL Server在编译查询时使用提供的值,而无需使用动态SQL。

PS:

在评论中,您有两个建议。

OPTION (OPTIMIZE FOR UNKNOWN)OPTION (RECOMPILE)由于每次都会重新编译查询,因此您将获得大约9%的返回行的估计值,并使SQL Server看到参数值。


3

好的,我有两种可能的解决方案。

首先,我想知道这是否可以增加参数设置。我还没有机会进行测试,但它可能有效。

CREATE PROCEDURE WeeklyProc(@endDate DATE, @startDate DATE)
AS
BEGIN
  IF @startDate IS NULL
    SET @startDate = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

另一种选择是利用您使用固定时间范围的事实。首先创建一个DateLookup表。像这样

CurrentDate    8WeekStartDate    8WeekEndDate    etc

填写从现在到下个世纪的每个日期。这只有约36500行,因此是一个相当小的表。然后像这样更改您的查询

IF @Range = '8WeekRange' 
    SELECT
      -- Stuff
    FROM Sale
    JOIN DateLookup
        ON SaleDate BETWEEN [8WeekStartDate] AND [8WeekEndDate]
    WHERE DateLookup.CurrentDate = GetDate()

显然,这只是一个例子,当然可以写得更好,但是我对这种类型的表有很多运气。特别是因为它是静态表并且可以像疯狂一样被索引。

By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.