查询执行非常缓慢,是否有任何方法可以进一步改善它?


9

我有以下查询,由于有很多SUM函数调用,我的查询运行太慢。我的数据库中有很多记录,我希望获得当年和去年(过去30天,过去90天和过去365天)的报告:

SELECT 
    b.id as [ID]
    ,d.[Title] as [Title]
    ,e.Class as [Class]

    ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Current - Last 30 Days Col1]
    ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Current - Last 30 Days Col2]

    ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Current - Last 90 Days Col1]
    ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Current - Last 90 Days Col2]

    ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Current - Last 365 Days Col1]
    ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Current - Last 365 Days Col2]

    ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-13,GETDATE()) and a.DateCol <= DATEADD(MONTH,-12,GETDATE()) THEN a.col1 ELSE 0 END) as [Last year - Last 30 Days Col1]
    ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-13,GETDATE()) and a.DateCol <= DATEADD(MONTH,-12,GETDATE()) THEN a.col2 ELSE 0 END) as [Last year - Last 30 Days Col2]

    ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-5,GETDATE()) and a.DateCol <= DATEADD(QUARTER,-4,GETDATE()) THEN a.col1 ELSE 0 END) as [Last year - Last 90 Days Col1]
    ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-5,GETDATE()) and a.DateCol <= DATEADD(QUARTER,-4,GETDATE()) THEN a.col2 ELSE 0 END) as [Last year - Last 90 Days Col2]

    ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-2,GETDATE()) and a.DateCol <= DATEADD(YEAR,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Last year - Last 365 Days Col1]
    ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-2,GETDATE()) and a.DateCol <= DATEADD(YEAR,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Last year - Last 365 Days Col2]


    FROM 
    tb1 a
INNER JOIN 
    tb2 b on a.id=b.fid and a.col3 = b.col4
INNER JOIN 
    tb3 c on b.fid = c.col5
INNER JOIN       
    tb4 d on c.id = d.col6
INNER JOIN 
    tb5 e on c.col7 = e.id
GROUP BY
    b.id, d.Title, e.Class

有谁知道如何改善查询以使其运行更快?

编辑:鼓励我将DATEADD函数调用移至该where语句,并先加载前两年,然后将其过滤在列中,但我不确定建议的答案是否已执行且有效,可以在这里找到:https:// stackoverflow。 com / a / 59944426/12536284

如果您同意上述解决方案,请告诉我如何在当前查询中应用它?

仅供参考,我在C#,实体框架(DB-First)中使用此SP,如下所示:

var result = MyDBEntities.CalculatorSP();

4
告诉我们您的执行计划...
Dale K

1
启用任何功能 -是什么会使查询变慢
Fabio


2
再次,请发布执行计划。
SQL警察

2
仍然我们没有看到Execution Plan。请张贴它
Arun Palanisamy

Answers:


10

如前所述,执行计划在这种情况下将非常有用。根据显示的内容,您似乎已从中提取了12列,总共15列tb1 (a),因此您可以尝试不进行任何联接而直接对tb1进行查询,以查看查询是否按预期工作。因为我看不到您的SUM函数调用有问题,所以我最好的猜测是您的联接有问题,我建议您执行以下操作。您可以从排除最后一个联接开始,例如, INNER JOIN tb5 e on c.col7 = e.id它的任何相关用法e.Class as [Class]以及e.Class在您的按声明分组中。我们不会完全排除它,这只是一个测试,以确保问题是否存在,如果您的查询运行得更好,并且按预期的方式,您可以尝试使用临时表作为替代方法,而不是最后一次联接,如下所示:

SELECT *
INTO #Temp
FROM
  (
     select * from tb5
  ) As tempTable;

SELECT 
    b.id as [ID]
    ,d.[Title] as [Title]
    ,e.Class as [Class]

    -- SUM Functions

FROM 
    tb1 a
INNER JOIN 
    tb2 b on a.id=b.fid and a.col3 = b.col4
INNER JOIN 
    tb3 c on b.fid = c.col5
INNER JOIN       
    tb4 d on c.id = d.col6
INNER JOIN 
    #Temp e on c.col7 = e.id
GROUP BY
    b.id, d.Title, e.Class

实际上,临时表是SQL Server上临时存在的表。临时表对于存储多次访问的即时结果集很有用。您可以在这里https://www.sqlservertutorial.net/sql-server-basics/sql-server-temporary-tables/了解更多信息, 并在这里https://codingsight.com/introduction-to-temporary-tables-in -sql服务器/

此外,我强烈建议,如果你正在使用存储过程,设置NOCOUNTON,它也可以提供一个显著的性能提升,因为网络流量大大降低:

SET NOCOUNT ON
SELECT *
INTO #Temp
-- The rest of code

基于

SET NOCOUNT ON是set语句,可防止显示受T-SQL查询语句影响的行数的消息。这在存储过程和触发器中使用,以避免显示受影响的行消息。在存储过程中使用SET NOCOUNT ON可以显着提高存储过程的性能。


1
您能解释一下为什么将整个副本复制tb5#Temp表并加入temp表比tb5直接加入要快吗?确实它们包含相同的数据(#Temp如果存在于中,则可能会丢失索引tb5)。我真的不明白为什么这样做会更有效(因为我所知道的所有数据复制联接的效率都应该较低)。
zig

2
@zig在这种情况下,您是正确的,但是如果该tb5地址位于另一台服务器上呢?在这种情况下,使用临时表绝对比直接连接到另一台服务器要快。那只是测试的建议,看看是否有任何更改我过去有过类似的情况,而且幸运的是,临时表在这种情况下也对OP有所帮助。
Salah Akbari

2

最好的方法是插入到表变量/哈希表中(如果行数很小,则使用表变量;如果行数很大,则使用哈希表)。然后更新聚合,然后最后从表变量或哈希表中进行选择。调查查询计划是必要的。

DECLARE @MYTABLE TABLE (ID INT, [Title] VARCHAR(500), [Class] VARCHAR(500),
[Current - Last 30 Days Col1] INT, [Current - Last 30 Days Col2] INT,
[Current - Last 90 Days Col1] INT,[Current - Last 90 Days Col2] INT,
[Current - Last 365 Days Col1] INT, [Current - Last 365 Days Col2] INT,
[Last year - Last 30 Days Col1] INT, [Last year - Last 30 Days Col2] INT,
[Last year - Last 90 Days Col1] INT, [Last year - Last 90 Days Col2] INT,
[Last year - Last 365 Days Col1] INT, [Last year - Last 365 Days Col2] INT)



INSERT INTO @MYTABLE(ID, [Title],[Class], 
[Current - Last 30 Days Col1], [Current - Last 30 Days Col2],
[Current - Last 90 Days Col1], [Current - Last 90 Days Col2],
[Current - Last 365 Days Col1], [Current - Last 365 Days Col2],
[Last year - Last 30 Days Col1], [Last year - Last 30 Days Col2],
[Last year - Last 90 Days Col1], [Last year - Last 90 Days Col2],
[Last year - Last 365 Days Col1], [Last year - Last 365 Days Col2]
  )
SELECT    b.id  ,d.[Title] ,e.Class ,0,0,0,0,0,0,0,0,0,0,0,0        
FROM     tb1 a
INNER JOIN   tb2 b on a.id=b.fid and a.col3 = b.col4
INNER JOIN   tb3 c on b.fid = c.col5
INNER JOIN   tb4 d on c.id = d.col6
INNER JOIN  tb5 e on c.col7 = e.id
GROUP BY b.id, d.Title, e.Class

UPDATE T 
SET [Current - Last 30 Days Col1]=K.[Current - Last 30 Days Col1] , 
[Current - Last 30 Days Col2]    =K.[Current - Last 30 Days Col2],
[Current - Last 90 Days Col1]    = K.[Current - Last 90 Days Col1], 
[Current - Last 90 Days Col2]    =K.[Current - Last 90 Days Col2] ,
[Current - Last 365 Days Col1]   =K.[Current - Last 365 Days Col1], 
[Current - Last 365 Days Col2]   =K.[Current - Last 365 Days Col2],
[Last year - Last 30 Days Col1]  =K.[Last year - Last 30 Days Col1],
 [Last year - Last 30 Days Col2] =K.[Last year - Last 30 Days Col2],
[Last year - Last 90 Days Col1]  =K.[Last year - Last 90 Days Col1], 
[Last year - Last 90 Days Col2]  =K.[Last year - Last 90 Days Col2],
[Last year - Last 365 Days Col1] =K.[Last year - Last 365 Days Col1],
 [Last year - Last 365 Days Col2]=K.[Last year - Last 365 Days Col2]
    FROM @MYTABLE T JOIN 
     (
SELECT 
    b.id as [ID]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-1,GETDATE()) THEN a.col1 ELSE 0 END),0) as [Current - Last 30 Days Col1]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-1,GETDATE()) THEN a.col2 ELSE 0 END),0) as [Current - Last 30 Days Col2]

    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-1,GETDATE()) THEN a.col1 ELSE 0 END),0) as [Current - Last 90 Days Col1]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-1,GETDATE()) THEN a.col2 ELSE 0 END),0) as [Current - Last 90 Days Col2]

    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-1,GETDATE()) THEN a.col1 ELSE 0 END),0) as [Current - Last 365 Days Col1]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-1,GETDATE()) THEN a.col2 ELSE 0 END),0) as [Current - Last 365 Days Col2]

    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-13,GETDATE()) and a.DateCol <= DATEADD(MONTH,-12,GETDATE()) THEN a.col1 ELSE 0 END),0) as [Last year - Last 30 Days Col1]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-13,GETDATE()) and a.DateCol <= DATEADD(MONTH,-12,GETDATE()) THEN a.col2 ELSE 0 END),0) as [Last year - Last 30 Days Col2]

    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-5,GETDATE()) and a.DateCol <= DATEADD(QUARTER,-4,GETDATE()) THEN a.col1 ELSE 0 END),0) as [Last year - Last 90 Days Col1]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-5,GETDATE()) and a.DateCol <= DATEADD(QUARTER,-4,GETDATE()) THEN a.col2 ELSE 0 END),0) as [Last year - Last 90 Days Col2]

    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-2,GETDATE()) and a.DateCol <= DATEADD(YEAR,-1,GETDATE()) THEN a.col1 ELSE 0 END),0) as [Last year - Last 365 Days Col1]
    ,ISNULL(Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-2,GETDATE()) and a.DateCol <= DATEADD(YEAR,-1,GETDATE()) THEN a.col2 ELSE 0 END),0) as [Last year - Last 365 Days Col2]
    FROM     tb1 a
INNER JOIN   tb2 b on a.id=b.fid and a.col3 = b.col4
INNER JOIN   tb3 c on b.fid = c.col5
INNER JOIN   tb4 d on c.id = d.col6
INNER JOIN  tb5 e on c.col7 = e.id
GROUP BY    b.id
) AS K ON T.ID=K.ID


SELECT *
FROM @MYTABLE

0

我假设tb1是一个大表(相对于tb2,tb3,tb4和tb5)。

如果是这样,在这里限制该表的选择(使用WHERE子句)是有意义的。

如果仅使用tb1的一小部分,例如,因为与tb2,tb3,tb4和tb5的联接将所需的行减少到百分之几,那么您应该检查表是否在联接中使用的列上建立了索引。

如果使用了tb1的大部分,那么在将其结果与tb2,tb3,tb4和tb5结合之前将结果分组是有意义的。下面是一个例子。

SELECT 
    b.id as [ID]
    ,d.[Title] as [Title]
    ,e.Class as [Class]
    ,SUM(a.[Current - Last 30 Days Col1]) AS [Current - Last 30 Days Col1]
    ,SUM(a.[Current - Last 30 Days Col2]) AS [Current - Last 30 Days Col2]
    ,SUM(a.[Current - Last 90 Days Col1]) AS [Current - Last 90 Days Col1]
    -- etc.
    FROM (
      SELECT a.id, a.col3

      ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Current - Last 30 Days Col1]
      ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Current - Last 30 Days Col2]

      ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Current - Last 90 Days Col1]
      ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Current - Last 90 Days Col2]

      ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Current - Last 365 Days Col1]
      ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Current - Last 365 Days Col2]

      ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-13,GETDATE()) and a.DateCol <= DATEADD(MONTH,-12,GETDATE()) THEN a.col1 ELSE 0 END) as [Last year - Last 30 Days Col1]
      ,Sum(CASE WHEN a.DateCol >= DATEADD(MONTH,-13,GETDATE()) and a.DateCol <= DATEADD(MONTH,-12,GETDATE()) THEN a.col2 ELSE 0 END) as [Last year - Last 30 Days Col2]

      ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-5,GETDATE()) and a.DateCol <= DATEADD(QUARTER,-4,GETDATE()) THEN a.col1 ELSE 0 END) as [Last year - Last 90 Days Col1]
      ,Sum(CASE WHEN a.DateCol >= DATEADD(QUARTER,-5,GETDATE()) and a.DateCol <= DATEADD(QUARTER,-4,GETDATE()) THEN a.col2 ELSE 0 END) as [Last year - Last 90 Days Col2]

      ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-2,GETDATE()) and a.DateCol <= DATEADD(YEAR,-1,GETDATE()) THEN a.col1 ELSE 0 END) as [Last year - Last 365 Days Col1]
      ,Sum(CASE WHEN a.DateCol >= DATEADD(YEAR,-2,GETDATE()) and a.DateCol <= DATEADD(YEAR,-1,GETDATE()) THEN a.col2 ELSE 0 END) as [Last year - Last 365 Days Col2]

      FROM  tb1 a
      WHERE a.DateCol >= DATEADD(YEAR,-2,GETDATE())
      GROUP BY a.id, a.col3
    ) AS a
INNER JOIN 
    tb2 b on a.id=b.fid and a.col3 = b.col4
INNER JOIN 
    tb3 c on b.fid = c.col5
INNER JOIN       
    tb4 d on c.id = d.col6
INNER JOIN 
    tb5 e on c.col7 = e.id
GROUP BY
    b.id, d.Title, e.Class

最好先查看执行计划,然后再决定创建索引和重新创建统计信息。
SQL警察

我真的很讨厌我的帖子得到负分而没有任何解释。我当然同意,要深入了解性能问题,就必须检查执行计划。话虽如此,我还是建议您检查索引中与查询相关的任何外键。
Gert-Jan

1
您在不知道的情况下“假设”某些东西。因此,您根据未知原因发布答案。因此请投票。最好通过发布执行计划来指导OP改善他的问题。
SQL警察

那不是我写的全部。就个人而言,我只会在答案是对还是错的时候投下反对票,而在我完全不同意的时候不会。但感谢您的回应。
Gert-Jan

在某种程度上,这是错误的,因为您如何证明它是正确的?
SQL Police


0

为了优化此类计算,您可以考虑预先计算一些值。预计算的想法是减少需要读取或继续进行的行数。

实现此目的的一种方法是使用索引视图,并让引擎自行执行计算。由于这种类型的视图具有某些局限性,因此您最终将创建一个简单的表并执行计算。基本上,这取决于业务需求。

因此,在下面的示例中,我将创建一个具有RowIDRowDatetime列的表格,并插入100万行。我正在使用索引视图来每天计算实体,因此与其每年查询一百万行,我每年将查询365行以计算这些指标。

DROP TABLE IF EXISTS [dbo].[DataSource];
GO

CREATE TABLE [dbo].[DataSource]
(
    [RowID] BIGINT IDENTITY(1,1) PRIMARY KEY
   ,[RowDateTime] DATETIME2
);

GO

DROP VIEW IF EXISTS [dbo].[vw_DataSource];
GO

CREATE VIEW [dbo].[vw_DataSource] WITH SCHEMABINDING
AS
SELECT YEAR([RowDateTime]) AS [Year]
      ,MONTH([RowDateTime]) AS [Month]
      ,DAY([RowDateTime]) AS [Day]
      ,COUNT_BIG(*) AS [Count]
FROM [dbo].[DataSource]
GROUP BY YEAR([RowDateTime])
        ,MONTH([RowDateTime])
        ,DAY([RowDateTime]);
GO

CREATE UNIQUE CLUSTERED INDEX [IX_vw_DataSource] ON [dbo].[vw_DataSource]
(
    [Year] ASC,
    [Month] ASC,
    [Day] ASC
);

GO

DECLARE @min bigint, @max bigint
SELECT @Min=1 ,@Max=1000000

INSERT INTO [dbo].[DataSource] ([RowDateTime])
SELECT TOP (@Max-@Min+1) DATEFROMPARTS(2019,  1.0 + floor(12 * RAND(convert(varbinary, newid()))), 1.0 + floor(28 * RAND(convert(varbinary, newid())))          )       
FROM master..spt_values t1 
CROSS JOIN master..spt_values t2

GO


SELECT *
FROM [dbo].[vw_DataSource]


SELECT SUM(CASE WHEN DATEFROMPARTS([Year], [Month], [Day]) >= DATEADD(MONTH,-1,GETDATE()) THEN [Count] ELSE 0 END) as [Current - Last 30 Days Col1]
      ,SUM(CASE WHEN DATEFROMPARTS([Year], [Month], [Day]) >= DATEADD(QUARTER,-1,GETDATE()) THEN [Count] ELSE 0 END) as [Current - Last 90 Days Col1]
      ,SUM(CASE WHEN DATEFROMPARTS([Year], [Month], [Day]) >= DATEADD(YEAR,-1,GETDATE()) THEN [Count] ELSE 0 END) as [Current - Last 365 Days Col1]
FROM [dbo].[vw_DataSource];

这种解决方案的成功很大程度上取决于数据的分布方式以及您拥有多少行。例如,如果一年中的每一天每天都有一个条目,则视图和表将具有相同的行匹配项,因此不会减少I / O操作。

同样,以上只是实现数据并读取数据的示例。您可能需要在视图定义中添加更多列。


0

我将使用查找表“ Dates”表将我的数据与DatesId上的索引连接在一起。当我要浏览历史数据时,我将日期用作过滤器。联接是快速的,因此,由于DatesId是聚簇的主索引(主键),因此可以进行筛选。还要为数据表添加日期列(包括在内的列)。

日期表具有以下列:

DatesId,日期,年,季度,年季度,MonthNum,MonthNameShort,YearWeek,WeekNum,DayOfYear,DayOfMonth,DayNumOfWeek,DayName

示例数据:20310409 2031-04-09 2031 2 2031-Q2 4月4日2031_15 15 99 9 3星期三

如果您想要这样的csv,可以将其导入到数据库中,但是我敢肯定,您可以轻松地在网上找到类似的东西并自己制作。

我还添加了一个标识列,以便您可以为每个日期获取一个整数。这使得使用起来更容易一些,但不是必需的。

SELECT * FROM dbo.dates where dateIndex BETWEEN (getDateIndexDate(getDate())-30 AND getDateIndexDate(getDate())+0) --30 days ago

这使我可以轻松跳回特定时间段。在此上创建自己的视图非常容易。当然,您也可以使用ROW_NUMBER()函数进行数年,数周等操作。

有了所需的日期范围后,就可以加入数据了。工作非常快!


0

由于您始终将值归类为整数,因此我将首先在from子句的子查询中按月分组。这类似于使用临时表。不确定这是否会真正加快查询速度。

SELECT f.id, f.[Title], f.Class,
    SUM(CASE WHEN f.MonthDiff = 1 THEN col1 ELSE 0 END) as [Current - Last 30 Days Col1],
    -- etc
FROM (
    SELECT 
        b.id,
        d.[Title],
        e.Class,
        DateDiff(Month, a.DateCol, GETDATE()) as MonthDiff,
        Sum(a.col1) as col1,
        Sum(a.col2) as col2
    FROM  tb1 a
    INNER JOIN tb2 b on a.id = b.fid and a.col3 = b.col4
    INNER JOIN tb3 c on b.fid = c.col5
    INNER JOIN tb4 d on c.id = d.col6
    INNER JOIN tb5 e on c.col7 = e.id
    WHERE a.DateCol between DATEADD(YEAR,-2,GETDATE() and GETDATE()
    GROUP BY b.id, d.Title, e.Class, DateDiff(Month,  a.DateCol, GETDATE())
) f
group by f.id, f.[Title], f.Class

-2

为了提高SQL查询的速度,必须添加索引。对于每个联接表,您必须添加一个索引。

像下面的oracle代码示例:

CREATE INDEX supplier_idx
ON supplier (supplier_name);

这不是一个坏建议。您会从OP中看到没有索引的临时表-c.col7 = e.id上的INNER JOIN #Temp e。虽然答案中还有改进的余地,但我认为不应将其大为否决。特别是对于新用户。
smoore4

@ smoore4同意,应该删除没有明确论点的投票选项。这个功能有一个很大的误用
Greggz
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.