Why is this query drastically slower when wrapped in a TVF?


17

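I have a fairly complex query that runs in just a few seconds on its own, but is far slower when wrapped in a table-valued function. I haven't actually let it finish, but it has run for ten minutes without terminating. The only change is replacing two date variables (initialized with date literals) with date parameters:

Runs in seven seconds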

DECLARE @StartDate DATE = '2011-05-21'
DECLARE @EndDate   DATE = '2011-05-23'

DECLARE @Data TABLE (...)
INSERT INTO @Data(...) SELECT...

SELECT * FROM @Data

Runs for at least ten minutes

CREATE FUNCTION X (@StartDate DATE, @EndDate DATE)
  RETURNS TABLE AS RETURN
  SELECT ...

SELECT * FROM X ('2011-05-21', '2011-05-23')

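I had previously written the function as a multi-statement TVF with a RETURNS @Data TABLE (...) clause, but swapping that for the inline structure made no noticeable difference. The long run time of the TVF is the actual SELECT * FROM X time; creating the UDF itself takes only a few seconds.

I could post the query in question, but it is a bit long (~165 lines) and, given the success of the first approach, I suspect something else is going on. Flipping through the execution plans, they appear to be identical.

I have tried breaking the query into smaller sections, with no change. No single section takes more than a couple of seconds when executed alone, but the TVF still hangs.

I see a very similar question, /programming/4190506/sql-server-2005-table-valued-function-weird-performance, but I'm not sure that its solution applies. Perhaps someone has seen this problem and knows a more general solution? Thanks!

Here is dm_exec_requests after several minutes of processing: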

session_id              59
request_id              0
start_time              40688.46517
status                  running
command                 UPDATE
sql_handle              0x030015002D21AF39242A1101ED9E00000000000000000000
statement_start_offset  10962
statement_end_offset    16012
plan_handle             0x050015002D21AF3940C1E6B0040000000000000000000000
database_id                 21
user_id                 1
connection_id           314AE0E4-A1FB-4602-BF40-02D857BAD6CF
blocking_session_id         0
wait_type               NULL
wait_time                   0
last_wait_type          SOS_SCHEDULER_YIELD
wait_resource   
open_transaction_count  0
open_resultset_count    1
transaction_id              48030651
context_info            0x
percent_complete        0
estimated_completion_time   0
cpu_time                    344777
total_elapsed_time          348632
scheduler_id            7
task_address            0x000000045FC85048
reads                   1549
writes                  13
logical_reads           30331425
text_size               2147483647
language                us_english
date_format             mdy
date_first              7
quoted_identifier           1
arithabort              1
ansi_null_dflt_on       1
ansi_defaults           0
ansi_warnings           1
ansi_padding            1
ansi_nulls                  1
concat_null_yields_null 1
transaction_isolation_level 2
lock_timeout            -1
deadlock_priority           0
row_count                   105
prev_error              0
nest_level              1
granted_query_memory    170
executing_managed_code  0
group_id                2
query_hash              0xBE6A286546AF62FC
query_plan_hash         0xD07630B947043AF0
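
For reference, a snapshot like the one above can be pulled from the DMVs together with the batch text and the plan the running request is using. This is only a sketch: session 59 is the session_id shown in the output above, it assumes the caller has VIEW SERVER STATE permission, and the plan returned is the compiled plan without run-time row counts.

```sql
-- Sketch: inspect a running request and the plan it is executing.
-- Session 59 is the session_id from the output above; substitute your own.
SELECT
    R.session_id,
    R.status,
    R.cpu_time,
    R.logical_reads,
    T.text AS batch_text,   -- full text of the executing batch
    P.query_plan            -- XML showplan of the plan in use
FROM
    sys.dm_exec_requests AS R
    CROSS APPLY sys.dm_exec_sql_text(R.sql_handle)    AS T
    CROSS APPLY sys.dm_exec_query_plan(R.plan_handle) AS P
WHERE
    R.session_id = 59;
```

Run it from a second connection while the slow statement is still executing; clicking the query_plan XML in SSMS opens it as a graphical plan.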

Here is the full query:

CREATE FUNCTION Routine.MarketingDashboardECommerceBase (@StartDate DATE, @EndDate DATE)
RETURNS TABLE AS RETURN
    WITH RegionsByCode AS (SELECT CountryCode, MIN(Region) AS Region FROM Staging.Volusion.MarketingRegions GROUP BY CountryCode)
        SELECT
            D.Date, Div.Division, Region.Region, C.Category1, C.Category2, C.Category3,
            COALESCE(V.Visits,          0) AS Visits,
            COALESCE(Dem.Demos,         0) AS Demos,
            COALESCE(S.GrossStores,     0) AS GrossStores,
            COALESCE(S.PaidStores,      0) AS PaidStores,
            COALESCE(S.NetStores,       0) AS NetStores,
            COALESCE(S.StoresActiveNow, 0) AS StoresActiveNow
            -- This line causes the run time to climb from a few seconds to over an hour!
            --COALESCE(V.Visits,          0) * COALESCE(ACS.AvgClickCost, GAAC.AvgAdCost, 0.00) AS TotalAdCost
            -- This line alone does not inflate the run time
            --ACS.AvgClickCost
            -- This line is enough to increase the run time to at least a couple minutes
            --GAAC.AvgAdCost
        FROM
            --Dates AS D
            (SELECT SQLDate AS Date FROM Dates WHERE SQLDate BETWEEN @StartDate AND @EndDate) AS D
            CROSS JOIN (SELECT 'UK' AS Division UNION SELECT 'US' UNION SELECT 'IN' UNION SELECT 'Unknown') AS Div
            CROSS JOIN (SELECT Category1, Category2, Category3 FROM Routine.MarketingDashboardCampaignMap UNION SELECT 'Unknown', 'Unknown', 'Unknown') AS C
            CROSS JOIN (SELECT DISTINCT Region FROM Staging.Volusion.MarketingRegions) AS Region
            -- Visitors
            LEFT JOIN
                (
                SELECT
                    V.Date,
                    CASE    WHEN V.Country IN ('United Kingdom', 'Guernsey', 'Ireland', 'Jersey') THEN 'UK'
                        WHEN V.Country IN ('United States', 'Canada', 'Puerto Rico', 'U.S. Virgin Islands') THEN 'US'
                        ELSE 'IN' END AS Division,
                    COALESCE(MR.Region, 'Unknown') AS Region,
                    C.Category1, C.Category2, C.Category3,
                    SUM(V.Visits) AS Visits
                FROM
                             RawData.GoogleAnalytics.Visits        AS V
                    INNER JOIN Routine.MarketingDashboardCampaignMap AS C ON V.LandingPage = C.LandingPage AND V.Campaign = C.Campaign AND V.Medium = C.Medium AND V.Referrer = C.Referrer AND V.Source = C.Source
                    LEFT JOIN  Staging.Volusion.MarketingRegions     AS MR ON V.Country = MR.CountryName
                WHERE
                    V.Date BETWEEN @StartDate AND @EndDate
                GROUP BY
                    V.Date,
                    CASE    WHEN V.Country IN ('United Kingdom', 'Guernsey', 'Ireland', 'Jersey') THEN 'UK'
                        WHEN V.Country IN ('United States', 'Canada', 'Puerto Rico', 'U.S. Virgin Islands') THEN 'US'
                        ELSE 'IN' END,
                    COALESCE(MR.Region, 'Unknown'), C.Category1, C.Category2, C.Category3
                ) AS V ON D.Date = V.Date AND Div.Division = V.Division AND Region.Region = V.Region AND C.Category1 = V.Category1 AND C.Category2 = V.Category2 AND C.Category3 = V.Category3
            -- Demos
            LEFT JOIN
                (
                SELECT
                    OD.SQLDate,
                    G.Division,
                    COALESCE(MR.Region,   'Unknown') AS Region,
                    COALESCE(C.Category1, 'Unknown') AS Category1,
                    COALESCE(C.Category2, 'Unknown') AS Category2,
                    COALESCE(C.Category3, 'Unknown') AS Category3,
                    SUM(D.Demos) AS Demos
                FROM
                             Demos            AS D
                    INNER JOIN Orders           AS O  ON D."Order" = O."Order"
                    INNER JOIN Dates            AS OD ON O.OrderDate = OD.DateSerial
                    INNER JOIN MarketingSources AS MS ON D.Source = MS.Source
                    LEFT JOIN  RegionsByCode    AS MR ON MS.CountryCode = MR.CountryCode
                    LEFT JOIN
                        (
                        SELECT
                            G.TransactionID,
                            MIN (
                                CASE WHEN G.Country IN ('United Kingdom', 'Guernsey', 'Ireland', 'Jersey') THEN 'UK'
                                    WHEN G.Country IN ('United States', 'Canada', 'Puerto Rico', 'U.S. Virgin Islands') THEN 'US'
                                    ELSE 'IN' END
                                ) AS Division
                        FROM
                            RawData.GoogleAnalytics.Geography AS G
                        WHERE
                                TransactionDate BETWEEN @StartDate AND @EndDate
                            AND NOT EXISTS (SELECT * FROM RawData.GoogleAnalytics.Geography AS G2 WHERE G.TransactionID = G2.TransactionID AND G2.EffectiveDate > G.EffectiveDate)
                        GROUP BY
                            G.TransactionID
                        ) AS G  ON O.VolusionOrderID = G.TransactionID
                    LEFT JOIN  RawData.GoogleAnalytics.Referrers     AS R  ON O.VolusionOrderID = R.TransactionID AND NOT EXISTS (SELECT * FROM RawData.GoogleAnalytics.Referrers AS R2 WHERE R.TransactionID = R2.TransactionID AND R2.EffectiveDate > R.EffectiveDate)
                    LEFT JOIN  Routine.MarketingDashboardCampaignMap AS C  ON MS.LandingPage = C.LandingPage AND MS.Campaign = C.Campaign AND MS.Medium = C.Medium AND COALESCE(R.ReferralPath, '(not set)') = C.Referrer AND MS.SourceName = C.Source
                WHERE
                        O.IsDeleted = 'No'
                    AND OD.SQLDate BETWEEN @StartDate AND @EndDate
                GROUP BY
                    OD.SQLDate,
                    G.Division,
                    COALESCE(MR.Region,   'Unknown'),
                    COALESCE(C.Category1, 'Unknown'),
                    COALESCE(C.Category2, 'Unknown'),
                    COALESCE(C.Category3, 'Unknown')
                ) AS Dem ON D.Date = Dem.SQLDate AND Div.Division = Dem.Division AND Region.Region = Dem.Region AND C.Category1 = Dem.Category1 AND C.Category2 = Dem.Category2 AND C.Category3 = Dem.Category3
            -- Stores
            LEFT JOIN
                (
                SELECT
                    OD.SQLDate,
                    CASE WHEN O.VolusionCountryCode = 'GB' THEN 'UK'
                        WHEN A.CountryShortName IN ('U.S.', 'Canada', 'Puerto Rico', 'U.S. Virgin Islands') THEN 'US'
                        ELSE 'IN' END AS Division,
                    COALESCE(MR.Region,     'Unknown') AS Region,
                    COALESCE(CpM.Category1, 'Unknown') AS Category1,
                    COALESCE(CpM.Category2, 'Unknown') AS Category2,
                    COALESCE(CpM.Category3, 'Unknown') AS Category3,
                    SUM(S.Stores) AS GrossStores,
                    SUM(CASE WHEN O.DatePaid <> -1 THEN 1 ELSE 0 END) AS PaidStores,
                    SUM(CASE WHEN O.DatePaid <> -1 AND CD.WeekEnding <> OD.WeekEnding THEN 1 ELSE 0 END) AS NetStores,
                    SUM(CASE WHEN O.DatePaid <> -1 THEN SH.ActiveStores ELSE 0 END) AS StoresActiveNow
                FROM
                             Stores           AS S
                    INNER JOIN Orders           AS O   ON S."Order" = O."Order"
                    INNER JOIN Dates            AS OD  ON O.OrderDate = OD.DateSerial
                    INNER JOIN Dates            AS CD  ON O.CancellationDate = CD.DateSerial
                    INNER JOIN Customers        AS C   ON O.CustomerNow = C.Customer
                    INNER JOIN MarketingSources AS MS  ON C.Source = MS.Source
                    INNER JOIN StoreHistory     AS SH  ON S.MostRecentHistory = SH.History
                    INNER JOIN Addresses        AS A   ON C.Address = A.Address
                    LEFT JOIN  RegionsByCode    AS MR  ON MS.CountryCode = MR.CountryCode
                    LEFT JOIN  Routine.MarketingDashboardCampaignMap AS CpM ON CpM.LandingPage = 'N/A' AND MS.Campaign = CpM.Campaign AND MS.Medium = CpM.Medium AND CpM.Referrer = 'N/A' AND MS.SourceName = CpM.Source
                WHERE
                        O.IsDeleted = 'No'
                    AND OD.SQLDate BETWEEN @StartDate AND @EndDate
                GROUP BY
                    OD.SQLDate,
                    CASE WHEN O.VolusionCountryCode = 'GB' THEN 'UK'
                        WHEN A.CountryShortName IN ('U.S.', 'Canada', 'Puerto Rico', 'U.S. Virgin Islands') THEN 'US'
                        ELSE 'IN' END,
                    COALESCE(MR.Region,     'Unknown'),
                    COALESCE(CpM.Category1, 'Unknown'),
                    COALESCE(CpM.Category2, 'Unknown'),
                    COALESCE(CpM.Category3, 'Unknown')
                ) AS S ON D.Date = S.SQLDate AND Div.Division = S.Division AND Region.Region = S.Region AND C.Category1 = S.Category1 AND C.Category2 = S.Category2 AND C.Category3 = S.Category3
            -- Google Analytics spend
            LEFT JOIN
                (
                SELECT
                    AC.Date, C.Category1, C.Category2, C.Category3, SUM(AC.AdCost) / SUM(AC.Visits) AS AvgAdCost
                FROM
                    RawData.GoogleAnalytics.AdCosts AS AC
                    INNER JOIN
                        (
                        SELECT Campaign, Medium, Source, MIN(Category1) AS Category1, MIN(Category2) AS Category2, MIN(Category3) AS Category3
                        FROM Routine.MarketingDashboardCampaignMap
                        WHERE Category1 <> 'Affiliate'
                        GROUP BY Campaign, Medium, Source
                        ) AS C ON AC.Campaign = C.Campaign AND AC.Medium = C.Medium AND AC.Source = C.Source
                WHERE
                    AC.Date BETWEEN @StartDate AND @EndDate
                GROUP BY
                    AC.Date, C.Category1, C.Category2, C.Category3
                HAVING
                    SUM(AC.AdCost) > 0.00 AND SUM(AC.Visits) > 0
                ) AS GAAC ON D.Date = GAAC.Date AND C.Category1 = GAAC.Category1 AND C.Category2 = GAAC.Category2 AND C.Category3 = GAAC.Category3
            -- adCenter spend
            LEFT JOIN
                (
                SELECT Date, SUM(Spend) / SUM(Clicks) AS AvgClickCost
                FROM RawData.AdCenter.Spend
                WHERE Date BETWEEN @StartDate AND @EndDate
                GROUP BY Date
                HAVING SUM(Spend) > 0.00 AND SUM(Clicks) > 0
                ) AS ACS ON D.Date = ACS.Date AND C.Category1 = 'PPC' AND C.Category2 = 'adCenter' AND C.Category3 = 'N/A'
        WHERE
            V.Visits > 0 OR Dem.Demos > 0 OR S.GrossStores > 0
GO


SELECT * FROM Routine.MarketingDashboardECommerceBase('2011-05-21', '2011-05-23')

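Can you give us the text query plans? And in the first query, what are the data types of @StartDate and @EndDate?
gbn

@gbn: Sorry, the plan is too long, about 32K characters. Is there some subset that would be most useful? Also, would you prefer the plan for the standalone query or for the TVF?
Jon of All Trades

Running an execution plan on the TVF form of the query returns nothing useful, so I assume you are after the plan for the non-TVF version. Or is there some way to get at the execution plan the TVF actually uses?
Jon of All Trades

There are no waiting tasks. I'm not familiar with dm_exec_requests, but I have appended the output from five minutes into the TVF execution.
Jon of All Trades

@Martin: Yes; the standalone query shows CPU time of 7021 (2% of the partial TVF run) and 154K logical reads (0.5%). I recently let the TVF version run to completion, and it finished after 27 minutes. So it is definitely churning through far more data... but how can I get it to use a better plan? I'll study the good execution plan in detail and see whether some hints help.
Jon of All Trades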

Answers:


3

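I have isolated the problem to a single line in the query. Bear in mind that the query is 160 lines long and that I include the relevant tables either way; if I disable this line in the SELECT clause: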

COALESCE(V.Visits, 0) * COALESCE(ACS.AvgClickCost, GAAC.AvgAdCost, 0.00)

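...the run time drops from 63 minutes to five seconds (inlining a CTE made it slightly faster than the original seven-second query). Including either ACS.AvgClickCost or GAAC.AvgAdCost causes the run time to explode. What makes it especially odd is that these fields come from two subqueries that return ten rows and three rows, respectively! Each runs in zero seconds on its own, and with row counts that small I would expect the join time to be trivial even with nested loops.

Any guesses as to why this seemingly harmless calculation completely throws off the TVF, while it runs quickly as a standalone query?


I have posted the query, but as you can see it draws on a dozen tables, including some views and another TVF, so I fear it won't help much. The part I don't understand is how wrapping the query in a TVF can multiply the run time by 750. It only happens when I include GAAC.AvgAdCost (today; yesterday ACS.AvgClickCost was a problem too), so it seems that subquery is throwing off the execution plan.
Jon of All Trades

1
I'd guess you have to look at the joins to the subqueries. If there is a many-to-many relationship between any of the tables, you will be processing ten times as many records.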

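In our project (with lots of nested views and inline TVFs) we at some point found ourselves replacing COALESCE() with ISNULL() to help the query optimizer come up with better plans. I think it has to do with ISNULL() having a more predictable output type than COALESCE(). Worth a try? I know this is vague, but in our limited experience, influencing the query optimizer toward a better plan is a fuzzy art, so desperately trying a bunch of vague, crazy ideas was the only way we made progress.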
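
As a rough illustration of that swap, the sketch below applies it to the cost expression from the question, with local variables standing in for the joined columns (the variable names and the DECIMAL(9,4) type are assumptions, not taken from the real schema). ISNULL takes exactly two arguments and returns the type of its first argument, whereas COALESCE accepts any number of arguments and returns the highest-precedence type among them, so the three-way COALESCE has to be nested.

```sql
-- Stand-ins for V.Visits, ACS.AvgClickCost and GAAC.AvgAdCost (assumed types).
DECLARE @Visits       INT          = 120;
DECLARE @AvgClickCost DECIMAL(9,4) = NULL;
DECLARE @AvgAdCost    DECIMAL(9,4) = 0.35;

SELECT
    -- Original form from the question
    COALESCE(@Visits, 0) * COALESCE(@AvgClickCost, @AvgAdCost, 0.00) AS WithCoalesce,
    -- Same logic with nested ISNULL; each result takes the type of its first argument
    ISNULL(@Visits, 0) * ISNULL(@AvgClickCost, ISNULL(@AvgAdCost, 0.00)) AS WithIsNull;
```

Whether this changes the plan for the TVF is something only a test against the real tables can show.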

2

I expect this is related to parameter sniffing.

Here is some discussion of the issue (you can also search SO for parameter sniffing):

http://blogs.msdn.com/b/queryoptteam/archive/2006/03/31/565991.aspx


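Inline TVFs do not give rise to parameter sniffing: they are just macros that are expanded like views.
gbn

@gbn: The TVF itself does get expanded like a macro, but (as far as I know) the query or stored procedure that ultimately executes that expansion still has to be planned and is potentially parameterized. (We fought with this on SQL Server 2005. The battle was especially hard until we discovered that SQL Server Management Studio uses a different session setting for ARITHABORT than Reporting Services and/or jTDS (perhaps?), so sometimes one of them would come up with a "bad" plan for "the same query" while the other (stubbornly) would not.)

Smells like sniffing to me.
Hogan

Hmm, a lot of reading to do. For what it's worth, the cardinality does not vary much with the parameter values: the query includes a Dates table with exactly one row per date, and several other tables with many rows per date, but roughly the same number for any given date. I am using the same parameters (05/21 through 05/23) in test executions immediately after (re)creating the UDF, so if anything it should be "primed" for those values.
Jon of All Trades

One more note: assigning the parameters' values to local variables, as described by Jetson in stackoverflow.com/questions/211355/…, has no material effect.
Jon of All Trades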
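
For readers unfamiliar with it, the local-variable workaround mentioned in that last comment only fits a multi-statement TVF or a stored procedure, since an inline TVF has no body in which to declare variables. A minimal sketch, assuming the Dates table and SQLDate column from the question and a hypothetical function name:

```sql
-- Hypothetical function name; Dates and SQLDate come from the question's query.
CREATE FUNCTION dbo.DatesInRange_Sketch (@StartDate DATE, @EndDate DATE)
RETURNS @Data TABLE (Date DATE NOT NULL)
AS
BEGIN
    -- Copy the parameters into local variables so the statement below is
    -- optimized for unknown values rather than for sniffed parameter values.
    DECLARE @LocalStart DATE = @StartDate;
    DECLARE @LocalEnd   DATE = @EndDate;

    INSERT INTO @Data (Date)
    SELECT SQLDate
    FROM Dates
    WHERE SQLDate BETWEEN @LocalStart AND @LocalEnd;

    RETURN;
END
GO
```

As the comment above reports, this made no material difference in this particular case.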

1

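Unfortunately, SQL Server's query optimization engine cannot see inside functions.

So I would use the execution plan from the fast query to work out which hints to apply to the TF. Rinse and repeat until the TF's execution plan approaches the faster one.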

http://sqlblog.com/blogs/tibor_karaszi/archive/2008/08/29/execution-plan-re-use-sp-executesql-and-tsql-variables.aspx
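
One way to start that loop, sketched against the function from the question: attach query hints to the statement that calls the TVF and compare the resulting plan with the fast one. Nothing in this thread establishes which hint, if any, helps; the two below are merely common first tries.

```sql
-- Compile a fresh plan for this specific call instead of reusing a cached one.
SELECT *
FROM Routine.MarketingDashboardECommerceBase('2011-05-21', '2011-05-23')
OPTION (RECOMPILE);

-- Keep the join order as written in the function body.
SELECT *
FROM Routine.MarketingDashboardECommerceBase('2011-05-21', '2011-05-23')
OPTION (FORCE ORDER);
```

Because an inline TVF is expanded into the calling statement, hints in the OPTION clause apply to the whole expanded query.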


2
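The SQL Server query optimizer can see inside ITVFs (inline table-valued functions), though not inside any other kind of function.

Note: properly designed inline table functions combined with CROSS APPLY can improve performance dramatically. For example, a non-SARGable expression on a join (such as your COALESCE) can be wrapped in an APPLY, evaluated as a set, and then joined in the next query without becoming RBAR. Experiment a little. CROSS APPLY is hard to master, but worth it!
SheldonH 2015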
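
A small sketch of the mechanics that comment describes: the COALESCE from the question is computed once in a CROSS APPLY and then reused by name. The inline VALUES rows, the Src alias, and the DECIMAL(9,4) type are made up for illustration; it shows the syntax only and makes no claim about the plan it would produce against the real tables.

```sql
SELECT
    Src.Visits,
    Cost.AvgCost,
    Src.Visits * Cost.AvgCost AS TotalAdCost
FROM
    -- Made-up sample rows standing in for the joined result set
    (VALUES
        (10, CAST(NULL AS DECIMAL(9,4)), CAST(0.50 AS DECIMAL(9,4))),
        (20, CAST(0.25 AS DECIMAL(9,4)), CAST(NULL AS DECIMAL(9,4)))
    ) AS Src (Visits, AvgClickCost, AvgAdCost)
    -- Evaluate the expression once per row and expose it as a named column
    CROSS APPLY (
        SELECT COALESCE(Src.AvgClickCost, Src.AvgAdCost, 0.00) AS AvgCost
    ) AS Cost;
```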

0

Do these values differ between the two cases?

arithabort              1
ansi_null_dflt_on       1
ansi_defaults           0
ansi_warnings           1
ansi_padding            1
ansi_nulls              1

These (arithabort in particular) have been shown to badly affect query performance in this way.
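
To compare these settings across connections (for example, an SSMS window against an application's session), something like the following sketch can be used. It assumes VIEW SERVER STATE permission in order to see sessions other than your own.

```sql
-- List the SET options in effect for each user session.
SELECT
    session_id,
    program_name,
    arithabort,
    ansi_null_dflt_on,
    ansi_defaults,
    ansi_warnings,
    ansi_padding,
    ansi_nulls
FROM
    sys.dm_exec_sessions
WHERE
    is_user_process = 1;
```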


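That is because it is a plan cache key rather than anything about arithabort itself, isn't it? As of SQL Server 2005, I thought the setting had no effect as long as ansi_warnings is on. (In 2000, indexed views would not be used if it was set incorrectly.)
Martin Smith

@Martin: I have no direct experience of this, but I recalled reading something recently, and found some answers. It may help the OP, or it may not... Edit: sqlblog.com/blogs/kalen_delaney/archive/2008/06/19/... sigh
gbn

I have read similar definitive-sounding statements on SO. I have never seen anything that lets me reproduce it myself, nor any logical explanation of why the arithabort setting should have such a dramatic effect on performance, so I remain skeptical.
Martin Smith

ARITHABORT, ANSI_WARNINGS, ANSI_PADDING, and ANSI_NULLS are 1; the rest are NULL.
Jon of All Trades

FYI, I am working entirely in SSMS, so differing settings in VS or other clients are not an issue.
Jon of All Trades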
Licensed under cc by-sa 3.0 with attribution required.