是否有任何有关SQL Server 2016更改的文档或研究,以了解如何估计包含SUBSTRING()或其他字符串函数的谓词的基数?
我要问的原因是我正在查看一个在兼容模式130下性能下降的查询,原因与与包含调用SUBSTRING()的WHERE子句匹配的行数估计值的更改有关。我已通过查询重写纠正了此问题,但我想知道是否有人知道SQL Server 2016中有关此区域更改的任何文档。
演示代码如下。在此测试用例中,估计值非常接近,但准确性取决于数据。
在测试用例中,在兼容级别120中,SQL Server似乎正在使用直方图进行估计,而在兼容级别130中,SQL Server似乎假定表中有固定的10%匹配。
CREATE DATABASE MyStringTestDB;
GO
USE MyStringTestDB;
GO
DROP TABLE IF EXISTS dbo.StringTest;
CREATE TABLE dbo.StringTest ( [TheString] varchar(15) );
GO
INSERT INTO dbo.StringTest
VALUES
( 'Y5_CLV' );
INSERT INTO dbo.StringTest
VALUES
( 'Y5_EG3' );
INSERT INTO dbo.StringTest
VALUES
( 'ZY_NE' );
INSERT INTO dbo.StringTest
VALUES
( 'ZY_PQT' );
INSERT INTO dbo.StringTest
VALUES
( 'ZY_T2V' );
INSERT INTO dbo.StringTest
VALUES
( 'ZY_TT4' );
INSERT INTO dbo.StringTest
VALUES
( 'ZY_ZKK' );
INSERT INTO dbo.StringTest
VALUES
( 'ZZ_LW6' );
INSERT INTO dbo.StringTest
VALUES
( 'ZZ_QO3' );
INSERT INTO dbo.StringTest
VALUES
( 'ZZ_TZ7' );
INSERT INTO dbo.StringTest
VALUES
( 'ZZ_UZZ' );
CREATE CLUSTERED INDEX IX_Clustered ON dbo.StringTest (TheString);
/*
Uses fixed % for estimate; 1.1 rows estimated in this case.
Plan for computation:
CSelCalcFixedFilter (0.1) <----
Selectivity: 0.1
*/
ALTER DATABASE MyStringTestDB SET compatibility_level = 130;
GO
SELECT *
FROM dbo.StringTest
WHERE SUBSTRING(TheString, 1, CHARINDEX('_',TheString) - 1) = 'ZZ'
OPTION (QUERYTRACEON 2363, QUERYTRACEON 3604);
/*
Uses histogram to get estimate of 1
CSelCalcPointPredsFreqBased <----
Distinct value calculation:
CDVCPlanLeaf
0 Multi-Column Stats, 1 Single-Column Stats, 0 Guesses
Individual selectivity calculations:
(none)
Loaded histogram for column QCOL: [DBA].[dbo].[StringTest].TheString from stats with id 1
*/
ALTER DATABASE MyStringTestDB SET compatibility_level = 120;
GO
SELECT *
FROM dbo.StringTest
WHERE SUBSTRING(TheString, 1, CHARINDEX('_',TheString) - 1) = 'ZZ'
OPTION (QUERYTRACEON 2363, QUERYTRACEON 3604);
/*
-- Simpler rewrite; works fine in both compat levels and gets better estimate.
SELECT *
FROM dbo.StringTest
WHERE TheString LIKE 'ZZ[_]%'
OPTION (QUERYTRACEON 2363, QUERYTRACEON 3604);
*/
Y5_EG3
字符串只是代码并且始终是大写,那么您可以随时尝试指定二进制排序规则-Latin1_General_100_BIN2
可以提高筛选操作的速度。只需添加COLLATE Latin1_General_100_BIN2
到CREATE TABLE
语句之后varchar(15)
。我很想知道它是否还会影响计划的生成/估算。