在SQL Server 2005上获取最少的多列的最有效方法是什么？

29

我要从6列中获取最小值。

到目前为止，我已经找到了三种方法来实现此目的，但是我对这些方法的性能感到担忧，并且想知道哪种方法对性能更好。

第一种方法是使用大写语句。这是一个包含3列的示例，基于上面链接中的示例。我的案例陈述将更长，因为我将查看6列。

Select Id,
       Case When Col1 <= Col2 And Col1 <= Col3 Then Col1
            When Col2 <= Col3 Then Col2 
            Else Col3
            End As TheMin
From   MyTable

第二种选择是将UNION运算符与多个select语句一起使用。我将其放在接受Id参数的UDF中。

select Id, dbo.GetMinimumFromMyTable(Id)
from MyTable

和

select min(col)
from
(
    select col1 [col] from MyTable where Id = @id
    union all
    select col2 from MyTable where Id = @id
    union all
    select col3 from MyTable where Id = @id
) as t

我发现的第三个选择是使用UNPIVOT运算符，直到现在我都不知道该运算符

with cte (ID, Col1, Col2, Col3)
as
(
    select ID, Col1, Col2, Col3
    from TestTable
)
select cte.ID, Col1, Col2, Col3, TheMin from cte
join
(
    select
        ID, min(Amount) as TheMin
    from 
        cte 
        UNPIVOT (Amount for AmountCol in (Col1, Col2, Col3)) as unpvt
    group by ID
) as minValues
on cte.ID = minValues.ID

由于表的大小以及查询和更新该表的频率，我担心这些查询对数据库的性能影响。

该查询实际上将用于连接具有几百万条记录的表，但是返回的记录一次将减少到约一百条。它将在一天内运行很多次，并且我查询的6列会经常更新（它们包含每日统计信息）。我认为我要查询的6列上没有任何索引。

尝试获取最少的多列时，以下哪种方法对性能更好？还是有我不知道的另一种更好的方法？

我正在使用SQL Server 2005

样本数据和结果

如果我的数据包含这样的记录：

ID Col1 Col2 Col3 Col4 Col5 Col6
1 3 4 0 2 1 5
2 2 6 10 5 7 9
3 1 1 2 3 4 5
4 9 5 4 6 8 9

最终结果应该是

— 瑞秋
source

22

我测试了所有3种方法的性能，这是我发现的结果：

1条记录：无明显差异
10条记录：无明显差异
1,000条记录：无明显差异
10,000条记录：UNION子查询速度稍慢。该CASE WHEN查询比快一点UNPIVOT的。
100,000条记录：UNION子查询的速度明显慢，但UNPIVOT查询的速度却比CASE WHEN查询快一点
500,000条记录：UNION子查询仍然明显慢，但UNPIVOT变得比CASE WHEN查询快得多

所以最终结果似乎是

对于较小的记录集，似乎没有太大的区别。使用最容易阅读和维护的内容。
一旦开始进入更大的记录集，UNION ALL与其他两种方法相比，子查询的性能就会开始下降。
该CASE语句在特定点（以我为例，大约10万行）之前执行最佳，该点成为UNPIVOT查询性能最佳的查询

一个查询变得比另一个查询更好的实际数量可能会因您的硬件，数据库架构，数据和当前服务器负载而发生变化，因此，如果您担心性能，请确保使用自己的系统进行测试。

我还使用Mikael的答案进行了一些测试；但是，对于大多数记录集大小，它比此处尝试的所有其他三种方法都慢。唯一的例外是它比UNION ALL非常大的记录集大小的查询要好。我喜欢这样的事实，它除了显示最小值之外还显示列名称。

我不是dba，所以我可能没有优化测试并错过了一些东西。我正在测试实际的实时数据，因此可能会影响结果。我试图通过在每个不同的时间运行每个查询来解决这个问题，但是您永远都不知道。如果有人对此进行了彻底的测试并分享了他们的结果，我肯定会很感兴趣。

— 瑞秋
source

6

不知道什么是最快的，但是您可以尝试这样的方法。

declare @T table
(
  Col1 int,
  Col2 int,
  Col3 int,
  Col4 int,
  Col5 int,
  Col6 int
)

insert into @T values(1, 2, 3, 4, 5, 6)
insert into @T values(2, 3, 1, 4, 5, 6)

select T4.ColName, T4.ColValue
from @T as T1
  cross apply (
                select T3.ColValue, T3.ColName
                from (
                       select row_number() over(order by T2.ColValue) as rn,
                              T2.ColValue,
                              T2.ColName
                       from (
                              select T1.Col1, 'Col1' union all
                              select T1.Col2, 'Col2' union all
                              select T1.Col3, 'Col3' union all
                              select T1.Col4, 'Col4' union all
                              select T1.Col5, 'Col5' union all
                              select T1.Col6, 'Col6'
                            ) as T2(ColValue, ColName)
                     ) as T3
                where T3.rn = 1
              ) as T4

结果：

ColName ColValue
------- -----------
Col1    1
Col3    1

如果您对哪一列的最小值不感兴趣，可以改用此值。

declare @T table
(
  Id int,
  Col1 int,
  Col2 int,
  Col3 int,
  Col4 int,
  Col5 int,
  Col6 int
)

insert into @T
select 1,        3,       4,       0,       2,       1,       5 union all
select 2,        2,       6,      10,       5,       7,       9 union all
select 3,        1,       1,       2,       3,       4,       5 union all
select 4,        9,       5,       4,       6,       8,       9

select T.Id, (select min(T1.ColValue)
              from (
                      select T.Col1 union all
                      select T.Col2 union all
                      select T.Col3 union all
                      select T.Col4 union all
                      select T.Col5 union all
                      select T.Col6
                    ) as T1(ColValue)
             ) as ColValue
from @T as T

简化的透视查询。

select Id, min(ColValue) as ColValue
from @T
unpivot (ColValue for Col in (Col1, Col2, Col3, Col4, Col5, Col6)) as U
group by Id

— 米凯尔·埃里克森（Mikael Eriksson）
source

6

添加一个持久化的计算列，该列使用CASE语句来执行所需的逻辑。

当您需要基于该值进行联接（或其他任何操作）时，最小值将始终有效。

每当任何源值更改时，该值都会重新计算（INSERT / UPDATE/ MERGE）。我不是说这是必然的工作量的最佳解决方案，我只是提供它作为一个解决方案，就像其他的答案。只有OP才能确定最适合工作负载的。

— 乔恩·西格尔
source

1

6个日期的案情陈述。要减少工作量，请从第一个case语句复制true分支。最糟糕的情况是Date1是最小值，最好的情况是Date6是最小值，因此将最可能的日期放在Date6中。我写这篇文章是因为计算列的限制。

CASE WHEN Date1 IS NULL OR Date1 > Date2 THEN
        CASE WHEN Date2 IS NULL OR Date2 > Date3 THEN
            CASE WHEN Date3 IS NULL OR Date3 > Date4 THEN
                CASE WHEN Date4 IS NULL OR Date4 > Date5 THEN
                    CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                        Date6
                    ELSE
                        Date5
                    END
                ELSE
                    CASE WHEN Date4 IS NULL OR Date4 > Date6 THEN
                        Date6
                    ELSE
                        Date4
                    END
                END
            ELSE
                CASE WHEN Date3 IS NULL OR Date3 > Date5 THEN
                    CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                        Date6
                    ELSE
                        Date5
                    END
                ELSE
                    CASE WHEN Date3 IS NULL OR Date3 > Date6 THEN
                        Date6
                    ELSE
                        Date3
                    END
                END
            END
        ELSE
            CASE WHEN Date2 IS NULL OR Date2 > Date4 THEN
                CASE WHEN Date4 IS NULL OR Date4 > Date5 THEN
                    CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                        Date6
                    ELSE
                        Date5
                    END
                ELSE
                    CASE WHEN Date4 IS NULL OR Date4 > Date5 THEN
                        CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                            Date6
                        ELSE
                            Date5
                        END
                    ELSE
                        CASE WHEN Date4 IS NULL OR Date4 > Date6 THEN
                            Date6
                        ELSE
                            Date4
                        END
                    END
                END
            ELSE
                CASE WHEN Date2 IS NULL OR Date2 > Date5 THEN
                    CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                        Date6
                    ELSE
                        Date5
                    END
                ELSE
                    CASE WHEN Date2 IS NULL OR Date2 > Date6 THEN
                        Date6
                    ELSE
                        Date2
                    END
                END
            END
        END
ELSE
    CASE WHEN Date1 IS NULL OR Date1 > Date3 THEN
        CASE WHEN Date3 IS NULL OR Date3 > Date4 THEN
            CASE WHEN Date4 IS NULL OR Date4 > Date5 THEN
                CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                    Date6
                ELSE
                    Date5
                END
            ELSE
                CASE WHEN Date4 IS NULL OR Date4 > Date6 THEN
                    Date6
                ELSE
                    Date4
                END
            END
        ELSE
            CASE WHEN Date3 IS NULL OR Date3 > Date5 THEN
                CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                    Date6
                ELSE
                    Date5
                END
            ELSE
                CASE WHEN Date3 IS NULL OR Date3 > Date6 THEN
                    Date6
                ELSE
                    Date3
                END
            END
        END
    ELSE
        CASE WHEN Date1 IS NULL OR Date1 > Date4 THEN
            CASE WHEN Date4 IS NULL OR Date4 > Date5 THEN
                CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                    Date6
                ELSE
                    Date5
                END
            ELSE
                CASE WHEN Date4 IS NULL OR Date4 > Date6 THEN
                    Date6
                ELSE
                    Date4
                END
            END
        ELSE
            CASE WHEN Date1 IS NULL OR Date1 > Date5 THEN
                CASE WHEN Date5 IS NULL OR Date5 > Date6 THEN
                    Date6
                ELSE
                    Date5
                END
            ELSE
                CASE WHEN Date1 IS NULL OR Date1 > Date6 THEN
                    Date6
                ELSE
                    Date1
                END
            END
        END
    END
END

如果您偶然发现此页面只是为了比较日期，而又不关心性能或兼容性，则可以使用表值构造器，该表值构造器可在允许子选择的任何地方使用（SQL Server 2008及更高版本）：

Lowest =    
(
    SELECT MIN(TVC.d) 
    FROM 
    (
        VALUES
            (Date1), 
            (Date2), 
            (Date3), 
            (Date4), 
            (Date5), 
            (Date6)
    ) 
    AS TVC(d)
)

— 杰西·亚当
source

1

您的case陈述效率不高。在最坏的情况下，您进行5次比较，在最佳情况下，您进行2次；而找到的最小值n应最多n-1进行比较。

对于每一行，平均而言，您进行的是3.5次比较，而不是2次。因此，这将花费更多的CPU时间，而且速度较慢。使用以下case语句再次尝试测试。它仅每行使用2个比较，并且应该比unpivot和更有效union all。

Select Id, 
       Case 
           When Col1 <= Col2 then case when Col1 <= Col3 Then Col1  else col3 end
            When  Col2 <= Col3 Then Col2  
            Else Col3 
            End As TheMin 
From   YourTableNameHere

的 union all方法在您的情况下是错误的，因为您获得的最小值不是每行而是整个表的最小值。另外，由于您将扫描同一张桌子3次，因此效率不高。当表较小时，I / O不会有太大变化，但是对于大表而言，I / O会有所不同。不要使用该方法。

Unpivot很好，也可以尝试通过使用表的交叉联接来手动取消透视(select 1 union all select 2 union all select 3)。它应该和unpivot。

如果您没有空间问题，最好的解决方案是使用计算的持久化列。它会将行的大小增加4个字节（我想您将具有int类型），这又将增加表的大小。

但是，空间和内存在您的系统中是有问题的，因此，不要让CPU持久化，而不能使用case语句使用简单的计算列。它将使代码更简单。

— 古利·梅尔（Gulli Meel）
source

-1

我猜第一个选择是最快的（尽管从编程的角度看它看起来并不那么圆滑！）。这是因为它恰好处理了N行（其中N是表的大小），并且不需要像方法2或3那样进行搜索或排序。

用大量样本进行的测试应证明这一点。

还要考虑的另一种选择（好像您需要更多！）是在表上创建实例化视图。如果您的表大小在十万以上。这样，就可以在更改行的同时计算最小值，并且不必每次查询都处理整个表。在SQL Server中，物化视图称为索引视图

— 没有机会
source

-1

Create table #temp
   (
    id int identity(1,1),
    Name varchar(30),
    Year1 int,
    Year2 int,
    Year3 int,
    Year4 int
   )

   Insert into #temp values ('A' ,2015,2016,2014,2010)
   Insert into #temp values ('B' ,2016,2013,2017,2018)
   Insert into #temp values ('C' ,2010,2016,2014,2017)
   Insert into #temp values ('D' ,2017,2016,2014,2015)
   Insert into #temp values ('E' ,2016,2016,2016,2016)
   Insert into #temp values ('F' ,2016,2017,2018,2019)
   Insert into #temp values ('G' ,2016,2017,2020,2019)

   Select *, Case 
                 when Year1 >= Year2 and Year1 >= Year3 and Year1 >= Year4 then Year1
                 when Year2 >= Year3 and Year2 >= Year4 and Year2 >= Year1 then Year2
                 when Year3 >= Year4 and Year3 >= Year1 and Year3 >= Year2 then Year3
                 when Year4 >= Year1 and Year4 >= Year2 and Year4 >= Year3 then Year4  
                 else Year1 end as maxscore  
                 from #temp

— 拉维
source

您无需考虑NULL，这会使您的CASE表达式相对简单。但是，如果至少其中一列确实为NULL，则您的解决方案将Year1作为结果返回，这不一定是正确的。

— Andriy M 2015年