How to get counts for different columns on the same table


15

Table #01 Status

StatusID    Status
-----------------------
 1          Opened
 2          Closed
 3          ReOpened
 4          Pending

Table #02 Claims

ClaimID     CompanyName StatusID
--------------------------------------
1               ABC     1
2               ABC     1
3               ABC     2
4               ABC     4
5               XYZ     1
6               XYZ     1

Expected result:

CompanyName TotalOpenClaims TotalClosedClaims TotalReOpenedClaims TotalPendingClaims
------------------------------------------------------------------------------------
ABC         2               1                 0                   1
XYZ         2               0                 0                   0

How can I write a query that returns the expected result?

Answers:


26

The simplest way is SUM() with a CASE expression:

select CompanyName, 
sum(case when StatusID=1 then 1 else 0 end) as TotalOpenClaims,
sum(case when StatusID=2 then 1 else 0 end) as TotalClosedClaims,
sum(case when StatusID=3 then 1 else 0 end) as TotalReOpenedClaims,
sum(case when StatusID=4 then 1 else 0 end) as TotalPendingClaims
from Claims
group by CompanyName;
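
To try the query without touching real tables, here is a minimal T-SQL sketch that loads the sample rows from the question into a temp table. The name #ClaimsDemo and the column types are assumptions made for illustration, not something given in the question.

```sql
-- Sample rows copied from the question; the temp table name #ClaimsDemo
-- and the column types are hypothetical.
CREATE TABLE #ClaimsDemo
(
  ClaimID     int         NOT NULL PRIMARY KEY,
  CompanyName varchar(50) NOT NULL,
  StatusID    int         NOT NULL
);

INSERT INTO #ClaimsDemo (ClaimID, CompanyName, StatusID)
VALUES (1, 'ABC', 1), (2, 'ABC', 1), (3, 'ABC', 2),
       (4, 'ABC', 4), (5, 'XYZ', 1), (6, 'XYZ', 1);

-- Each SUM adds 1 for rows in the given status and 0 otherwise,
-- so a status with no claims is reported as 0 instead of being left out.
SELECT CompanyName,
       SUM(CASE WHEN StatusID = 1 THEN 1 ELSE 0 END) AS TotalOpenClaims,
       SUM(CASE WHEN StatusID = 2 THEN 1 ELSE 0 END) AS TotalClosedClaims,
       SUM(CASE WHEN StatusID = 3 THEN 1 ELSE 0 END) AS TotalReOpenedClaims,
       SUM(CASE WHEN StatusID = 4 THEN 1 ELSE 0 END) AS TotalPendingClaims
FROM #ClaimsDemo
GROUP BY CompanyName
ORDER BY CompanyName;

DROP TABLE #ClaimsDemo;
```

With this data the query returns one row per company, matching the expected result in the question: ABC with 2, 1, 0, 1 and XYZ with 2, 0, 0, 0.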

16

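This is a typical pivot transformation, and conditional aggregation, as suggested by Phil, is a good way to implement it.

There is also a more modern syntax for achieving the same result, which uses the PIVOT clause: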

SELECT
  CompanyName,
  TotalOpenClaims     = [1],
  TotalClosedClaims   = [2],
  TotalReOpenedClaims = [3],
  TotalPendingClaims  = [4]
FROM
  dbo.Claims
  PIVOT
  (
    COUNT(ClaimID)
    FOR StatusID IN ([1], [2], [3], [4])
  ) AS p
;

Internally, this simpler-looking syntax is equivalent to Phil's GROUP BY query. More precisely, it is equivalent to this variation:

SELECT
  CompanyName,
  TotalOpenClaims     = COUNT(CASE WHEN StatusID = 1 THEN ClaimID END),
  TotalClosedClaims   = COUNT(CASE WHEN StatusID = 2 THEN ClaimID END),
  TotalReOpenedClaims = COUNT(CASE WHEN StatusID = 3 THEN ClaimID END),
  TotalPendingClaims  = COUNT(CASE WHEN StatusID = 4 THEN ClaimID END)
FROM
  dbo.Claims
GROUP BY
  CompanyName
;

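So a PIVOT query is essentially an implicit GROUP BY query.

However, PIVOT queries are considerably trickier to handle than explicit GROUP BY queries with conditional aggregation. When you use PIVOT, always keep this one thing in mind:

  • All columns of the dataset being pivoted (Claims in this case) that are not explicitly mentioned in the PIVOT clause are GROUP BY columns.

If Claims consists only of the three columns shown in the example, the PIVOT query above will work as expected, because CompanyName is then the only column not explicitly mentioned in PIVOT and so ends up as the sole criterion of the implicit GROUP BY.

However, if Claims has other columns (say, ClaimDate), they will implicitly be used as additional GROUP BY columns. That is, your query will essentially be doing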

GROUP BY CompanyName, ClaimDate, ... /* whatever other columns there are */

The result will most likely not be what you want.
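
The effect of the implicit grouping can be seen with a small T-SQL sketch. The temp table #ClaimsWithDate, its column types, and the ClaimDate values are all invented for illustration; only the ClaimID, CompanyName, and StatusID values come from the question.

```sql
-- Hypothetical variant of Claims with one extra column, ClaimDate.
-- The dates are made up purely to show the implicit grouping.
CREATE TABLE #ClaimsWithDate
(
  ClaimID     int         NOT NULL PRIMARY KEY,
  CompanyName varchar(50) NOT NULL,
  StatusID    int         NOT NULL,
  ClaimDate   date        NOT NULL
);

INSERT INTO #ClaimsWithDate (ClaimID, CompanyName, StatusID, ClaimDate)
VALUES (1, 'ABC', 1, '20200101'), (2, 'ABC', 1, '20200102'),
       (3, 'ABC', 2, '20200101'), (4, 'ABC', 4, '20200102'),
       (5, 'XYZ', 1, '20200101'), (6, 'XYZ', 1, '20200101');

-- ClaimDate is not mentioned inside PIVOT, so it silently becomes a
-- grouping column alongside CompanyName.
SELECT
  CompanyName,
  ClaimDate,
  TotalOpenClaims     = [1],
  TotalClosedClaims   = [2],
  TotalReOpenedClaims = [3],
  TotalPendingClaims  = [4]
FROM
  #ClaimsWithDate
  PIVOT
  (
    COUNT(ClaimID)
    FOR StatusID IN ([1], [2], [3], [4])
  ) AS p
ORDER BY
  CompanyName,
  ClaimDate
;

DROP TABLE #ClaimsWithDate;
```

With these made-up dates the query returns three rows (two for ABC, one for XYZ) instead of one row per company, because each distinct CompanyName and ClaimDate pair forms its own group. Dropping ClaimDate from the SELECT list would not change the grouping; ABC would simply appear twice.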

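That is easy to fix, though. To keep irrelevant columns out of the implicit grouping, you can use a derived table that selects only the columns needed for the result, although that makes the query look less elegant: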

SELECT
  CompanyName,
  TotalOpenClaims     = [1],
  TotalClosedClaims   = [2],
  TotalReOpenedClaims = [3],
  TotalPendingClaims  = [4]
FROM
  (SELECT ClaimID, CompanyName, StatusID FROM dbo.Claims) AS derived
  PIVOT
  (
    COUNT(ClaimID)
    FOR StatusID IN ([1], [2], [3], [4])
  ) AS p
;

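If Claims is already a derived table, however, there is no need to add another level of nesting; just make sure the existing derived table selects only the columns needed to produce the output.

You can read more about PIVOT in the manual: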


1

Admittedly, my experience is mainly with MySQL and I haven't spent much time on SQL Server. I would be very surprised if the following query did not work:

SELECT 
  CompanyName, 
  status, 
  COUNT(status) AS 'Total Claims' 
FROM Claims AS c 
  JOIN Status AS s ON c.statusId = s.statusId 
GROUP BY 
  CompanyName, 
  status;

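This won't give you the output in the desired format, but it does give you all the information you need, albeit leaving out the zero cases. To me, this is much simpler than dealing with CASE statements in the query, which seems like a particularly bad idea if they are used only for formatting.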
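
If the zero cases matter, one possible variation (not part of the answer above) is to build every company and status combination first and then LEFT JOIN the claims onto it. The sketch below is T-SQL; the temp table names #StatusDemo and #ClaimsDemo2 and the column types are assumptions, and the rows are the sample data from the question.

```sql
-- Hypothetical temp tables holding the sample data from the question.
CREATE TABLE #StatusDemo
(
  StatusID int         NOT NULL PRIMARY KEY,
  Status   varchar(20) NOT NULL
);

CREATE TABLE #ClaimsDemo2
(
  ClaimID     int         NOT NULL PRIMARY KEY,
  CompanyName varchar(50) NOT NULL,
  StatusID    int         NOT NULL
);

INSERT INTO #StatusDemo (StatusID, Status)
VALUES (1, 'Opened'), (2, 'Closed'), (3, 'ReOpened'), (4, 'Pending');

INSERT INTO #ClaimsDemo2 (ClaimID, CompanyName, StatusID)
VALUES (1, 'ABC', 1), (2, 'ABC', 1), (3, 'ABC', 2),
       (4, 'ABC', 4), (5, 'XYZ', 1), (6, 'XYZ', 1);

-- CROSS JOIN produces every (company, status) pair; the LEFT JOIN keeps
-- pairs that have no claims, and COUNT(c.ClaimID) ignores the resulting
-- NULLs, so those pairs are counted as 0.
SELECT
  co.CompanyName,
  s.Status,
  COUNT(c.ClaimID) AS TotalClaims
FROM
  (SELECT DISTINCT CompanyName FROM #ClaimsDemo2) AS co
  CROSS JOIN #StatusDemo AS s
  LEFT JOIN #ClaimsDemo2 AS c
    ON  c.CompanyName = co.CompanyName
    AND c.StatusID    = s.StatusID
GROUP BY
  co.CompanyName,
  s.Status
ORDER BY
  co.CompanyName,
  s.Status
;

DROP TABLE #ClaimsDemo2;
DROP TABLE #StatusDemo;
```

This returns eight rows, one per company and status, including rows such as ABC / ReOpened with a count of 0. The output is still one row per status rather than one column per status, so it does not replace the pivoted format the question asks for.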

Licensed under cc by-sa 3.0 with attribution required.