Find the number of unique days


11

I want to write a SQL query that finds the number of unique working days for each employee from the table times.

*---------------------------------------*
|emp_id  task_id  start_day   end_day   |
*---------------------------------------*
|  1        1     'monday'  'wednesday' |
|  1        2     'monday'  'tuesday'   |
|  1        3     'friday'  'friday'    |
|  2        1     'monday'  'friday'    |
|  2        1     'tuesday' 'wednesday' |
*---------------------------------------*

Expected output:

*-------------------*
|emp_id  no_of_days |
*-------------------*
|  1        4       |
|  2        5       |
*-------------------*

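I have written a query (sqlfiddle) that gives me the expected output, but out of curiosity, is there a better way to write this query? Could I use a calendar or tally table?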

with days_num as  
(
  select
    *,
    case 
      when start_day = 'monday' then 1
      when start_day = 'tuesday' then 2
      when start_day = 'wednesday' then 3
      when start_day = 'thursday' then 4
      when start_day = 'friday' then 5
    end as start_day_num,

    case 
      when end_day = 'monday' then 1
      when end_day = 'tuesday' then 2
      when end_day = 'wednesday' then 3
      when end_day = 'thursday' then 4
      when end_day = 'friday' then 5
    end as end_day_num

  from times
),
day_diff as
(
  select
    emp_id,
    case
      when  
        (end_day_num - start_day_num) = 0
      then
        1
      else
        (end_day_num - start_day_num)
    end as total_diff
  from days_num  
)

select emp_id,
  sum(total_diff) as uniq_working_days
from day_diff
group by
  emp_id

Any suggestions would be great.


For the values (1, 1, 'monday', 'wednesday'), (1, 2, 'monday', 'tuesday'), (1, 3, 'monday', 'tuesday'), emp_id 1 has worked 3 distinct days (Monday, Tuesday, Wednesday), but the fiddle/query returns 4.
lptr

1
@lptr (1, 1, 'monday', 'wednesday'),(1, 2, 'monday', 'tuesday'),(1, 3, 'friday', 'friday');
zealous

3
Your query doesn't actually work. If you change 1 2 'monday' 'tuesday' to 1 2 'monday' 'wednesday', the result should still be 4 days, but it returns 5.
Nick
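
The failure described in these comments can be reproduced outside the database. The sketch below is an illustration in Python, not part of the original thread: summed_differences mimics the arithmetic of the question's query, unique_days counts the union of the days each task covers, and both function names and the DAY_NO mapping are hypothetical helpers introduced here.

```python
# Hypothetical helper mapping, mirroring the CASE expressions in the question.
DAY_NO = {"monday": 1, "tuesday": 2, "wednesday": 3, "thursday": 4, "friday": 5}


def summed_differences(tasks):
    """Mimic the question's query: sum (end - start) per task, using 1 when they are equal."""
    total = 0
    for start, end in tasks:
        diff = DAY_NO[end] - DAY_NO[start]
        total += 1 if diff == 0 else diff
    return total


def unique_days(tasks):
    """Count the distinct weekdays covered by at least one task."""
    days = set()
    for start, end in tasks:
        days.update(range(DAY_NO[start], DAY_NO[end] + 1))
    return len(days)


# Rows for a single employee, taken from the two comments above.
lptr_rows = [("monday", "wednesday"), ("monday", "tuesday"), ("monday", "tuesday")]
nick_rows = [("monday", "wednesday"), ("monday", "wednesday"), ("friday", "friday")]

print(summed_differences(lptr_rows), unique_days(lptr_rows))  # 4 3
print(summed_differences(nick_rows), unique_days(nick_rows))  # 5 4
```

In both cases the summed differences overcount because overlapping ranges are added twice, whereas the set union counts each weekday once.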

Answers:


5

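You basically need to find the intersection of the days worked by each emp_id on each task with all the days of the week, and then count the distinct days: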

with days_num as (
  SELECT *
  FROM (
    VALUES ('monday', 1), ('tuesday', 2), ('wednesday', 3), ('thursday', 4), ('friday', 5)
  ) AS d (day, day_no)
),
emp_day_nums as (
  select emp_id, d1.day_no AS start_day_no, d2.day_no AS end_day_no
  from times t
  join days_num d1 on d1.day = t.start_day
  join days_num d2 on d2.day = t.end_day
)
select emp_id, count(distinct d.day_no) AS distinct_days
from emp_day_nums e
join days_num d on d.day_no between e.start_day_no and e.end_day_no
group by emp_id

Output:

emp_id  distinct_days
1       4
2       5

Demo on SQLFiddle
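
As a quick local check of the join-with-BETWEEN idea, the following sketch runs an equivalent query against the question's sample rows with Python's built-in sqlite3 module. SQLite is an assumption made here for convenience (the thread targets SQL Server), and the query is condensed into a single CTE, so treat it as an illustration rather than the answer's exact statement.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE times (emp_id INT, task_id INT, start_day TEXT, end_day TEXT)"
)
conn.executemany(
    "INSERT INTO times VALUES (?, ?, ?, ?)",
    [
        (1, 1, "monday", "wednesday"),
        (1, 2, "monday", "tuesday"),
        (1, 3, "friday", "friday"),
        (2, 1, "monday", "friday"),
        (2, 1, "tuesday", "wednesday"),
    ],
)

# Map day names to numbers, expand each task to every weekday it covers,
# then count the distinct weekdays per employee.
QUERY = """
WITH days_num(day, day_no) AS (
  VALUES ('monday', 1), ('tuesday', 2), ('wednesday', 3),
         ('thursday', 4), ('friday', 5)
)
SELECT t.emp_id, COUNT(DISTINCT d.day_no) AS distinct_days
FROM times t
JOIN days_num d1 ON d1.day = t.start_day
JOIN days_num d2 ON d2.day = t.end_day
JOIN days_num d  ON d.day_no BETWEEN d1.day_no AND d2.day_no
GROUP BY t.emp_id
ORDER BY t.emp_id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 4), (2, 5)]
```

Because the final join produces one row per covered weekday, COUNT(DISTINCT ...) is unaffected by overlapping or duplicated ranges.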


I hadn't seen your answer when I wrote mine. Now I see that I was making things unnecessarily complicated. I like your solution.
Thorsten Kettner

2
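@ThorstenKettner Yes, I initially started down the recursive CTE path myself, but then realized that using a join with a between condition achieves the same result more easily...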
Nick

6

One possible way to simplify the statement in the question (fiddle) is to use a VALUES table value constructor and appropriate joins:

SELECT 
   t.emp_id,
   SUM(CASE 
      WHEN d1.day_no = d2.day_no THEN 1
      ELSE d2.day_no - d1.day_no
   END) AS no_of_days
FROM times t
JOIN (VALUES ('monday', 1), ('tuesday', 2), ('wednesday', 3), ('thursday', 4), ('friday', 5)) d1 (day, day_no) 
   ON t.start_day = d1.day
JOIN (VALUES ('monday', 1), ('tuesday', 2), ('wednesday', 3), ('thursday', 4), ('friday', 5)) d2 (day, day_no) 
   ON t.end_day = d2.day
GROUP BY t.emp_id

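However, if you want to count the distinct days, the statement is different. You need to find all the days in the range between start_day and end_day and count the distinct ones: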

;WITH daysCTE (day, day_no) AS (
   SELECT 'monday', 1 UNION ALL
   SELECT 'tuesday', 2 UNION ALL
   SELECT 'wednesday', 3 UNION ALL
   SELECT 'thursday', 4 UNION ALL
   SELECT 'friday', 5 
)
SELECT t.emp_id, COUNT(DISTINCT d3.day_no)
FROM times t
JOIN daysCTE d1 ON t.start_day = d1.day
JOIN daysCTE d2 ON t.end_day = d2.day
JOIN daysCTE d3 ON d3.day_no BETWEEN d1.day_no AND d2.day_no
GROUP BY t.emp_id

This query (like the OP's original query) doesn't work. If you change 1 2 'monday' 'tuesday' to 1 2 'monday' 'wednesday', the result should still be 4 days, but it returns 5.
Nick

@Nick, sorry, I don't understand. Based on the OP's explanation, there are 2 days between monday and wednesday. Am I missing something?
Zhorov

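Change the input data as I described and your query returns 5. However, the answer should still be 4, because there are still only 4 unique working days.
Nick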

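@Nick, now I see your point. But if I change the values in the OP's fiddle, the result is 5, not 4. This answer only suggests a simpler statement. Thanks.
Zhorov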

The OP's query is also wrong. The correct answer for that data is 4, because there are only 4 distinct days.
Nick

2

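Your query is incorrect. Try Monday to Tuesday together with Wednesday to Thursday. This should result in 4 days, but your query returns 2. Your query doesn't even detect whether two ranges are adjacent, overlapping, or neither.

One way to solve this is to write a recursive CTE that gets all the days from each range, and then count the distinct days.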

with weekdays (day_name, day_number) as
(
  select * from (values ('monday', 1), ('tuesday', 2), ('wednesday', 3),
                        ('thursday', 4), ('friday', 5)) as t(x,y)
)
, emp_days(emp_id, day, last_day)
as
(
  select emp_id, wds.day_number, wde.day_number
  from times t
  join weekdays wds on wds.day_name = t.start_day
  join weekdays wde on wde.day_name = t.end_day
  union all
  select emp_id, day + 1, last_day
  from emp_days
  where day < last_day
)
select emp_id, count(distinct day)
from emp_days
group by emp_id
order by emp_id;

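Demo: http://sqlfiddle.com/#!18/4a5ac/16

(As you can see, I wasn't able to apply the values constructor directly, as in with weekdays (day_name, day_number) as (values ('monday', 1), ...). I don't know why. Is that SQL Server or is it me? Well, with the additional select it works :-)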
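
To try the recursive expansion on the adjacent-ranges example mentioned at the top of this answer (Monday to Tuesday plus Wednesday to Thursday), here is a small sketch using Python's built-in sqlite3 module. It assumes SQLite rather than SQL Server, so the dialect differs slightly: the CTE list is introduced with WITH RECURSIVE, and SQLite accepts VALUES directly as a CTE body. The single-employee test rows are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE times (emp_id INT, task_id INT, start_day TEXT, end_day TEXT)"
)
# Two adjacent, non-overlapping ranges for one employee (illustrative data).
conn.executemany(
    "INSERT INTO times VALUES (?, ?, ?, ?)",
    [(1, 1, "monday", "tuesday"), (1, 2, "wednesday", "thursday")],
)

# The anchor member emits the first day of each range; the recursive member
# keeps adding the next day until the last day of the range is reached.
QUERY = """
WITH RECURSIVE
weekdays(day_name, day_number) AS (
  VALUES ('monday', 1), ('tuesday', 2), ('wednesday', 3),
         ('thursday', 4), ('friday', 5)
),
emp_days(emp_id, day, last_day) AS (
  SELECT t.emp_id, wds.day_number, wde.day_number
  FROM times t
  JOIN weekdays wds ON wds.day_name = t.start_day
  JOIN weekdays wde ON wde.day_name = t.end_day
  UNION ALL
  SELECT emp_id, day + 1, last_day
  FROM emp_days
  WHERE day < last_day
)
SELECT emp_id, COUNT(DISTINCT day)
FROM emp_days
GROUP BY emp_id
ORDER BY emp_id
"""

result = conn.execute(QUERY).fetchall()
print(result)  # [(1, 4)]
```

Summing the per-task differences for the same rows would give 1 + 1 = 2, which is the discrepancy this answer points out.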


2
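with cte as 
(Select emp_id, start_day as day
   from times
   group by emp_id, start_day
 union 
 Select emp_id, end_day as day
   from times
   group by emp_id, end_day
)

-- counts only the distinct start and end days; days strictly inside a range are not included
select emp_id, count(day)
from cte
group by emp_id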

3
Code-only answers can almost always be improved by adding some explanation of how and why they work.
Jason Aller

1
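Welcome to Stack Overflow! While this code may solve the question, including an explanation of how and why it solves the problem would really help to improve the quality of your post, and would probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply. From Review
double-beep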

1
declare @times table
(
  emp_id int,
  task_id int,
  start_day varchar(50),
  end_day varchar(50)
);

insert into @times(emp_id, task_id, start_day, end_day)
values
(1, 1, 'monday', 'wednesday'),
(1, 2, 'monday', 'tuesday'),
(1, 3, 'friday', 'friday'),
--
(2, 1, 'monday', 'friday'),
(2, 2, 'tuesday', 'wednesday'),
--
(3, 1, 'monday', 'wednesday'),
(3, 2, 'monday', 'tuesday'),
(3, 3, 'monday', 'tuesday');

--for sql 2019, APPROX_COUNT_DISTINCT() eliminates distinct sort (!!)...
-- ...with a clustered index on emp_id (to eliminate the hashed aggregation) the query cost gets 5 times cheaper ("overlooking" the increase in memory) !!??!!
/*
select t.emp_id, APPROX_COUNT_DISTINCT(v.val) as distinctweekdays
from
(
select *, .........
*/


select t.emp_id, count(distinct v.val) as distinctweekdays
from
(
select *, 
case start_day when 'monday' then 1
      when 'tuesday' then 2
      when 'wednesday' then 3
      when 'thursday' then 4
      when 'friday' then 5
    end as start_day_num,
case end_day when 'monday' then 1
      when 'tuesday' then 2
      when 'wednesday' then 3
      when 'thursday' then 4
      when 'friday' then 5
    end as end_day_num
from @times
) as t
join (values(1),(2), (3), (4), (5)) v(val) on v.val between t.start_day_num and t.end_day_num
group by t.emp_id;

1
Could you add a description of the code and explain how it works?
Suraj Kumar
Licensed under cc by-sa 3.0 with attribution required.