Find "n" consecutive free numbers from a table


16

I have a table with numbers like this (status is either FREE or ASSIGNED):

id_set  number  status
-----------------------
1       000001  ASSIGNED
1       000002  FREE
1       000003  ASSIGNED
1       000004  FREE
1       000005  FREE
1       000006  ASSIGNED
1       000007  ASSIGNED
1       000008  FREE
1       000009  FREE
1       000010  FREE
1       000011  ASSIGNED
1       000012  ASSIGNED
1       000013  ASSIGNED
1       000014  FREE
1       000015  ASSIGNED

and I need to find "n" consecutive numbers, so for n = 3 the query would return

1       000008  FREE
1       000009  FREE
1       000010  FREE

It should return only the first possible group for each id_set (in fact, each query will be executed for only one id_set).

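I was looking at WINDOW functions and tried queries like COUNT(id_number) OVER (PARTITION BY id_set ROWS UNBOUNDED PRECEDING), but that's as far as I got :) I can't work out the logic for doing this in Postgres.

I was thinking about using WINDOW functions to create a virtual column that counts the preceding rows for every number where status = 'FREE', and then selecting the first number where that count equals my "n".

Or maybe grouping the numbers by status, but only from one ASSIGNED row to the next ASSIGNED row, and selecting only the groups that contain at least "n" numbers.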

EDIT

I found this query (and changed it a little):

WITH q AS
(
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY id_set, status ORDER BY number) AS rnd,
         ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS rn
  FROM numbers
)
SELECT id_set,
       MIN(number) AS first_number,
       MAX(number) AS last_number,
       status,
       COUNT(number) AS numbers_count
FROM q
GROUP BY id_set,
         rnd - rn,
         status
ORDER BY
     first_number

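It produces groups of FREE/ASSIGNED numbers, but I would like to get all the numbers from only the first group that meets the condition.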
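To make the rnd - rn trick in that query easier to follow, here is a rough Python sketch of the same grouping. The function name status_groups and the integer sample rows are illustrative assumptions, not part of the original query; it only mirrors the logic for a single id_set.

```python
from collections import defaultdict

# Sample data from the question as (number, status), for a single id_set.
ROWS = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]

def status_groups(rows):
    """Return (first_number, last_number, status, count) per run of equal status."""
    per_status = defaultdict(int)   # running row number within each status (rnd)
    groups = {}
    for rn, (num, status) in enumerate(sorted(rows), start=1):
        per_status[status] += 1
        rnd = per_status[status]
        # rnd - rn stays constant while the status does not change,
        # so (rnd - rn, status) identifies one run, like the GROUP BY.
        groups.setdefault((rnd - rn, status), []).append(num)
    result = [(min(g), max(g), status, len(g))
              for (_diff, status), g in groups.items()]
    return sorted(result)           # ORDER BY first_number

groups = status_groups(ROWS)
# First FREE group with at least 3 members.
first_match = next(g for g in groups if g[2] == "FREE" and g[3] >= 3)
```

With the sample rows, first_match describes the run from 8 to 10, which is the group the question wants expanded into individual rows.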

SQL Fiddle

Answers:


16

This is a gaps-and-islands problem. Assuming there are no gaps or duplicates within the same id_set:

WITH partitioned AS (
  SELECT
    *,
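    -- ORDER BY makes the row numbering deterministic; without it the order is unspecified
    number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp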
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT
    *,
    COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
)
SELECT
  id_set,
  number
FROM counted
WHERE cnt >= 3
;

Here's a SQL Fiddle demo* link for this query: http://sqlfiddle.com/#!1/a2633/1

UPDATE

To return only one set, you could add one more round of ranking:

WITH partitioned AS (
  SELECT
    *,
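    -- ORDER BY makes the row numbering deterministic; without it the order is unspecified
    number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp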
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT
    *,
    COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
),
ranked AS (
  SELECT
    *,
    RANK() OVER (ORDER BY id_set, grp) AS rnk
  FROM counted
  WHERE cnt >= 3
)
SELECT
  id_set,
  number
FROM ranked
WHERE rnk = 1
;

Here's a demo for this one too: http://sqlfiddle.com/#!1/a2633/2

If you ever need one set per id_set, change the RANK() call like this:

RANK() OVER (PARTITION BY id_set ORDER BY grp) AS rnk

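Also, you could make the query return the smallest matching set (i.e. first try to return the first set of exactly three consecutive numbers if one exists, otherwise four, five, and so on), like this: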

RANK() OVER (ORDER BY cnt, id_set, grp) AS rnk

or like this (one per id_set):

RANK() OVER (PARTITION BY id_set ORDER BY cnt, grp) AS rnk
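As a rough illustration of the difference between ranking by grp and ranking by cnt, grp, here is a small Python sketch for a single id_set. The helper names free_runs and pick_run and the sample list are made up for this example; they only mirror the ordering logic, not the SQL itself.

```python
def free_runs(numbers):
    """Split FREE numbers into runs of consecutive values."""
    runs = []
    for num in sorted(numbers):
        if runs and num == runs[-1][-1] + 1:
            runs[-1].append(num)    # continues the current run
        else:
            runs.append([num])      # starts a new run
    return runs

def pick_run(numbers, n, smallest=False):
    """Return one run of at least n consecutive FREE numbers, or []."""
    candidates = [r for r in free_runs(numbers) if len(r) >= n]  # WHERE cnt >= n
    if not candidates:
        return []
    if smallest:
        # Like ORDER BY cnt, grp: shortest qualifying run, earliest on ties.
        return min(candidates, key=lambda r: (len(r), r[0]))
    # Like ORDER BY grp: earliest qualifying run.
    return candidates[0]

free = [1, 2, 3, 4, 6, 7, 8, 10]     # hypothetical FREE numbers
earliest = pick_run(free, 3)          # run starting at 1 (length 4)
tightest = pick_run(free, 3, True)    # run starting at 6 (length 3)
```

Ranking by grp alone takes the earliest run even if it is longer than needed, while putting cnt first prefers the tightest fit.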

* The SQL Fiddle demos linked in this answer use the 9.1.8 instance, as the 9.2.1 one doesn't appear to be working at the moment.


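Thanks a lot, this looks nice, but can it be changed so that only the first group of numbers is returned? If I change it to cnt >= 2, I get 5 numbers (2 groups = 2 + 3 numbers).
boobiq, 2013

@boobiq: Do you want one per id_set, or just one overall? Please update your question if that was the requirement from the start. (That way others can see the full requirements and offer suggestions or update their answers.)
Andriy M

I edited my question (after the desired result); it will be executed for only one id_set, so only the first possible group needs to be found.
boobiq, 2013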

10

A simple and fast variant:

SELECT min(number) AS first_number, count(*) AS ct_free
FROM (
    SELECT *, number - row_number() OVER (PARTITION BY id_set ORDER BY number) AS grp
    FROM   tbl
    WHERE  status = 'FREE'
    ) x
GROUP  BY grp
HAVING count(*) >= 3  -- minimum length of sequence only goes here
ORDER  BY grp
LIMIT  1;
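  • Requires a gapless sequence of numbers in number (as provided in the question).

  • Works for any number of possible values in status besides 'FREE', even with NULL.

  • The key step is to subtract row_number() from number after eliminating non-qualifying rows. Consecutive numbers end up in the same grp, and grp is also guaranteed to be in ascending order.

  • Then you can GROUP BY grp and count the members. Since you seem to want the first occurrence, ORDER BY grp LIMIT 1 gives you the starting position and the length of the sequence (which can be >= n).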
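The steps in the list above can be sketched outside the database as well. The following Python snippet is only an illustration of the number - row_number() idea for a single id_set; the function name first_free_run and the integer sample rows are assumptions made for this example.

```python
from itertools import groupby

# Sample data from the question as (number, status), for a single id_set.
ROWS = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]

def first_free_run(rows, n):
    """Return (first_number, length) of the first run of >= n FREE numbers, or None."""
    # WHERE status = 'FREE', ordered by number
    free = sorted(num for num, status in rows if status == "FREE")
    # grp = number - row_number(): constant within a run of consecutive numbers
    keyed = [(num - rn, num) for rn, num in enumerate(free, start=1)]
    # GROUP BY grp HAVING count(*) >= n ORDER BY grp LIMIT 1
    # (grp never decreases, so groupby sees each group as one contiguous block)
    for _grp, members in groupby(keyed, key=lambda pair: pair[0]):
        run = [num for _, num in members]
        if len(run) >= n:
            return run[0], len(run)
    return None

result = first_free_run(ROWS, 3)
```

For the sample rows the FREE numbers 2, 4, 5, 8, 9, 10, 14 get the grp values 1, 2, 2, 4, 4, 4, 7, so the first group with at least three members starts at 8.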

Set of rows

To get an actual set of numbers, don't look up the table a second time. It is much cheaper with generate_series():

SELECT generate_series(first_number, first_number + ct_free - 1)
    -- generate_series(first_number, first_number + 3 - 1) -- only 3
FROM  (
   SELECT min(number) AS first_number, count(*) AS ct_free
   FROM  (
      SELECT *, number - row_number() OVER (PARTITION BY id_set ORDER BY number) AS grp
      FROM   tbl
      WHERE  status = 'FREE'
      ) x
   GROUP  BY grp
   HAVING count(*) >= 3
   ORDER  BY grp
   LIMIT  1
   ) y;

If you actually want strings with leading zeros, as displayed in your example values, use to_char() with the FM (fill mode) modifier:

SELECT to_char(generate_series(8, 11), 'FM000000')
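For comparison, this is roughly what the generate_series() plus to_char() combination does, written as a Python sketch. The function name expand_run and its width parameter are assumptions for illustration only.

```python
def expand_run(first_number, length, width=6):
    """Expand a (start, length) pair into zero-padded number strings."""
    # range(...) plays the role of
    # generate_series(first_number, first_number + length - 1);
    # the format spec pads with leading zeros, similar to 'FM000000'.
    return [f"{num:0{width}d}"
            for num in range(first_number, first_number + length)]

numbers = expand_run(8, 3)
```

The point is the same as in the SQL: once the start and length of the run are known, the individual numbers can be generated without reading the table again.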

SQL Fiddle with an extended test case and both queries.



8

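Here's a fairly generic way to do it.

Keep in mind that it depends on your number column being sequential. If it isn't, a window function and/or CTE type solution will probably be needed: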

SELECT 
    number
FROM
    mytable m
CROSS JOIN
   (SELECT 3 AS consec) x
WHERE 
    EXISTS
       (SELECT 1 
        FROM mytable
        WHERE number = m.number - x.consec + 1
        AND status = 'FREE')
    AND NOT EXISTS
       (SELECT 1 
        FROM mytable
        WHERE number BETWEEN m.number - x.consec + 1 AND m.number
        AND status = 'ASSIGNED')
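A rough Python rendering of the EXISTS / NOT EXISTS logic may help when checking the arithmetic. The function name window_ends and the integer sample rows are assumptions for this sketch; like the query, it reports the last number of every qualifying window and relies on the numbers being sequential.

```python
# Sample data from the question as (number, status), for a single id_set.
ROWS = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]

def window_ends(rows, n):
    """Return every number m that ends a window of n consecutive FREE numbers."""
    status_by_number = dict(rows)
    ends = []
    for m in sorted(status_by_number):
        start = m - n + 1
        # EXISTS: a FREE row with number = m - n + 1
        if status_by_number.get(start) != "FREE":
            continue
        # NOT EXISTS: an ASSIGNED row with number BETWEEN start AND m
        if any(status_by_number.get(k) == "ASSIGNED"
               for k in range(start, m + 1)):
            continue
        ends.append(m)
    return ends

ends_for_3 = window_ends(ROWS, 3)
ends_for_2 = window_ends(ROWS, 2)
```

For n = 3 only the window 8 to 10 qualifies, so the single result is its last number, 10.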

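This statement will not work in Postgres.
a_horse_with_no_name

@a_horse_with_no_name Please feel free to fix it, then :)
JNK

No window functions, very nice! Although I think it should be M.number-consec+1 (e.g. for 10 it would be 10-3+1=8).
Andriy M

@AndriyM Well, it's not so much nice as fragile, since it relies on sequential values in the number field. Good call on the math, I'll correct it.
JNK

2
I took the liberty of fixing the syntax for Postgres. The first EXISTS could be simplified: since we only need to make sure that any n previous rows exist, we can drop the AND status = 'FREE'. And I would change the condition in the second EXISTS to status <> 'FREE' to harden it against options added in the future.
Erwin Brandstetter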

5

This will return only the first of the 3 numbers. It does not require the values of number to be consecutive. Tested at SQL-Fiddle:

WITH cte3 AS
( SELECT
    *,
    COUNT(CASE WHEN status = 'FREE' THEN 1 END) 
        OVER (PARTITION BY id_set ORDER BY number
              ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
      AS cnt
  FROM atable
)
SELECT
  id_set, number
FROM cte3
WHERE cnt = 3 ;

This will show all the numbers (wherever there are 3 or more consecutive 'FREE' positions):

WITH cte3 AS
( SELECT
    *,
    COUNT(CASE WHEN status = 'FREE' THEN 1 END) 
        OVER (PARTITION BY id_set ORDER BY number
              ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
      AS cnt
  FROM atable
)
, cte4 AS
( SELECT
    *, 
    MAX(cnt) 
        OVER (PARTITION BY id_set ORDER BY number
              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
      AS maxcnt
  FROM cte3
)
SELECT
  id_set, number
FROM cte4
WHERE maxcnt >= 3 ;
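The two window-frame queries can be pictured as a sliding window over the rows ordered by number. This Python sketch is only an illustration for a single id_set; the function names and the gapped sample data are assumptions made for the example.

```python
def window_starts(rows, n):
    """First number of every window of n consecutive rows that are all FREE."""
    ordered = sorted(rows)
    starts = []
    for i, (num, _status) in enumerate(ordered):
        window = ordered[i:i + n]    # CURRENT ROW AND n - 1 FOLLOWING
        cnt = sum(1 for _, status in window if status == "FREE")
        if cnt == n:                 # WHERE cnt = n
            starts.append(num)
    return starts

def numbers_in_free_windows(rows, n):
    """All numbers that belong to at least one all-FREE window of n rows."""
    ordered = sorted(rows)
    covered = set()
    for i in range(len(ordered) - n + 1):
        window = ordered[i:i + n]
        if all(status == "FREE" for _, status in window):
            covered.update(num for num, _ in window)
    return sorted(covered)

# Hypothetical rows with gaps in the numbering: position matters, not value.
GAPPED = [(10, "FREE"), (20, "FREE"), (35, "FREE"), (40, "ASSIGNED"),
          (41, "FREE"), (50, "FREE")]
starts = window_starts(GAPPED, 3)
members = numbers_in_free_windows(GAPPED, 2)
```

Because the window is defined by row position rather than by arithmetic on number, gaps in the numbering do not matter here.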

0
select r1.number from some_table r1, 
some_table r2,
some_table r3,
some_table r4 
where r3.number <= r2.number 
and r3.number >= r1.number 
and r3.status = 'FREE' 
and r2.number = r1.number + 4 
and r4.number <= r2.number 
and r4.number >= r1.number 
and r4.status = 'ASSIGNED'
group by r1.number, r2.number having count(r3.number) = 5 and count(r4.number) = 0 order by r1.number asc limit 1 ;

In this case there are 5 consecutive numbers, hence the difference must be 4, or in other words count(r3.number) = n and r2.number = r1.number + n - 1.

With joins:

select r1.number 
from some_table r1 join 
 some_table r2 on (r2.number = r1.number + :n -1) join
 some_table r3 on (r3.number <= r2.number and r3.number >= r1.number) join
 some_table r4 on (r4.number <= r2.number and r4.number >= r1.number)
where  
 r3.status = 'FREE' and
 r4.status = 'ASSIGNED'
group by r1.number, r2.number having count(r3.number) = :n and count(r4.number) = 0 order by r1.number asc limit 1 ;

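Do you think a four-way Cartesian product is an efficient way to do this?
JNK

Also, can you write it in modern JOIN syntax?
JNK

Well, I didn't want to rely on window functions, so I gave a solution that works with any SQL database.
Ununoctium, 2013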

-1
CREATE TABLE #ConsecFreeNums
(
     id_set BIGINT
    ,number VARCHAR(10)
    ,status VARCHAR(10)
)

CREATE TABLE #ConsecFreeNumsResult
(
     Seq    INT
    ,id_set BIGINT
    ,number VARCHAR(10)
    ,status VARCHAR(10)
)

INSERT #ConsecFreeNums
SELECT 1, '000002', 'FREE' UNION
SELECT 1, '000003', 'ASSIGNED' UNION
SELECT 1, '000004', 'FREE' UNION
SELECT 1, '000005', 'FREE' UNION
SELECT 1, '000006', 'ASSIGNED' UNION
SELECT 1, '000007', 'ASSIGNED' UNION
SELECT 1, '000008', 'FREE' UNION
SELECT 1, '000009', 'FREE' UNION
SELECT 1, '000010', 'FREE' UNION
SELECT 1, '000011', 'ASSIGNED' UNION
SELECT 1, '000012', 'ASSIGNED' UNION
SELECT 1, '000013', 'ASSIGNED' UNION
SELECT 1, '000014', 'FREE' UNION
SELECT 1, '000015', 'ASSIGNED'

DECLARE @id_set AS BIGINT, @number VARCHAR(10), @status VARCHAR(10), @number_count INT, @number_count_check INT

DECLARE ConsecFreeNumsCursor CURSOR FAST_FORWARD FOR
SELECT
       id_set
      ,number
      ,status
 FROM
      #ConsecFreeNums
WHERE id_set = 1
ORDER BY number

OPEN ConsecFreeNumsCursor

FETCH NEXT FROM ConsecFreeNumsCursor INTO @id_set, @number, @status

SET @number_count_check = 3
SET @number_count = 0

WHILE @@FETCH_STATUS = 0
BEGIN
    IF @status = 'ASSIGNED'
    BEGIN
        IF @number_count = @number_count_check
        BEGIN
            SELECT 'Results'
            SELECT * FROM #ConsecFreeNumsResult ORDER BY number
            BREAK
        END
        SET @number_count = 0
        TRUNCATE TABLE #ConsecFreeNumsResult
    END
    ELSE
    BEGIN
        SET @number_count = @number_count + 1
        INSERT #ConsecFreeNumsResult SELECT @number_count, @id_set, @number, @status
    END
    FETCH NEXT FROM ConsecFreeNumsCursor INTO @id_set, @number, @status
END

CLOSE ConsecFreeNumsCursor
DEALLOCATE ConsecFreeNumsCursor

DROP TABLE #ConsecFreeNums
DROP TABLE #ConsecFreeNumsResult
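The row-by-row idea behind the cursor can be sketched in a few lines of Python. This is not a translation of the T-SQL above: as an assumption of this sketch, it stops as soon as n FREE rows in a row have been seen, instead of waiting for the next ASSIGNED row. The function name first_run_single_pass and the integer sample rows are illustrative.

```python
# Sample data as (number, status), for a single id_set.
ROWS = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]

def first_run_single_pass(rows, n):
    """Scan rows in number order and return the first n FREE rows in a row."""
    run = []
    for num, status in sorted(rows):
        if status == "FREE":
            run.append(num)
            if len(run) == n:
                return run           # stop early, like BREAK in the loop
        else:
            run = []                 # a non-FREE row resets the run
    return []

found = first_run_single_pass(ROWS, 3)
```

Like the cursor, this treats rows as consecutive by position in the ordered result, so it does not check for gaps in number.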

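I am using a cursor for better performance, in case the SELECT returns a large number of rows.
Ravi Ramaswamy, 2013

I reformatted your answer by highlighting the code and pressing the { } button in the editor. Enjoy!
jcolebrand

You may also want to edit your answer and explain why you think a cursor gives better performance.
jcolebrand

A cursor is a sequential process. It is almost like reading a flat file one record at a time. In one case, I replaced MEM TEMP tables with a single cursor. This brought the processing time down from 26 hours to 6 hours. I had to use a WHILE loop to iterate over the result set.
Ravi Ramaswamy

Did you ever run a test to verify your assumption? You might be surprised. Except for corner cases, plain SQL is fastest.
Erwin Brandstetter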
Licensed under cc by-sa 3.0 with attribution required.