How do I find a "gap" in a running counter with SQL?


106

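I'd like to find the first "gap" in a counter column of an SQL table. For example, if the values 1, 2, 4 and 5 are present, I'd like to find 3.

I could of course fetch the values in order and walk through them manually, but I'd like to know whether there is a way to do it in SQL.

In addition, it should be quite standard SQL that works with different DBMSes.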


In SQL Server 2012 and later, you can use the LAG(id, 1, null) function with an OVER (ORDER BY id) clause.
ajeh

Answers:


184

MySQL and PostgreSQL

SELECT  id + 1
FROM    mytable mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    mytable mi 
        WHERE   mi.id = mo.id + 1
        )
ORDER BY
        id
LIMIT 1

SQL Server

SELECT  TOP 1
        id + 1
FROM    mytable mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    mytable mi 
        WHERE   mi.id = mo.id + 1
        )
ORDER BY
        id

Oracle

SELECT  *
FROM    (
        SELECT  id + 1 AS gap
        FROM    mytable mo
        WHERE   NOT EXISTS
                (
                SELECT  NULL
                FROM    mytable mi 
                WHERE   mi.id = mo.id + 1
                )
        ORDER BY
                id
        )
WHERE   rownum = 1

ANSI (works everywhere, least efficient):

SELECT  MIN(id) + 1
FROM    mytable mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    mytable mi 
        WHERE   mi.id = mo.id + 1
        )
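
A quick way to try the ANSI form is to run it against an in-memory SQLite database. This is only a sketch: the helper name first_gap and the sample values are made up for illustration, while mytable and id follow the queries above.

```python
import sqlite3

def first_gap(ids):
    """Return the first missing value after the smallest id (None for an empty table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO mytable (id) VALUES (?)", [(i,) for i in ids])
    # ANSI form: the smallest id whose successor is absent, plus one.
    row = conn.execute(
        """
        SELECT MIN(id) + 1
        FROM   mytable mo
        WHERE  NOT EXISTS (SELECT NULL FROM mytable mi WHERE mi.id = mo.id + 1)
        """
    ).fetchone()
    conn.close()
    return row[0]

print(first_gap([1, 2, 4, 5]))  # 3
print(first_gap([3, 4, 5]))     # 6: a gap before the smallest id is not detected
print(first_gap([]))            # None (SQL NULL)
```

The last two calls show the edge cases raised in the comments: missing leading values are not reported, and an empty table yields NULL.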

Systems supporting sliding window functions:

SELECT  -- TOP 1
        -- Uncomment above for SQL Server 2012+
        previd + 1 AS gap
FROM    (
        SELECT  id,
                LAG(id) OVER (ORDER BY id) previd
        FROM    mytable
        ) q
WHERE   previd <> id - 1
ORDER BY
        id
-- LIMIT 1
-- Uncomment above for PostgreSQL
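
The window-function variant can be tried the same way. The sketch below assumes an SQLite build with window function support (3.25 or newer); the helper name first_gap_lag is hypothetical. It selects previd + 1 so that the result is the first missing value rather than the last value before the gap.

```python
import sqlite3

def first_gap_lag(ids):
    """First missing value, found by comparing each id with its predecessor."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO mytable (id) VALUES (?)", [(i,) for i in ids])
    row = conn.execute(
        """
        SELECT previd + 1 AS gap
        FROM   (SELECT id, LAG(id) OVER (ORDER BY id) AS previd FROM mytable) q
        WHERE  previd <> id - 1   -- false for the first row, where previd is NULL
        ORDER BY id
        LIMIT 1
        """
    ).fetchone()
    conn.close()
    return row[0] if row else None

print(first_gap_lag([1, 2, 4, 5]))    # 3
print(first_gap_lag([1, 2, 11, 12]))  # 3
print(first_gap_lag([1, 2, 3]))       # None: no gap between existing ids
```

Unlike the NOT EXISTS forms, this one returns nothing when the sequence is contiguous, instead of reporting the value after the maximum.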

39
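@vulkanino: please ask them to preserve the indentation. Also please note that the Creative Commons license obliges you to tattoo my nickname and the question URL as well, though I think the URL may be QR-encoded.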
Quassnoi

4
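This is great, but if I had [1, 2, 11, 12], it would find only 3. What I would love it to find is 3 to 10 instead, basically the beginning and the end of every gap. I understand that I might have to write my own Python script that leverages SQL (MySQL in my case), but it would be nice if SQL could get me closer to what I want (I have a table with 2 million rows that has gaps, so I will need to slice it into smaller pieces and run some SQL on each). I suppose I could run one query to find the beginnings of the gaps, then another one to find the ends, and then "merge sort" the two sequences.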
Hamish Grubijan

1
@HamishGrubijan: please post it as another question.
Quassnoi

2
@Malkocoglu: you will get NULL, not 0, if the table is empty. This is true for all databases.
Quassnoi, 2014

5
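This will not find initial gaps properly. If you have 3, 4, 5, 6, 8, this code will report 7, because there is no 1 to check against. So if the starting numbers may be missing, you will have to check for that separately.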
ttomsen

12

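All of the answers work fine if your first value is id = 1; otherwise the leading gap will not be detected. For example, if your table has the id values 3, 4, 5, the queries will return 6.

I did something like this: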

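SELECT MIN(ID + 1) FROM (
    -- sentinel row 0, so that a missing 1 is reported as the first gap
    SELECT 0 AS ID UNION ALL
    -- every existing ID is a candidate "last value before a gap"
    SELECT
        ID
    FROM
        TableX) AS T1
WHERE
    ID + 1 NOT IN (SELECT ID FROM TableX)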
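
To check the leading-gap behaviour, here is a small sketch run against in-memory SQLite. It is a variation under stated assumptions rather than a definitive version: the 0 sentinel is unioned with every existing ID (not just one aggregate value), the IDs are assumed to be non-NULL positive integers, and the helper name first_gap_from_one is made up.

```python
import sqlite3

def first_gap_from_one(ids):
    """First missing positive integer, counting a missing 1 as a gap."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableX (ID INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO TableX (ID) VALUES (?)", [(i,) for i in ids])
    row = conn.execute(
        """
        SELECT MIN(ID + 1)
        FROM   (SELECT 0 AS ID UNION ALL SELECT ID FROM TableX) AS T1
        WHERE  ID + 1 NOT IN (SELECT ID FROM TableX)
        """
    ).fetchone()
    conn.close()
    return row[0]

print(first_gap_from_one([3, 4, 5]))     # 1
print(first_gap_from_one([1, 2, 4, 5]))  # 3
print(first_gap_from_one([]))            # 1
```

Because the sentinel row always exists, the query also returns 1 for an empty table instead of NULL.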

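This will find the first gap. If you have the ids 0, 2, 3, 4, the answer is 1. I was looking for a way to find the largest gap. Say the sequence is 0, 2, 3, 4, 100, 101, 102; I want to find the 4-99 gap.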
Kemin Zhou
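
For the largest gap asked about in the comment above, one possible sketch is to pair each id with its successor and sort the gaps by width. It assumes SQLite 3.25+ for LEAD; the helper name largest_gap is hypothetical. Note that for the sample sequence the missing values are 5 to 99, since 4 itself is present.

```python
import sqlite3

def largest_gap(ids):
    """Return (gap_start, gap_end) of the widest run of missing values, or None."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TableX (ID INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO TableX (ID) VALUES (?)", [(i,) for i in ids])
    row = conn.execute(
        """
        SELECT ID + 1 AS gap_start, next_id - 1 AS gap_end
        FROM   (SELECT ID, LEAD(ID) OVER (ORDER BY ID) AS next_id FROM TableX) q
        WHERE  next_id - ID > 1
        ORDER BY next_id - ID DESC, ID   -- widest gap first, ties by position
        LIMIT 1
        """
    ).fetchone()
    conn.close()
    return row

print(largest_gap([0, 2, 3, 4, 100, 101, 102]))  # (5, 99)
print(largest_gap([1, 2, 3]))                    # None
```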

8

There isn't really a fully standard SQL way to do this, but with some form of limiting clause you can do

SELECT `table`.`num` + 1
FROM `table`
LEFT JOIN `table` AS `alt`
ON `alt`.`num` = `table`.`num` + 1
WHERE `alt`.`num` IS NULL
ORDER BY `table`.`num`
LIMIT 1

(MySQL; for PostgreSQL, replace the backticks with double quotes)

or

SELECT TOP 1 [table].[num] + 1
FROM [table]
LEFT JOIN [table] AS [alt]
ON [alt].[num] = [table].[num] + 1
WHERE [alt].[num] IS NULL
ORDER BY [table].[num]

(SQL Server)

or

SELECT gap
FROM (
    -- Oracle does not accept AS before a table alias
    SELECT t.num + 1 AS gap
    FROM "table" t
    LEFT JOIN "table" alt
    ON alt.num = t.num + 1
    WHERE alt.num IS NULL
    ORDER BY t.num
)
WHERE ROWNUM = 1

(Oracle)


If there is a gap range, the postgres query will return only the first row of that range.
约翰·海格兰

This makes the most sense to me. Using a join also lets you change the TOP value to show more gap results.
AJ_

1
Thanks, this works very well. If you want to see all the points where there is a gap, you can remove the limit.
mekbib.awoke
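
As the comments suggest, dropping the limiting clause lists the start of every gap. A minimal sketch against in-memory SQLite, where the table name counters and the helper name gap_starts are made up for illustration:

```python
import sqlite3

def gap_starts(nums):
    """Return num + 1 for every num whose successor is missing, in order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counters (num INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO counters (num) VALUES (?)", [(n,) for n in nums])
    rows = conn.execute(
        """
        SELECT t.num + 1
        FROM   counters AS t
        LEFT JOIN counters AS alt ON alt.num = t.num + 1
        WHERE  alt.num IS NULL
        ORDER BY t.num
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(gap_starts([1, 2, 4, 5, 9]))  # [3, 6, 10]
```

The last entry (10 here) is not a real gap; it is simply the value after the current maximum, which this anti-join always reports.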

8

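The first thing that came to mind. I'm not sure it's a good idea to go this way at all, but it should work. Suppose the table is t and the column is c: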

SELECT t1.c+1 AS gap FROM t as t1 LEFT OUTER JOIN t as t2 ON (t1.c+1=t2.c) WHERE t2.c IS NULL ORDER BY gap ASC LIMIT 1

Edit: this one may be a tick faster (and shorter!):

SELECT min(t1.c)+1 AS gap FROM t as t1 LEFT OUTER JOIN t as t2 ON (t1.c+1=t2.c) WHERE t2.c IS NULL


LEFT OUTER JOIN t ==> LEFT OUTER JOIN t2
Eamon

1
No, Eamon, LEFT OUTER JOIN t2 would require you to have a table named t2, whereas t2 here is just an alias.
Michael Krelin - hacker

6

This works in SQL Server. I can't test it on other systems, but it looks standard...

SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1))

You could also add a starting point to the where clause:

SELECT MIN(t1.ID)+1 FROM mytable t1 WHERE NOT EXISTS (SELECT ID FROM mytable WHERE ID = (t1.ID + 1)) AND ID > 2000

So if you had 2000, 2001, 2002 and 2005, with 2003 and 2004 missing, it would return 2003.


3

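The following solution:

  • provides test data;
  • includes an inner query that produces the other gaps; and
  • works in SQL Server 2012.

It numbers the ordered rows sequentially in the "with" clause, then reuses the result twice with an inner join on the row number, offset by 1, so that each row is compared with the row after it, looking for IDs with a gap greater than 1. This is more than was asked for, but it is also more widely applicable.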

create table #ID ( id integer );

insert into #ID values (1),(2),    (4),(5),(6),(7),(8),    (12),(13),(14),(15);

with Source as (
    select
         row_number()over ( order by A.id ) as seq
        ,A.id                               as id
    from #ID as A WITH(NOLOCK)
)
Select top 1 gap_start from (
    Select 
         (J.id+1) as gap_start
        ,(K.id-1) as gap_end
    from       Source as J
    inner join Source as K
    on (J.seq+1) = K.seq
    where (J.id - (K.id-1)) <> 0
) as G
order by gap_start

The inner query produces:

gap_start   gap_end
3           3
9           11

The outer query produces:

gap_start
3
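
The same row-number self-join can be reproduced outside SQL Server. The sketch below runs it in in-memory SQLite (3.25+ assumed for ROW_NUMBER); the table name id_list and the helper name gap_ranges are hypothetical, and the sample data matches the test data above.

```python
import sqlite3

def gap_ranges(ids):
    """Return every (gap_start, gap_end) pair between consecutive existing ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE id_list (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO id_list (id) VALUES (?)", [(i,) for i in ids])
    rows = conn.execute(
        """
        WITH Source AS (
            SELECT ROW_NUMBER() OVER (ORDER BY id) AS seq, id FROM id_list
        )
        SELECT J.id + 1 AS gap_start, K.id - 1 AS gap_end
        FROM   Source AS J
        JOIN   Source AS K ON J.seq + 1 = K.seq   -- each row with the next one
        WHERE  K.id - J.id > 1
        ORDER BY gap_start
        """
    ).fetchall()
    conn.close()
    return rows

print(gap_ranges([1, 2, 4, 5, 6, 7, 8, 12, 13, 14, 15]))  # [(3, 3), (9, 11)]
```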

2

Inner join to a view or sequence that has all possible values.

No table? Make a table. I always keep a dummy table around just for this.

create table artificial_range( 
  id int not null primary key auto_increment, 
  name varchar( 20 ) null ) ;

-- or whatever your database requires for an auto increment column

insert into artificial_range( name ) values ( null );
-- create one row.

insert into artificial_range( name ) select name from artificial_range;
-- you now have two rows

insert into artificial_range( name ) select name from artificial_range;
-- you now have four rows

insert into artificial_range( name ) select name from artificial_range;
-- you now have eight rows

--etc.

insert into artificial_range( name ) select name from artificial_range;
-- you now have 1024 rows, with ids 1-1024

Then,

 select a.id from artificial_range a
 where not exists ( select * from your_table b
 where b.counter = a.id) ;
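
Here is a rough sketch of the numbers-table idea in in-memory SQLite. For brevity it fills artificial_range directly from Python instead of using the doubling inserts; the helper name missing_in_range and the upper bound parameter are assumptions for the example.

```python
import sqlite3

def missing_in_range(counters, upper):
    """Return every value in 1..upper that is absent from your_table.counter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE artificial_range (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO artificial_range (id) VALUES (?)",
        [(i,) for i in range(1, upper + 1)],
    )
    conn.execute("CREATE TABLE your_table (counter INTEGER)")
    conn.executemany("INSERT INTO your_table (counter) VALUES (?)", [(c,) for c in counters])
    rows = conn.execute(
        """
        SELECT a.id
        FROM   artificial_range a
        WHERE  NOT EXISTS (SELECT * FROM your_table b WHERE b.counter = a.id)
        ORDER BY a.id
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(missing_in_range([1, 2, 4, 5], 6))  # [3, 6]
```

This reports every missing value up to the size of the range table, including leading ones, at the cost of maintaining that table.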

2

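For PostgreSQL

An example that makes use of a recursive query.

This might be useful if you want to find a gap in a specific range (it will work even if the table is empty, whereas the other examples will not).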

WITH    
    RECURSIVE a(id) AS (VALUES (1) UNION ALL SELECT id + 1 FROM a WHERE id < 100), -- range 1..100  
    b AS (SELECT id FROM my_table) -- your table ID list    
SELECT a.id -- find numbers from the range that do not exist in main table
FROM a
LEFT JOIN b ON b.id = a.id
WHERE b.id IS NULL
-- LIMIT 1 -- uncomment if only the first value is needed
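
SQLite also supports WITH RECURSIVE, so the idea can be tried without a PostgreSQL server. In this sketch the range bounds are passed as parameters; the helper name missing_between is made up, and low is assumed to be no greater than high.

```python
import sqlite3

def missing_between(ids, low, high):
    """Return the values in low..high that do not exist in my_table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO my_table (id) VALUES (?)", [(i,) for i in ids])
    rows = conn.execute(
        """
        WITH RECURSIVE a(id) AS (
            SELECT ?                                   -- start of the range
            UNION ALL
            SELECT id + 1 FROM a WHERE id < ?          -- stop at the upper bound
        )
        SELECT a.id
        FROM   a
        LEFT JOIN my_table b ON b.id = a.id
        WHERE  b.id IS NULL
        ORDER BY a.id
        """,
        (low, high),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(missing_between([1, 2, 4, 5], 1, 6))  # [3, 6]
print(missing_between([], 1, 3))            # [1, 2, 3]: works on an empty table
```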


1

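This one accounts for everything mentioned so far. It includes 0 as a starting point, so it also defaults to 0 if no values exist. I also added the appropriate places for the remaining parts of a multi-value key. This has only been tested on SQL Server.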

select
    MIN(ID)
from (
    select
        0 ID
    union all
    select
        [YourIdColumn]+1
    from
        [YourTable]
    where
        --Filter the rest of your key--
    ) foo
left join
    [YourTable]
    on [YourIdColumn]=ID
    and --Filter the rest of your key--
where
    [YourIdColumn] is null

1

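I wrote up a quick way of doing it. I'm not sure it is the most efficient, but it gets the job done. Note that it does not tell you the gap itself; it tells you the ids just before and just after each gap (keep in mind that a gap can span multiple values, for example 1, 2, 4, 7, 11 and so on).

I'm using sqlite as an example.

If this is your table structure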

create table sequential(id int not null, name varchar(10) null);

and these are your rows

id|name
1|one
2|two
4|four
5|five
9|nine

the query is

select a.* from sequential a left join sequential b on a.id = b.id + 1 where b.id is null and a.id <> (select min(id) from sequential)
union
select a.* from sequential a left join sequential b on a.id = b.id - 1 where b.id is null and a.id <> (select max(id) from sequential);

https://gist.github.com/wkimeria/7787ffe84d1c54216f1b320996b17b7e
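
To see what the boundary rows look like, here is a sketch that runs the two halves of the query separately in in-memory SQLite and pairs them up. Running the halves separately instead of with UNION is my own variation (it keeps the pairing correct when an isolated id is both the end of one gap and the start of the next), and the helper name gap_boundaries is hypothetical.

```python
import sqlite3

def gap_boundaries(ids):
    """Return (id_before_gap, id_after_gap) pairs for every gap."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sequential (id INT NOT NULL, name VARCHAR(10) NULL)")
    conn.executemany("INSERT INTO sequential (id, name) VALUES (?, NULL)", [(i,) for i in ids])
    # Rows with no predecessor (other than the minimum) sit just after a gap.
    after = conn.execute(
        """
        SELECT a.id FROM sequential a
        LEFT JOIN sequential b ON a.id = b.id + 1
        WHERE b.id IS NULL AND a.id <> (SELECT MIN(id) FROM sequential)
        ORDER BY a.id
        """
    ).fetchall()
    # Rows with no successor (other than the maximum) sit just before a gap.
    before = conn.execute(
        """
        SELECT a.id FROM sequential a
        LEFT JOIN sequential b ON a.id = b.id - 1
        WHERE b.id IS NULL AND a.id <> (SELECT MAX(id) FROM sequential)
        ORDER BY a.id
        """
    ).fetchall()
    conn.close()
    return [(b[0], a[0]) for b, a in zip(before, after)]

print(gap_boundaries([1, 2, 4, 5, 9]))  # [(2, 4), (5, 9)]
```

With the sample rows, the pairs say that something is missing between 2 and 4 and between 5 and 9.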


0
select min([ColumnName]) from [TableName]
where [ColumnName]-1 not in (select [ColumnName] from [TableName])
and [ColumnName] <> (select min([ColumnName]) from [TableName])


0

It works for empty tables and with negative values. Just tested in SQL Server 2012.

 select min(n) from (
select  case when lead(i,1,0) over(order by i)>i+1 then i+1 else null end n from MyTable) w

0

If you use Firebird 3, this is the most elegant and simple way:

select RowID
  from (
    select `ID_Column`, Row_Number() over(order by `ID_Column`) as RowID
      from `Your_Table`
        order by `ID_Column`)
    where `ID_Column` <> RowID
    rows 1

0
            -- PUT THE TABLE NAME AND COLUMN NAME BELOW
            -- IN MY EXAMPLE, THE TABLE NAME IS = SHOW_GAPS AND COLUMN NAME IS = ID

            -- PUT THESE TWO VALUES AND EXECUTE THE QUERY

            DECLARE @TABLE_NAME VARCHAR(100) = 'SHOW_GAPS'
            DECLARE @COLUMN_NAME VARCHAR(100) = 'ID'


            DECLARE @SQL VARCHAR(MAX)
            SET @SQL = 
            'SELECT  TOP 1
                    '+@COLUMN_NAME+' + 1
            FROM    '+@TABLE_NAME+' mo
            WHERE   NOT EXISTS
                    (
                    SELECT  NULL
                    FROM    '+@TABLE_NAME+' mi 
                    WHERE   mi.'+@COLUMN_NAME+' = mo.'+@COLUMN_NAME+' + 1
                    )
            ORDER BY
                    '+@COLUMN_NAME

            -- SELECT @SQL

            DECLARE @MISSING_ID TABLE (ID INT)

            INSERT INTO @MISSING_ID
            EXEC (@SQL)

            --select * from @MISSING_ID

            declare @var_for_cursor int
            DECLARE @LOW INT
            DECLARE @HIGH INT
            DECLARE @FINAL_RANGE TABLE (LOWER_MISSING_RANGE INT, HIGHER_MISSING_RANGE INT)
            DECLARE IdentityGapCursor CURSOR FOR   
            select * from @MISSING_ID
            ORDER BY 1;  

            open IdentityGapCursor

            fetch next from IdentityGapCursor
            into @var_for_cursor

            WHILE @@FETCH_STATUS = 0  
            BEGIN
            SET @SQL = '
            DECLARE @LOW INT
            SELECT @LOW = MAX('+@COLUMN_NAME+') + 1 FROM '+@TABLE_NAME
                    +' WHERE '+@COLUMN_NAME+' < ' + cast( @var_for_cursor as VARCHAR(MAX))

            SET @SQL = @sql + '
            DECLARE @HIGH INT
            SELECT @HIGH = MIN('+@COLUMN_NAME+') - 1 FROM '+@TABLE_NAME
                    +' WHERE '+@COLUMN_NAME+' > ' + cast( @var_for_cursor as VARCHAR(MAX))

            SET @SQL = @sql + 'SELECT @LOW,@HIGH'

            INSERT INTO @FINAL_RANGE
             EXEC( @SQL)
            fetch next from IdentityGapCursor
            into @var_for_cursor
            END

            CLOSE IdentityGapCursor;  
            DEALLOCATE IdentityGapCursor;  

            SELECT ROW_NUMBER() OVER(ORDER BY LOWER_MISSING_RANGE) AS 'Gap Number',* FROM @FINAL_RANGE

0

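I found that most approaches run very, very slowly in mysql. Here is my solution for mysql < 8.0. Tested on 1M records with a gap near the end; it took about 1 second to finish. Not sure whether it fits other SQL flavours.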

SELECT cardNumber - diff  -- equals the row number, i.e. the first missing value
FROM
    (SELECT @row_number := 0) as t,
    (
        SELECT (@row_number:=@row_number+1), cardNumber, cardNumber-@row_number AS diff
        FROM cards
        ORDER BY cardNumber
    ) as x
WHERE diff >= 1
LIMIT 0,1
I assume that the sequence starts from '1'.

0

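If your counter starts from 1 and you want to generate the first number of the sequence (1) when the table is empty, here is the corrected piece of code from the first answer, valid for Oracle: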

SELECT
  NVL(MIN(id + 1),1) AS gap
FROM
  mytable mo  
WHERE 1=1
  AND NOT EXISTS
      (
       SELECT  NULL
       FROM    mytable mi 
       WHERE   mi.id = mo.id + 1
      )
  AND EXISTS
     (
       SELECT  NULL
       FROM    mytable mi 
       WHERE   mi.id = 1
     )  

0
DECLARE @Table AS TABLE(
[Value] int
)

INSERT INTO @Table ([Value])
VALUES
 (1),(2),(4),(5),(6),(10),(20),(21),(22),(50),(51),(52),(53),(54),(55)
 --Gaps
 --Start    End     Size
 --3        3       1
 --7        9       3
 --11       19      9
 --23       49      27


SELECT [startTable].[Value]+1 [Start]
     ,[EndTable].[Value]-1 [End]
     ,([EndTable].[Value]-1) - ([startTable].[Value]) Size 
 FROM 
    (
SELECT [Value]
    ,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM @Table
)AS startTable
JOIN 
(
SELECT [Value]
,ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY [Value]) Record
FROM @Table
)AS EndTable
ON [EndTable].Record = [startTable].Record+1
WHERE [startTable].[Value]+1 <>[EndTable].[Value]

0

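If the numbers in the column are positive integers (starting from 1), then here is an easy way to solve it (assuming ID is your column name):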

    SELECT TEMP.NUM
    FROM (SELECT ROW_NUMBER() OVER () AS NUM FROM 'TABLE-NAME') AS TEMP
    WHERE TEMP.NUM NOT IN (SELECT ID FROM 'TABLE-NAME')
    ORDER BY 1 ASC LIMIT 1

It will find gaps only up to the number of rows in 'TABLE-NAME', because "SELECT ROW_NUMBER() OVER () AS NUM FROM 'TABLE-NAME'" yields numbers only up to the row count.
vijay shanker
Licensed under cc by-sa 3.0 with attribution required.