Using a window function to carry forward the first non-null value in a partition


12

Consider a table that records visits:

create table visits (
  person varchar(10),
  ts timestamp, 
  somevalue varchar(10) 
)

Consider this sample data (timestamps simplified to a counter):

ts| person    |  somevalue
-------------------------
1 |  bob      |null
2 |  bob      |null
3 |  jim      |null
4 |  bob      |  A
5 |  bob      | null
6 |  bob      |  B
7 |  jim      |  X
8 |  jim      |  Y
9 |  jim      |  null
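
To experiment with the sample rows above, something like the following could load them. Because ts is declared as a timestamp while the sample uses plain counters, this sketch assumes each counter n maps to n seconds after an arbitrary base time (2000-01-01); that mapping is an assumption for illustration, not part of the original data.

```sql
-- Assumption: counter n becomes base time + n seconds, so ordering by ts is preserved.
insert into visits (ts, person, somevalue)
select timestamp '2000-01-01 00:00:00' + n * interval '1 second',
       person,
       somevalue
from (values
  (1, 'bob', null),
  (2, 'bob', null),
  (3, 'jim', null),
  (4, 'bob', 'A'),
  (5, 'bob', null),
  (6, 'bob', 'B'),
  (7, 'jim', 'X'),
  (8, 'jim', 'Y'),
  (9, 'jim', null)
) as t(n, person, somevalue);
```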

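I am trying to carry forward a person's last non-null somevalue to all of that person's later visits, until the value changes (that is, until the next non-null value appears).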

The expected result set looks like this:

ts|  person   | somevalue | carry-forward 
-----------------------------------------------
1 |  bob      |null       |   null
2 |  bob      |null       |   null
3 |  jim      |null       |   null
4 |  bob      |  A        |    A
5 |  bob      | null      |    A
6 |  bob      |  B        |    B
7 |  jim      |  X        |    X
8 |  jim      |  Y        |    Y
9 |  jim      |  null     |    Y

My attempt looks like this:

 select *, 
  first_value(somevalue) over (partition by person order by (somevalue is null), ts rows between UNBOUNDED PRECEDING AND current row  ) as carry_forward

 from visits  
 order by ts

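Note: for sorting purposes, (somevalue is null) evaluates to 1 or 0, so I can get the first non-null value in the partition.

The above does not give me the result I am after.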


Could you paste a pg_dump of the test data, rather than psql-style output plus the table schema? pg_dump -t table -d database. We need the CREATE and COPY commands.
Evan Carroll


1
@a_horse_with_no_name that should be an answer.
ypercubeᵀᴹ

Answers:


12

The following query achieves the desired result:

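select *, first_value(somevalue) over w as carryforward_somevalue
from (
  -- running count of non-null values per person: each non-null row starts a new group
  select *, sum(case when somevalue is null then 0 else 1 end) over (partition by person order by ts) as value_partition
  from visits
) as q
-- the first row of each (person, value_partition) group holds the value to carry forward (null for the leading group)
window w as (partition by person, value_partition order by ts);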

Note the case when somevalue is null expression: it would not be needed if Postgres window functions supported IGNORE NULLS (as mentioned by @ypercubeᵀᴹ).


5
Or, more simply, count(somevalue) over (...)
ypercubeᵀᴹ
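
The suggestion in this comment can be checked with a small side-by-side query. This is only a sketch against the question's visits table (the column aliases grp_sum and grp_count are made up for illustration): count(somevalue) skips nulls, so as a running aggregate it should yield the same group number as the sum(case ...) expression.

```sql
-- Both columns number the "islands": the counter increases at every non-null somevalue.
select person, ts, somevalue,
       sum(case when somevalue is null then 0 else 1 end) over w as grp_sum,
       count(somevalue) over w as grp_count
from visits
window w as (partition by person order by ts)
order by person, ts;
```

If the two columns agree on every row, either expression can serve as the second partition key for first_value().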

5

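The problem falls into the gaps-and-islands category. It is a pity that Postgres has not yet implemented IGNORE NULLS in window functions such as FIRST_VALUE(); otherwise this would be trivial, needing only a simple change to your query.

There are probably several ways to solve this with window functions or recursive CTEs.

I am not sure this is the most efficient approach, but a recursive CTE does solve the problem: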
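
For comparison, this is roughly what the query could look like on an engine that implements the SQL standard's IGNORE NULLS clause (Oracle is one example). It is a sketch only: it is not expected to run on the Postgres versions discussed here, and it uses LAST_VALUE rather than FIRST_VALUE so that the default frame, which ends at the current row, picks up the most recent non-null value.

```sql
-- Assumes an engine with IGNORE NULLS support (e.g. Oracle); not valid in Postgres as discussed here.
select v.*,
       last_value(somevalue) ignore nulls
         over (partition by person order by ts) as carry_forward
from visits v
order by ts;
```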

with recursive 
    cf as
    (
      ( select distinct on (person) 
            v.*, v.somevalue as carry_forward
        from visits as v
        order by person, ts
      ) 
      union all
        select 
            v.*, coalesce(v.somevalue, cf.carry_forward)
        from cf
          join lateral  
            ( select v.*
              from visits as v
              where v.person = cf.person
                and v.ts > cf.ts
              order by ts
              limit 1
            ) as v
            on true
    )
select cf.*
from cf 
order by ts ;

It does solve the problem, but it is more complicated than it needs to be. See the answer below.
maxTrialfire, 2016

1
Yes, your answer looks good!
ypercubeᵀᴹ
Licensed under cc by-sa 3.0 with attribution required.