Select the row with the max date or latest date


15

Here are two tables.

SCHOOL_STAFF

SCHOOL_CODE + STAFF_TYPE_NAME + LAST_UPDATE_DATE_TIME + PERSON_ID
=================================================================
ABE           Principal         24-JAN-13               111222
ABE           Principal         09-FEB-12               222111

PERSONS

PERSON_ID + NAME
=================
111222      ABC
222111      XYZ

Here is my Oracle query.

SELECT MAX(LAST_UPDATE_DATE_TIME) AS LAST_UPDATE, SCHOOL_CODE, PERSON_ID
FROM SCHOOL_STAFF
WHERE STAFF_TYPE_NAME='Principal'
GROUP BY SCHOOL_CODE, PERSON_ID
ORDER BY SCHOOL_CODE;

This gives the following result:

LAST_UPDATE SCHOOL_CODE PERSON_ID
===========+===========+=========
24-JAN-13   ABE         111222
09-FEB-12   ABE         222111

I want to select only the first row, the one with the latest date, for each school.

Thanks.

Answers:


28

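Your current query does not return the desired result because the GROUP BY clause includes the PERSON_ID column, which has a different value in each of the two rows. As a result, each row forms its own group and both rows are returned.

There are a few ways to solve this. You can use a subquery that applies the aggregate function to return max(LAST_UPDATE_DATE_TIME) for each SCHOOL_CODE, and then join it back to the table: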
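The effect of the extra grouping column can be reproduced outside Oracle. The following is only a rough sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and ISO-formatted date strings in place of Oracle DATE values, loaded with the two sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SCHOOL_STAFF (
        SCHOOL_CODE TEXT,
        STAFF_TYPE_NAME TEXT,
        LAST_UPDATE_DATE_TIME TEXT,  -- assumption: ISO text stands in for an Oracle DATE
        PERSON_ID TEXT
    )""")
conn.executemany(
    "INSERT INTO SCHOOL_STAFF VALUES (?, ?, ?, ?)",
    [("ABE", "Principal", "2013-01-24", "111222"),
     ("ABE", "Principal", "2012-02-09", "222111")],
)

# Grouping by SCHOOL_CODE and PERSON_ID: every person forms its own group.
per_person = conn.execute("""
    SELECT MAX(LAST_UPDATE_DATE_TIME), SCHOOL_CODE, PERSON_ID
    FROM SCHOOL_STAFF
    WHERE STAFF_TYPE_NAME = 'Principal'
    GROUP BY SCHOOL_CODE, PERSON_ID
    ORDER BY SCHOOL_CODE, PERSON_ID""").fetchall()

# Grouping by SCHOOL_CODE only: one group, hence one row, per school.
per_school = conn.execute("""
    SELECT MAX(LAST_UPDATE_DATE_TIME), SCHOOL_CODE
    FROM SCHOOL_STAFF
    WHERE STAFF_TYPE_NAME = 'Principal'
    GROUP BY SCHOOL_CODE""").fetchall()

print(per_person)  # two rows, one per PERSON_ID
print(per_school)  # a single row for school ABE
```

Dropping PERSON_ID from the GROUP BY gives one row per school, but it also drops PERSON_ID from the output, which is why the approaches in this answer go back to the table for the full row.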

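select s1.LAST_UPDATE_DATE_TIME,
  s1.SCHOOL_CODE,
  s1.PERSON_ID
from SCHOOL_STAFF s1
inner join
(
  -- latest Principal update per school
  select max(LAST_UPDATE_DATE_TIME) LAST_UPDATE_DATE_TIME,
    SCHOOL_CODE
  from SCHOOL_STAFF
  where STAFF_TYPE_NAME = 'Principal'
  group by SCHOOL_CODE
) s2
  on s1.SCHOOL_CODE = s2.SCHOOL_CODE
  and s1.LAST_UPDATE_DATE_TIME = s2.LAST_UPDATE_DATE_TIME
-- keep only Principal rows, matching the filter in the question
where s1.STAFF_TYPE_NAME = 'Principal';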

See SQL Fiddle with Demo

Alternatively, you can use a window function to return, for each school, the row with the most recent LAST_UPDATE_DATE_TIME:

select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME
from
(
  select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME,
    row_number() over(partition by SCHOOL_CODE 
                        order by LAST_UPDATE_DATE_TIME desc) seq
  from SCHOOL_STAFF
  where STAFF_TYPE_NAME='Principal'
) d
where seq = 1;

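See SQL Fiddle with Demo

This query uses row_number(), which assigns a unique number to each row within the SCHOOL_CODE partition, ordered by LAST_UPDATE_DATE_TIME descending.

As a side note, the JOIN with an aggregate function is not exactly the same as the row_number() version. If two rows share the same latest timestamp, the JOIN returns both rows, whereas row_number() returns only one. If you want the window function to return both, consider using the rank() window function instead, because it returns ties: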

select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME
from
(
  select SCHOOL_CODE, PERSON_ID, LAST_UPDATE_DATE_TIME,
    rank() over(partition by SCHOOL_CODE 
                        order by LAST_UPDATE_DATE_TIME desc) seq
  from SCHOOL_STAFF
  where STAFF_TYPE_NAME='Principal'
) d
where seq = 1;
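
The tie behaviour described above can be checked with a small experiment. This is a sketch only, with several assumptions: it runs on Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) rather than Oracle, it stores dates as ISO text, and the third person (333444) is hypothetical data added to create a tie. The helper name `latest` is likewise made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SCHOOL_STAFF "
    "(SCHOOL_CODE TEXT, PERSON_ID TEXT, LAST_UPDATE_DATE_TIME TEXT)"
)
# Hypothetical data: two rows for school ABE share the latest date.
conn.executemany(
    "INSERT INTO SCHOOL_STAFF VALUES (?, ?, ?)",
    [("ABE", "111222", "2013-01-24"),
     ("ABE", "333444", "2013-01-24"),
     ("ABE", "222111", "2012-02-09")],
)

def latest(window_function):
    """Return the PERSON_IDs numbered 1 per school by the given window function."""
    sql = f"""
        SELECT PERSON_ID FROM (
            SELECT PERSON_ID,
                   {window_function}() OVER (PARTITION BY SCHOOL_CODE
                                             ORDER BY LAST_UPDATE_DATE_TIME DESC) AS seq
            FROM SCHOOL_STAFF
        ) d
        WHERE seq = 1
        ORDER BY PERSON_ID"""
    return [row[0] for row in conn.execute(sql)]

row_number_ids = latest("row_number")
rank_ids = latest("rank")

print(row_number_ids)  # exactly one of the tied rows; which one is not guaranteed
print(rank_ids)        # both tied rows
```

If only one row per school is acceptable and ties are possible, adding a second column to the ORDER BY inside the window (for example PERSON_ID) makes the row_number() choice deterministic.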

See Demo


4

I'm surprised no one has taken advantage of window functions beyond row_number().

Here is some data to play with:

CREATE TABLE SCHOOL_STAFF
(
LAST_UPDATE_DATE_TIME DATE,  -- a real DATE so ORDER BY sorts chronologically, not as text
SCHOOL_CODE VARCHAR(20),
PERSON_ID VARCHAR(20),
STAFF_TYPE_NAME VARCHAR(20)
);
INSERT INTO SCHOOL_STAFF VALUES (DATE '2013-01-24', 'ABE', '111222', 'Principal');
INSERT INTO SCHOOL_STAFF VALUES (DATE '2012-02-09', 'ABE', '222111', 'Principal');
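
One thing worth checking with sample data like this is the type of the date column. If dates are held as text in DD-MON-YY form rather than as a DATE, MAX and ORDER BY compare them character by character, not chronologically. The sketch below illustrates that with Python's built-in sqlite3 module as an assumed stand-in for Oracle; the table T, its column names, and the 09-FEB-14 value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the same two dates stored in two textual forms.
conn.execute("CREATE TABLE T (AS_DD_MON_YY TEXT, AS_ISO TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [("24-JAN-13", "2013-01-24"),
     ("09-FEB-14", "2014-02-09")],  # chronologically the later date
)

max_text, max_iso = conn.execute(
    "SELECT MAX(AS_DD_MON_YY), MAX(AS_ISO) FROM T"
).fetchone()

print(max_text)  # 24-JAN-13: '2' sorts after '0', so the earlier date wins
print(max_iso)   # 2014-02-09: ISO text happens to sort chronologically
```

In Oracle the usual remedy is to store the value in a DATE (or TIMESTAMP) column, or to convert with TO_DATE before comparing.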

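The OVER() clause creates a window over which the aggregate group is defined. In this case I partition only on SCHOOL_CODE, so FIRST_VALUE is taken from LAST_UPDATE_DATE_TIME within each SCHOOL_CODE group, ordered by LAST_UPDATE_DATE_TIME descending. That value is applied to every row of the SCHOOL_CODE partition.

Pay close attention to the partitioning and ordering in the over() clause.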

SELECT DISTINCT
 FIRST_VALUE(LAST_UPDATE_DATE_TIME) OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS LAST_UPDATE
,FIRST_VALUE(SCHOOL_CODE)           OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS SCHOOL_CODE
,FIRST_VALUE(PERSON_ID)             OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC) AS PERSON_ID
FROM SCHOOL_STAFF
WHERE STAFF_TYPE_NAME = 'Principal'
ORDER BY SCHOOL_CODE

Returns:

24-JAN-13   ABE 111222

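In most cases this removes the need for GROUP BY and subqueries. Be sure to include DISTINCT; without it the query returns the same values once for every input row in the partition.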
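
To see why DISTINCT matters here, the same FIRST_VALUE query can be run with and without it. This is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) in place of Oracle and ISO text in place of DATE values; the `QUERY` and `OVER_CLAUSE` names are just for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SCHOOL_STAFF (
        LAST_UPDATE_DATE_TIME TEXT,  -- assumption: ISO text stands in for an Oracle DATE
        SCHOOL_CODE TEXT,
        PERSON_ID TEXT,
        STAFF_TYPE_NAME TEXT
    )""")
conn.executemany(
    "INSERT INTO SCHOOL_STAFF VALUES (?, ?, ?, ?)",
    [("2013-01-24", "ABE", "111222", "Principal"),
     ("2012-02-09", "ABE", "222111", "Principal")],
)

OVER_CLAUSE = "OVER (PARTITION BY SCHOOL_CODE ORDER BY LAST_UPDATE_DATE_TIME DESC)"
QUERY = """
    SELECT {distinct}
        FIRST_VALUE(LAST_UPDATE_DATE_TIME) {over} AS LAST_UPDATE,
        FIRST_VALUE(SCHOOL_CODE)           {over} AS SCHOOL_CODE,
        FIRST_VALUE(PERSON_ID)             {over} AS PERSON_ID
    FROM SCHOOL_STAFF
    WHERE STAFF_TYPE_NAME = 'Principal'
"""

# Window functions do not collapse rows: each input row yields an output row.
without_distinct = conn.execute(QUERY.format(distinct="", over=OVER_CLAUSE)).fetchall()
# DISTINCT removes the repeated copies, leaving one row per school.
with_distinct = conn.execute(QUERY.format(distinct="DISTINCT", over=OVER_CLAUSE)).fetchall()

print(without_distinct)  # the latest row's values, repeated once per input row
print(with_distinct)     # a single row for school ABE
```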


1
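select s1.LAST_UPDATE_DATE_TIME as LAST_UPDATE,
  s1.SCHOOL_CODE,
  s1.PERSON_ID
from SCHOOL_STAFF s1
where s1.STAFF_TYPE_NAME = 'Principal'
-- correlate on SCHOOL_CODE so the latest date is computed per school;
-- an unqualified PERSON_ID here would resolve to s2 and compare the column with itself
and s1.LAST_UPDATE_DATE_TIME = (select max(s2.LAST_UPDATE_DATE_TIME)
                                from SCHOOL_STAFF s2
                                where s2.SCHOOL_CODE = s1.SCHOOL_CODE
                                and s2.STAFF_TYPE_NAME = 'Principal');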

1
Instead of just posting code, you should try to explain how it answers the question, and possibly what the OP was doing wrong.
Max Vernon, 2014
Licensed under cc by-sa 3.0 with attribution required.