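Why does this LEFT JOIN perform so much worse than LEFT JOIN LATERAL?

I have the following tables (taken from the Sakila database):

film: film_id is pkey
actor: actor_id is pkey
film_actor: film_id and actor_id are foreign keys to film/actor

I am selecting one particular film. For this film, I also want all actors participating in that film. I have two queries for this: one with a LEFT JOIN and one with a LEFT JOIN LATERAL.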

select film.film_id, film.title, a.actors
from   film
left join
  (         
       select     film_actor.film_id, array_agg(first_name) as actors
       from       actor
       inner join film_actor using(actor_id)
       group by   film_actor.film_id
  ) as a
on       a.film_id = film.film_id
where    film.title = 'ACADEMY DINOSAUR'
order by film.title;

select film.film_id, film.title, a.actors
from   film
left join lateral
  (
       select     array_agg(first_name) as actors
       from       actor
       inner join film_actor using(actor_id)
       where      film_actor.film_id = film.film_id
  ) as a
on       true
where    film.title = 'ACADEMY DINOSAUR'
order by film.title;

When comparing the query plans, the first query performs much worse (20x) than the second:

 Merge Left Join  (cost=507.20..573.11 rows=1 width=51) (actual time=15.087..15.089 rows=1 loops=1)
   Merge Cond: (film.film_id = film_actor.film_id)
   ->  Sort  (cost=8.30..8.31 rows=1 width=19) (actual time=0.075..0.075 rows=1 loops=1)
     Sort Key: film.film_id
     Sort Method: quicksort  Memory: 25kB
     ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.044..0.058 rows=1 loops=1)
           Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
   ->  GroupAggregate  (cost=498.90..552.33 rows=997 width=34) (actual time=15.004..15.004 rows=1 loops=1)
     Group Key: film_actor.film_id
     ->  Sort  (cost=498.90..512.55 rows=5462 width=8) (actual time=14.934..14.937 rows=11 loops=1)
           Sort Key: film_actor.film_id
           Sort Method: quicksort  Memory: 449kB
           ->  Hash Join  (cost=6.50..159.84 rows=5462 width=8) (actual time=0.355..8.359 rows=5462 loops=1)
             Hash Cond: (film_actor.actor_id = actor.actor_id)
             ->  Seq Scan on film_actor  (cost=0.00..84.62 rows=5462 width=4) (actual time=0.035..2.205 rows=5462 loops=1)
             ->  Hash  (cost=4.00..4.00 rows=200 width=10) (actual time=0.303..0.303 rows=200 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 17kB
               ->  Seq Scan on actor  (cost=0.00..4.00 rows=200 width=10) (actual time=0.027..0.143 rows=200 loops=1)
 Planning time: 1.495 ms
 Execution time: 15.426 ms

 Nested Loop Left Join  (cost=25.11..33.16 rows=1 width=51) (actual time=0.849..0.854 rows=1 loops=1)
   ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.045..0.048 rows=1 loops=1)
     Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
   ->  Aggregate  (cost=24.84..24.85 rows=1 width=32) (actual time=0.797..0.797 rows=1 loops=1)
     ->  Hash Join  (cost=10.82..24.82 rows=5 width=6) (actual time=0.672..0.764 rows=10 loops=1)
           Hash Cond: (film_actor.actor_id = actor.actor_id)
           ->  Bitmap Heap Scan on film_actor  (cost=4.32..18.26 rows=5 width=2) (actual time=0.072..0.150 rows=10 loops=1)
             Recheck Cond: (film_id = film.film_id)
             Heap Blocks: exact=10
             ->  Bitmap Index Scan on idx_fk_film_id  (cost=0.00..4.32 rows=5 width=0) (actual time=0.041..0.041 rows=10 loops=1)
               Index Cond: (film_id = film.film_id)
           ->  Hash  (cost=4.00..4.00 rows=200 width=10) (actual time=0.561..0.561 rows=200 loops=1)
             Buckets: 1024  Batches: 1  Memory Usage: 17kB
             ->  Seq Scan on actor  (cost=0.00..4.00 rows=200 width=10) (actual time=0.039..0.275 rows=200 loops=1)
 Planning time: 1.722 ms
 Execution time: 1.087 ms

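Why is this? I want to learn to reason about this, so that I can understand what is going on and can predict how the query will behave when the data size increases and which decisions the planner will make under certain conditions.

My thoughts: in the first LEFT JOIN query, it looks like the subquery is executed for all films in the database, without taking into account the filter in the outer query that says we are only interested in one particular film. Why is the planner not able to use that knowledge in the subquery?

In the LEFT JOIN LATERAL query, we are more or less "pushing" that filter downwards. So the problem we had in the first query is not present here, hence the better performance.

I guess I am mainly looking for rules of thumb, general wisdom, ... so that this planner magic becomes second nature, if that makes sense.

Update (1)

Rewriting the LEFT JOIN as follows also gives better performance (slightly better than the LEFT JOIN LATERAL):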

select film.film_id, film.title, array_agg(a.first_name) as actors
from   film
left join
  (         
       select     film_actor.film_id, actor.first_name
       from       actor
       inner join film_actor using(actor_id)
  ) as a
on       a.film_id = film.film_id
where    film.title = 'ACADEMY DINOSAUR'
group by film.film_id
order by film.title;

 GroupAggregate  (cost=29.44..29.49 rows=1 width=51) (actual time=0.470..0.471 rows=1 loops=1)
   Group Key: film.film_id
   ->  Sort  (cost=29.44..29.45 rows=5 width=25) (actual time=0.428..0.430 rows=10 loops=1)
     Sort Key: film.film_id
     Sort Method: quicksort  Memory: 25kB
     ->  Nested Loop Left Join  (cost=4.74..29.38 rows=5 width=25) (actual time=0.149..0.386 rows=10 loops=1)
           ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.056..0.057 rows=1 loops=1)
             Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
           ->  Nested Loop  (cost=4.47..19.09 rows=200 width=8) (actual time=0.087..0.316 rows=10 loops=1)
             ->  Bitmap Heap Scan on film_actor  (cost=4.32..18.26 rows=5 width=4) (actual time=0.052..0.089 rows=10 loops=1)
               Recheck Cond: (film_id = film.film_id)
               Heap Blocks: exact=10
               ->  Bitmap Index Scan on idx_fk_film_id  (cost=0.00..4.32 rows=5 width=0) (actual time=0.035..0.035 rows=10 loops=1)
                 Index Cond: (film_id = film.film_id)
             ->  Index Scan using actor_pkey on actor  (cost=0.14..0.17 rows=1 width=10) (actual time=0.011..0.011 rows=1 loops=10)
               Index Cond: (actor_id = film_actor.actor_id)
 Planning time: 1.833 ms
 Execution time: 0.706 ms

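How can we reason about this?

Update (2)

I continued with some experiments, and I think an interesting rule of thumb is: apply the aggregate function as high/late as possible. The query in Update (1) probably performs better because we are aggregating in the outer query and no longer in the inner query.

The same seems to apply if we rewrite the LEFT JOIN LATERAL above as follows: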

select film.film_id, film.title, array_agg(a.first_name) as actors
from   film
left join lateral
  (
       select     actor.first_name
       from       actor
       inner join film_actor using(actor_id)
       where      film_actor.film_id = film.film_id
  ) as a
on       true
where    film.title = 'ACADEMY DINOSAUR'
group by film.film_id
order by film.title;

 GroupAggregate  (cost=29.44..29.49 rows=1 width=51) (actual time=0.088..0.088 rows=1 loops=1)
   Group Key: film.film_id
   ->  Sort  (cost=29.44..29.45 rows=5 width=25) (actual time=0.076..0.077 rows=10 loops=1)
     Sort Key: film.film_id
     Sort Method: quicksort  Memory: 25kB
     ->  Nested Loop Left Join  (cost=4.74..29.38 rows=5 width=25) (actual time=0.031..0.066 rows=10 loops=1)
           ->  Index Scan using idx_title on film  (cost=0.28..8.29 rows=1 width=19) (actual time=0.010..0.010 rows=1 loops=1)
             Index Cond: ((title)::text = 'ACADEMY DINOSAUR'::text)
           ->  Nested Loop  (cost=4.47..19.09 rows=200 width=8) (actual time=0.019..0.052 rows=10 loops=1)
             ->  Bitmap Heap Scan on film_actor  (cost=4.32..18.26 rows=5 width=4) (actual time=0.013..0.024 rows=10 loops=1)
               Recheck Cond: (film_id = film.film_id)
               Heap Blocks: exact=10
               ->  Bitmap Index Scan on idx_fk_film_id  (cost=0.00..4.32 rows=5 width=0) (actual time=0.007..0.007 rows=10 loops=1)
                 Index Cond: (film_id = film.film_id)
             ->  Index Scan using actor_pkey on actor  (cost=0.14..0.17 rows=1 width=10) (actual time=0.002..0.002 rows=1 loops=10)
               Index Cond: (actor_id = film_actor.actor_id)
 Planning time: 0.440 ms
 Execution time: 0.136 ms

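Here, we moved array_agg() upwards. As you can see, this plan is also better than the original LEFT JOIN LATERAL.

That said, I am not sure whether this self-invented rule of thumb (apply aggregate functions as high/late as possible) holds in other cases.

Additional information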

Fiddle: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4ec4f2fffd969d9e4b949bb2ca765ffb

Version: PostgreSQL 10.4 on x86_64-pc-linux-musl, compiled by gcc (Alpine 6.4.0) 6.4.0, 64-bit

Environment: Docker: docker run -e POSTGRES_PASSWORD=sakila -p 5432:5432 -d frantiseks/postgres-sakila. Please note that the image on Docker Hub is outdated, so I first did a local build after cloning the git repository: build -t frantiseks/postgres-sakila.

Table definitions:

film

 film_id              | integer                     | not null default nextval('film_film_id_seq'::regclass)
 title                | character varying(255)      | not null

 Indexes:
    "film_pkey" PRIMARY KEY, btree (film_id)
    "idx_title" btree (title)

 Referenced by:
    TABLE "film_actor" CONSTRAINT "film_actor_film_id_fkey" FOREIGN KEY (film_id) REFERENCES film(film_id) ON UPDATE CASCADE ON DELETE RESTRICT

actor

 actor_id    | integer                     | not null default nextval('actor_actor_id_seq'::regclass)
 first_name  | character varying(45)       | not null

 Indexes:
    "actor_pkey" PRIMARY KEY, btree (actor_id)

 Referenced by:
    TABLE "film_actor" CONSTRAINT "film_actor_actor_id_fkey" FOREIGN KEY (actor_id) REFERENCES actor(actor_id) ON UPDATE CASCADE ON DELETE RESTRICT

film_actor

 actor_id    | smallint                    | not null
 film_id     | smallint                    | not null

 Indexes:
    "film_actor_pkey" PRIMARY KEY, btree (actor_id, film_id)
    "idx_fk_film_id" btree (film_id)
 Foreign-key constraints:
    "film_actor_actor_id_fkey" FOREIGN KEY (actor_id) REFERENCES actor(actor_id) ON UPDATE CASCADE ON DELETE RESTRICT
    "film_actor_film_id_fkey" FOREIGN KEY (film_id) REFERENCES film(film_id) ON UPDATE CASCADE ON DELETE RESTRICT

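Data: from the Sakila sample database. This question is not a real-life case; I am mostly using this database as a sample database for learning. I was introduced to SQL a few months ago and I am trying to expand my knowledge. The data has the following distribution: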

select count(*) from film: 1000
select count(*) from actor: 200
select avg(a) from (select film_id, count(actor_id) a from film_actor group by film_id) a: 5.47

— Jelly Orns

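One more thing: all essential information should go into the question (including your fiddle link). Nobody is going to read through all the comments later (or they are bound to be deleted by some very capable moderator anyway).

— Erwin Brandstetter

The fiddle has been added to the question!

— Jelly Orns

Test setup

Your original setup in the fiddle leaves room for improvement. I kept asking for your setup for a reason.

You have these indexes on film_actor: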
```
"film_actor_pkey" PRIMARY KEY, btree (actor_id, film_id)  
"idx_fk_film_id" btree (film_id)
```
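This is already helpful. But to best support your particular query, you would want a multicolumn index on (film_id, actor_id), with the columns in this order. A practical solution: replace idx_fk_film_id with an index on (film_id, actor_id), or create the PK on (film_id, actor_id) for the purpose of this test, like I do below. See:
- Is a composite index also good for queries on the first field?
In a read-only situation (or mostly read-only, or generally when VACUUM can keep up with write activity), it also helps to have an index on (title, film_id) to allow index-only scans. My test case is now highly optimized for read performance.
There is a type mismatch between film.film_id (integer) and film_actor.film_id (smallint). While that works, it makes queries slower and can lead to various complications. It also makes FK constraints more expensive. Never do this if it can be avoided. If you are not sure, pick integer over smallint. While smallint can save 2 bytes per field (often consumed by alignment padding), it brings more complications than integer.
To optimize the performance of the test itself, create indexes and constraints after bulk-inserting lots of rows. Adding tuples incrementally to existing indexes is substantially slower than creating the indexes from scratch with all rows present.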
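
The index suggestions above might look roughly like this in DDL. This is only a sketch, not the script from the fiddle: the two new index names are made up for illustration, and dropping idx_fk_film_id assumes nothing else relies on it.

```sql
-- Sketch only: the new index names below are hypothetical.

-- Replace the single-column index with a multicolumn index, film_id first,
-- so a lookup by film_id can also read actor_id from the same index.
DROP INDEX IF EXISTS idx_fk_film_id;
CREATE INDEX film_actor_film_id_actor_id_idx ON film_actor (film_id, actor_id);

-- Covers "WHERE title = ... returning film_id" and can allow an index-only scan.
CREATE INDEX film_title_film_id_idx ON film (title, film_id);

-- Index-only scans depend on an up-to-date visibility map.
-- One table per statement, since the question uses PostgreSQL 10.
VACUUM ANALYZE film;
VACUUM ANALYZE film_actor;
```

Whether the planner actually picks an index-only scan still depends on table size and statistics, so check the plan afterwards.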

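Unrelated to this test:

Free-standing sequences plus column defaults instead of much simpler and more reliable serial (or IDENTITY) columns. Don't.
- Auto increment table column
timestamp without time zone is typically unreliable for a column like last_update. Use timestamptz instead. And note that, strictly speaking, a column default does not cover the "last update".
The length modifier in character varying(255) indicates that the test case was not designed for Postgres to begin with, because the odd length is pointless here. (Or the author is clueless.)

Consider the audited test case in the fiddle:

db<>fiddle here - building on your fiddle, optimized and with added queries.

Related:

How to implement a many-to-many relationship in PostgreSQL?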
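
As a rough illustration of those three points, a table could be declared like this. The table film_demo is hypothetical and not part of Sakila; it only shows the column declarations.

```sql
-- Hypothetical table for illustration only, not part of the Sakila schema.
CREATE TABLE film_demo (
   -- IDENTITY (PostgreSQL 10+) instead of a free-standing sequence plus DEFAULT nextval(...)
   film_id     integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY
   -- plain text, no arbitrary length modifier
 , title       text NOT NULL
   -- timestamptz instead of timestamp without time zone;
   -- the default only fires on INSERT, so an UPDATE must set it explicitly or via a trigger
 , last_update timestamptz NOT NULL DEFAULT now()
);
```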

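A test setup with 1000 films and 200 actors has limited validity. The most efficient queries take < 0.2 ms. Planning time is greater than execution time. A test with 100k or more rows would be more revealing.

Why retrieve only the first names of actors? Once you retrieve multiple columns, you already have a slightly different situation.

ORDER BY title makes no sense while filtering for a single title with WHERE title = 'ACADEMY DINOSAUR'. Maybe ORDER BY film_id?

For total runtime, rather use EXPLAIN (ANALYZE, TIMING OFF) to reduce (potentially misleading) noise from sub-timing overhead.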
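
One possible way to get a larger test set is to generate rows with generate_series(). The sketch below uses stand-alone tables (t_film, t_actor, t_film_actor) rather than the real Sakila tables, which have more columns. The names, the row counts, and the roughly five actors per film are all assumptions chosen for illustration.

```sql
-- Hypothetical stand-alone test tables; names and row counts are assumptions.
CREATE TABLE t_film       (film_id  integer NOT NULL, title      text NOT NULL);
CREATE TABLE t_actor      (actor_id integer NOT NULL, first_name text NOT NULL);
CREATE TABLE t_film_actor (film_id  integer NOT NULL, actor_id   integer NOT NULL);

-- 100k films and 20k actors with synthetic names.
INSERT INTO t_film
SELECT g, 'FILM ' || g FROM generate_series(1, 100000) g;

INSERT INTO t_actor
SELECT g, 'ACTOR ' || g FROM generate_series(1, 20000) g;

-- Up to 5 random actors per film; DISTINCT folds accidental duplicates.
INSERT INTO t_film_actor (film_id, actor_id)
SELECT DISTINCT f.film_id, 1 + floor(random() * 20000)::int
FROM   t_film f, generate_series(1, 5);

-- Constraints and indexes are created after the bulk load.
ALTER TABLE t_film  ADD PRIMARY KEY (film_id);
ALTER TABLE t_actor ADD PRIMARY KEY (actor_id);
ALTER TABLE t_film_actor
   ADD PRIMARY KEY (film_id, actor_id)
 , ADD FOREIGN KEY (film_id)  REFERENCES t_film
 , ADD FOREIGN KEY (actor_id) REFERENCES t_actor;
CREATE INDEX ON t_film (title, film_id);

-- Refresh planner statistics before comparing plans.
ANALYZE t_film;
ANALYZE t_actor;
ANALYZE t_film_actor;
```

The queries under test would then need to be pointed at these tables, for example filtering on title = 'FILM 4711'.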
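
For example, one of the queries from this thread could be wrapped as shown below. With TIMING OFF the per-node "actual time" figures are omitted from the output, while row counts and the total execution time are still reported.

```sql
-- Measures total runtime without per-node timing overhead.
EXPLAIN (ANALYZE, TIMING OFF)
SELECT f.film_id, f.title, array_agg(a.first_name) AS actors
FROM   film f
LEFT   JOIN film_actor fa USING (film_id)
LEFT   JOIN actor a USING (actor_id)
WHERE  f.title = 'ACADEMY DINOSAUR'
GROUP  BY f.film_id;
```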

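Answer

It is hard to form a simple rule of thumb, because total performance depends on many factors. Very basic guidelines:

Aggregating all rows in sub-tables carries less overhead, but it only pays off when you actually need all rows (or a very large part).
For selecting few rows (your test!), different query techniques yield better results. That is where LATERAL comes in. It carries more overhead but only reads the required rows from sub-tables. A big win if only a (very) small fraction is needed.

For your particular test case, I would also test an ARRAY constructor in the LATERAL subquery: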

SELECT f.film_id, f.title, a.actors
FROM   film f
LEFT   JOIN LATERAL (
   SELECT ARRAY (
      SELECT a.first_name
      FROM   film_actor fa
      JOIN   actor a USING (actor_id)
      WHERE  fa.film_id = f.film_id
      ) AS actors
   ) a ON true
WHERE  f.title = 'ACADEMY DINOSAUR';
-- ORDER  BY f.title; -- redundant while we filter for a single title

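While only aggregating a single array in the lateral subquery, a simple ARRAY constructor performs better than the aggregate function array_agg(). See:

Why is array_agg() slower than the non-aggregate ARRAY() constructor?

Or with a lowly correlated subquery for the simple case: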

SELECT f.film_id, f.title
     , ARRAY (SELECT a.first_name
              FROM   film_actor fa
              JOIN   actor a USING (actor_id)
              WHERE  fa.film_id = f.film_id) AS actors
FROM   film f
WHERE  f.title = 'ACADEMY DINOSAUR';

Or, very basically, just 2x LEFT JOIN and then aggregate:

SELECT f.film_id, f.title, array_agg(a.first_name) AS actors
FROM   film f
LEFT   JOIN film_actor fa USING (film_id)
LEFT   JOIN actor a USING (actor_id)
WHERE  f.title = 'ACADEMY DINOSAUR'
GROUP  BY f.film_id;

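These three seem fastest in my updated fiddle (planning + execution time).

Your first attempt (only slightly modified) is typically fastest for retrieving all or most films, but not for a small selection: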

SELECT f.film_id, f.title, a.actors
FROM   film f
LEFT   JOIN (         
   SELECT fa.film_id, array_agg(first_name) AS actors
   FROM   actor
   JOIN   film_actor fa USING (actor_id)
   GROUP  by fa.film_id
   ) a USING (film_id)
WHERE  f.title = 'ACADEMY DINOSAUR';  -- not good for a single (or few) films!

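Tests with much bigger cardinalities will be more revealing. And do not generalize results lightly; total performance depends on many factors.

— Erwin Brandstetter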