Same function in SELECT and WHERE clauses



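Beginner's question:

I have an expensive function f(x, y) on two columns x and y in my database table.

I want to execute a query that gives me the result of the function as a column and also puts a constraint on it, something like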

SELECT *, f(x, y) AS func FROM table_name WHERE func < 10;

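However, this doesn't work, because a column alias defined in the SELECT list cannot be referenced in WHERE, so I will have to write something like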

SELECT *, f(x, y) AS func FROM table_name WHERE f(x, y) < 10;

Will this run the expensive function twice? What is the best way to do this?


Is the function STABLE/IMMUTABLE or VOLATILE?
Evan Carroll

Answers:



Let's create a function with a side effect so that we can see how many times it is executed:

CREATE OR REPLACE FUNCTION test.this_here(val integer)
    RETURNS numeric
    LANGUAGE plpgsql
AS $function$
BEGIN
    RAISE WARNING 'I am called with %', val;
    RETURN sqrt(val);
END;
$function$;

Then call it the way you do:

SELECT this_here(i) FROM generate_series(1,10) AS t(i) WHERE this_here(i) < 2;

WARNING:  I am called with 1
WARNING:  I am called with 1
WARNING:  I am called with 2
WARNING:  I am called with 2
WARNING:  I am called with 3
WARNING:  I am called with 3
WARNING:  I am called with 4
WARNING:  I am called with 5
WARNING:  I am called with 6
WARNING:  I am called with 7
WARNING:  I am called with 8
WARNING:  I am called with 9
WARNING:  I am called with 10
    this_here     
──────────────────
                1
  1.4142135623731
 1.73205080756888
(3 rows)

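As you can see, the function is called at least once per row (from the WHERE clause), and when the condition is true, it is called a second time to produce the output.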

To avoid the second execution, you can do what Edgar suggests, namely wrap the query in a subquery and filter its result set:

SELECT * 
  FROM (SELECT this_here(i) AS val FROM generate_series(1,10) AS t(i)) x 
 WHERE x.val < 2;

WARNING:  I am called with 1
... every value only once ...
WARNING:  I am called with 10
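
The plain subquery above relies on the planner not merging it back into the outer query. As a rough sketch that is not part of the original answer, two commonly used ways to discourage that merging are shown below, reusing this_here() from above. Both OFFSET 0 and AS MATERIALIZED (PostgreSQL 12 or later) are assumptions on my part, and their effect is worth checking against the warnings or the call counter on your own version.

```sql
-- Variant 1: OFFSET 0 changes nothing in the result, but it is commonly used
-- to keep the planner from flattening the subquery into the outer query.
SELECT *
  FROM (SELECT this_here(i) AS val
          FROM generate_series(1,10) AS t(i)
        OFFSET 0) x
 WHERE x.val < 2;

-- Variant 2 (assumes PostgreSQL 12 or later): a CTE marked MATERIALIZED
-- is computed once, and the outer query filters its stored result.
WITH computed AS MATERIALIZED (
    SELECT this_here(i) AS val
      FROM generate_series(1,10) AS t(i)
)
SELECT *
  FROM computed
 WHERE val < 2;
```

In both variants the function is still evaluated for every input row; the aim is only to avoid the extra call for rows that pass the filter.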

To check further how this works, you can look at the calls column of pg_stat_user_functions (assuming track_functions is set to 'all').
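
A minimal sketch of that check, assuming the session has the privileges needed to change track_functions and to reset statistics (typically a superuser), and that the functions live in the test schema as in this answer:

```sql
-- Enable call tracking for all function languages in this session.
SET track_functions = 'all';

-- Optional: reset the statistics counters of the current database
-- so that the counts start from zero.
SELECT pg_stat_reset();

-- ... run the query under test here ...

-- Read back the number of calls per tracked function.
SELECT funcname, calls
  FROM pg_stat_user_functions
 WHERE schemaname = 'test';
```

The counters can lag slightly behind the query that caused them, so an unexpectedly low number is worth re-checking after a moment.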

Let's try something without side effects:

CREATE OR REPLACE FUNCTION test.simple(val numeric)
 RETURNS numeric
 LANGUAGE sql
AS $function$
SELECT sqrt(val);
$function$;

SELECT simple(i) AS v 
  FROM generate_series(1,10) AS t(i)
 WHERE simple(i) < 2;
-- output omitted

SELECT * FROM pg_stat_user_functions WHERE funcname = 'simple';
-- 0 rows

simple() is in fact so simple that it gets inlined, which is why it does not appear in the view. Let's make it impossible to inline:

CREATE OR REPLACE FUNCTION test.other_one(val numeric)
 RETURNS numeric
 LANGUAGE sql
AS $function$
SELECT 1; -- to prevent inlining
SELECT sqrt(val);
$function$;

SELECT other_one(i) AS v
  FROM generate_series(1,10) AS t(i)
 WHERE other_one(i) < 2;

SELECT * FROM pg_stat_user_functions ;
 funcid │ schemaname │ funcname  │ calls │ total_time │ self_time 
────────┼────────────┼───────────┼───────┼────────────┼───────────
 124311 │ test       │ other_one │    13 │      0.218 │     0.218

SELECT *
  FROM (SELECT other_one(i) AS v FROM generate_series(1,10) AS t(i)) x 
 WHERE v < 2;

SELECT * FROM pg_stat_user_functions ;
 funcid │ schemaname │ funcname  │ calls │ total_time │ self_time 
────────┼────────────┼───────────┼───────┼────────────┼───────────
 124311 │ test       │ other_one │    23 │      0.293 │     0.293

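The calls counter is cumulative: 13 after the first query and 23 after the second, so the wrapped query added only 10 calls. From the looks of it, the picture is the same with or without side effects.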

Changing other_one() to IMMUTABLE makes the behaviour (perhaps surprisingly) worse, because it is then called 13 times in both queries.
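
For anyone who wants to repeat that experiment, here is a small sketch of how the volatility of other_one() could be inspected and switched. The pg_proc lookup and the ALTER FUNCTION statements are my own additions rather than commands taken from the original answer.

```sql
-- Show the current volatility:
-- 'i' = IMMUTABLE, 's' = STABLE, 'v' = VOLATILE (the default).
SELECT proname, provolatile
  FROM pg_proc
 WHERE proname = 'other_one';

-- Change only the volatility declaration; the function body stays as it is.
ALTER FUNCTION test.other_one(numeric) IMMUTABLE;

-- Switch back to the default afterwards.
ALTER FUNCTION test.other_one(numeric) VOLATILE;
```

Re-running both queries after each change and comparing the calls column shows whether the wrapped form still saves the second evaluation.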


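Could the decision to call the function again be determined by the presence of side-effect instructions in the function body? And is it possible to tell from the query plan whether a function with the same arguments is called once or several times per row (for example, when it has no side effects)?
Andriy M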

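@AndriyM I can imagine that the answer is yes, but I currently have no time to play with a debugger to see what is actually called. I will add a bit about inlined functions (which, by the sound of it, is not the case the OP should expect).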
dezso

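@AndriyM, according to postgresql.org/docs/9.1/static/sql-createfunction.html, a function is assumed to be VOLATILE if it is not declared IMMUTABLE or STABLE. VOLATILE indicates that the function value can change even within a single table scan, so no optimizations can be made.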
Lennart '18

Licensed under cc by-sa 3.0 with attribution required.