Q2: way to measure page size
PostgreSQL提供了许多数据库对象大小函数。我在此查询中打包了最有趣的内容,并在底部添加了一些统计信息访问功能。(附加模块pgstattuple提供了更多有用的功能。)
这将表明使用不同的方法来测量“行的大小”会导致非常不同的结果。完全取决于您要测量的内容。
此查询需要Postgres 9.3或更高版本。对于旧版本,请参见下文。
VALUES
在LATERAL
子查询中使用表达式,以避免拼写出每一行的计算。
将public.tbl
(两次)替换为您可以选择的,由模式限定的表名,以获得有关行大小的已收集统计信息的紧凑视图。您可以将其包装到plpgsql函数中以供重复使用,将表名作为参数输入并使用EXECUTE
...
SELECT l.metric, l.nr AS "bytes/ct"
, CASE WHEN is_size THEN pg_size_pretty(nr) END AS bytes_pretty
, CASE WHEN is_size THEN nr / NULLIF(x.ct, 0) END AS bytes_per_row
FROM (
SELECT min(tableoid) AS tbl -- = 'public.tbl'::regclass::oid
, count(*) AS ct
, sum(length(t::text)) AS txt_len -- length in characters
FROM public.tbl t -- provide table name *once*
) x
, LATERAL (
VALUES
(true , 'core_relation_size' , pg_relation_size(tbl))
, (true , 'visibility_map' , pg_relation_size(tbl, 'vm'))
, (true , 'free_space_map' , pg_relation_size(tbl, 'fsm'))
, (true , 'table_size_incl_toast' , pg_table_size(tbl))
, (true , 'indexes_size' , pg_indexes_size(tbl))
, (true , 'total_size_incl_toast_and_indexes', pg_total_relation_size(tbl))
, (true , 'live_rows_in_text_representation' , txt_len)
, (false, '------------------------------' , NULL)
, (false, 'row_count' , ct)
, (false, 'live_tuples' , pg_stat_get_live_tuples(tbl))
, (false, 'dead_tuples' , pg_stat_get_dead_tuples(tbl))
) l(is_size, metric, nr);
结果:
指标| 字节/ ct | bytes_pretty | bytes_per_row
----------------------------------- + ---------- + --- ----------- + ---------------
core_relation_size | 44138496 | 42 MB | 91
能见度地图| 0 | 0字节| 0
free_space_map | 32768 | 32 kB | 0
table_size_incl_toast | 44179456 | 42 MB | 91
index_size | 33128448 | 32 MB | 68
total_size_incl_toast_and_indexes | 77307904 | 74 MB | 159
live_rows_in_text_representation | 29987360 | 29 MB | 62
------------------------------ | | |
row_count | 483424 | |
live_tuples | 483424 | |
dead_tuples | 2677 | |
对于旧版本(Postgres 9.2或更低版本):
WITH x AS (
SELECT count(*) AS ct
, sum(length(t::text)) AS txt_len -- length in characters
, 'public.tbl'::regclass AS tbl -- provide table name as string
FROM public.tbl t -- provide table name as name
), y AS (
SELECT ARRAY [pg_relation_size(tbl)
, pg_relation_size(tbl, 'vm')
, pg_relation_size(tbl, 'fsm')
, pg_table_size(tbl)
, pg_indexes_size(tbl)
, pg_total_relation_size(tbl)
, txt_len
] AS val
, ARRAY ['core_relation_size'
, 'visibility_map'
, 'free_space_map'
, 'table_size_incl_toast'
, 'indexes_size'
, 'total_size_incl_toast_and_indexes'
, 'live_rows_in_text_representation'
] AS name
FROM x
)
SELECT unnest(name) AS metric
, unnest(val) AS "bytes/ct"
, pg_size_pretty(unnest(val)) AS bytes_pretty
, unnest(val) / NULLIF(ct, 0) AS bytes_per_row
FROM x, y
UNION ALL SELECT '------------------------------', NULL, NULL, NULL
UNION ALL SELECT 'row_count', ct, NULL, NULL FROM x
UNION ALL SELECT 'live_tuples', pg_stat_get_live_tuples(tbl), NULL, NULL FROM x
UNION ALL SELECT 'dead_tuples', pg_stat_get_dead_tuples(tbl), NULL, NULL FROM x;
结果相同。
Q1: anything inefficient?
您可以优化列顺序以节省每行一些字节,当前浪费在对齐填充上:
integer | not null default nextval('core_page_id_seq'::regclass)
integer | not null default 0
character varying(255) | not null
character varying(64) | not null
text | default '{}'::text
character varying(255) |
text | default '{}'::text
text |
timestamp with time zone |
timestamp with time zone |
integer |
integer |
每行节省8到18个字节。我称它为“俄罗斯方块”。细节:
同时考虑:
length(*)
而不仅仅是length(field)
?我知道这不是字符,但我只需要一个近似值。