digoal
2019-03-31
PostgreSQL , nestloop , 自关联 , ctid join , 物理行号扫描方法 , nestloop Tid Scan
PostgreSQL 12支持ctid(物理行号)的nestloop join。使得表自关联的性能大幅度提升。
以前自关联可能需要使用PK(需要经过INDEX SCAN,以及回表两次),实际上CTID自关联性能更好(直接访问HEAP tuple)。
实际例子性能提升4倍以上。
Support parameterized TidPaths.
Up to now we've not worried much about joins where the join key is a
relation's CTID column, reasoning that storing a table's CTIDs in some
other table would be pretty useless. However, there are use-cases for
this sort of query involving self-joins, so that argument doesn't really
hold water.
This patch allows generating plans for joins on CTID that use a nestloop
with inner TidScan, similar to what we might do with an index on the join
column. This is the most efficient way to join when the outer side of
the nestloop is expected to yield relatively few rows.
This change requires upgrading tidpath.c and the generated TidPaths
to work with RestrictInfos instead of bare qual clauses, but that's
long-postponed technical debt anyway.
Discussion: https://postgr.es/m/[email protected]
postgres=# create table t_to (id int , info text, crt_Time timestamp);
CREATE TABLE
postgres=# insert into t_to select generate_series(1,100000), md5(random()::Text), clock_timestamp();
INSERT 0 100000
postgres=# explain analyze select t1.* from t_to t1 join t_to t2 on (t1.ctid=t2.ctid) where t1.id<100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=17275.14..18419.73 rows=35218 width=44) (actual time=37.711..37.776 rows=99 loops=1)
Merge Cond: (t2.ctid = t1.ctid)
-> Sort (cost=11654.58..11918.72 rows=105655 width=6) (actual time=30.264..30.277 rows=100 loops=1)
Sort Key: t2.ctid
Sort Method: external merge Disk: 1600kB
-> Seq Scan on t_to t2 (cost=0.00..1991.55 rows=105655 width=6) (actual time=0.015..11.637 rows=100000 loops=1)
-> Materialize (cost=5620.56..5796.65 rows=35218 width=50) (actual time=7.438..7.454 rows=99 loops=1)
-> Sort (cost=5620.56..5708.60 rows=35218 width=50) (actual time=7.434..7.440 rows=99 loops=1)
Sort Key: t1.ctid
Sort Method: quicksort Memory: 38kB
-> Seq Scan on t_to t1 (cost=0.00..2255.69 rows=35218 width=50) (actual time=0.014..7.407 rows=99 loops=1)
Filter: (id < 100)
Rows Removed by Filter: 99901
Planning Time: 0.090 ms
Execution Time: 38.255 ms
(15 rows)
Time: 38.747 ms
postgres=# set enable_mergejoin=off;
SET
Time: 0.212 ms
postgres=# explain analyze select t1.* from t_to t1 join t_to t2 on (t1.ctid=t2.ctid) where t1.id<100;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..70363912.22 rows=35218 width=44) (actual time=0.028..1539.142 rows=99 loops=1)
Join Filter: (t1.ctid = t2.ctid)
Rows Removed by Join Filter: 9899901
-> Seq Scan on t_to t1 (cost=0.00..2255.69 rows=35218 width=50) (actual time=0.018..7.312 rows=99 loops=1)
Filter: (id < 100)
Rows Removed by Filter: 99901
-> Materialize (cost=0.00..2932.82 rows=105655 width=6) (actual time=0.001..8.649 rows=100000 loops=99)
-> Seq Scan on t_to t2 (cost=0.00..1991.55 rows=105655 width=6) (actual time=0.003..11.377 rows=100000 loops=1)
Planning Time: 0.083 ms
Execution Time: 1539.698 ms
(10 rows)
Time: 1540.160 ms (00:01.540)
在没有任何其他索引的情况下,通过CTID物理行号自关联,速度杠杠的。
postgres=# explain analyze select t1.* from t_to t1 join t_to t2 on (t1.ctid=t2.ctid) where t1.id<100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2297.25 rows=100 width=45) (actual time=0.017..8.443 rows=99 loops=1)
-> Seq Scan on t_to t1 (cost=0.00..2185.00 rows=100 width=51) (actual time=0.013..8.375 rows=99 loops=1)
Filter: (id < 100)
Rows Removed by Filter: 99901
-> Tid Scan on t_to t2 (cost=0.00..1.11 rows=1 width=6) (actual time=0.000..0.000 rows=1 loops=99)
TID Cond: (ctid = t1.ctid)
Planning Time: 0.119 ms
Execution Time: 8.485 ms
(8 rows)
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.