[Postgres-XL] The Mystery of Parallel Query

2022-05-06 00:00:00 query, features, parallel, multiple, enhancements

Parallel query on standalone PostgreSQL

With parallel query enabled, look at the query plan:

Gather  (cost=1.65..1419.84 rows=325 width=452) (actual time=50.680..218.201 rows=36004 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Append  (cost=1.65..1419.84 rows=137 width=452) (actual time=6.576..155.120 rows=12001 loops=3)
        ->  Parallel Bitmap Heap Scan on china_roads_partition_0  (cost=1.65..3.17 rows=1 width=488) (actual time=0.288..0.288 rows= loops=3)
        ->  Parallel Bitmap Heap Scan on china_roads_partition_1  (cost=1.65..3.17 rows=1 width=488) (actual time=0.288..0.288 rows= loops=3)
        ->  Parallel Bitmap Heap Scan on china_roads_partition_2  (cost=1.65..3.17 rows=1 width=488) (actual time=0.288..0.288 rows= loops=3)
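A quick way to tell whether a plan like the one above really ran in parallel is to compare the "Workers Planned" and "Workers Launched" lines under the Gather node. The following is a minimal sketch that pulls both numbers out of EXPLAIN ANALYZE text; the function name `parallel_summary` and the shortened sample plan are illustrative assumptions, not part of the original test.

```python
import re

def parallel_summary(plan_text):
    """Return (planned, launched) worker counts from EXPLAIN ANALYZE text.

    Returns None when the plan has no "Workers Planned" line, i.e. no
    Gather node was reported and the plan is not parallel.
    """
    planned = re.search(r"Workers Planned:\s*(\d+)", plan_text)
    launched = re.search(r"Workers Launched:\s*(\d+)", plan_text)
    if planned is None:
        return None
    launched_count = int(launched.group(1)) if launched else 0
    return int(planned.group(1)), launched_count

# Shortened copy of the plan header shown above (illustrative only).
plan = """Gather  (cost=1.65..1419.84 rows=325 width=452)
  Workers Planned: 2
  Workers Launched: 2
  ->  Append  (cost=1.65..1419.84 rows=137 width=452)"""

print(parallel_summary(plan))
```

If the launched count is lower than the planned count, the server probably ran out of free worker slots at execution time.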

Listing the processes on the command line shows the newly created workers, so this is parallel query without a doubt:

postgres  4523  103  0.0 3468548 14856 ?       Rs   14:47   0:03 postgres: bgworker: parallel worker for PID 4512   
postgres  4524  104  0.0 3468684 14468 ?       Rs   14:47   0:03 postgres: bgworker: parallel worker for PID 4512
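The same check can be scripted. This is a rough sketch that counts parallel workers per leader backend by matching the "parallel worker for PID" text in the process title; the helper name `workers_per_leader` is an assumption for illustration, and the sample lines are copied from the listing above with whitespace trimmed.

```python
import re
from collections import Counter

# Matches the process title of a parallel worker and captures the leader PID.
WORKER_RE = re.compile(r"parallel worker for PID (\d+)")

def workers_per_leader(ps_output):
    """Map each leader backend PID to the number of parallel workers serving it."""
    counts = Counter()
    for line in ps_output.splitlines():
        match = WORKER_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return dict(counts)

sample = (
    "postgres  4523  103  0.0 3468548 14856 ?  Rs  14:47  0:03 "
    "postgres: bgworker: parallel worker for PID 4512\n"
    "postgres  4524  104  0.0 3468684 14468 ?  Rs  14:47  0:03 "
    "postgres: bgworker: parallel worker for PID 4512"
)

print(workers_per_leader(sample))
```

An empty result while a query is running means no parallel workers were started, which is what the datanodes show later in this post.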

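But when the same SQL is run on PGXL, it does not run in parallel.

Parallel query in Postgres-XL
Parallelism can be split into two levels:
1. Parallel query across datanodes
2. Parallel query by multiple workers created on each datanode (in practice, however, this level turned out not to be enabled)

In PGXL, the Coordinator and the DataNodes are all PostgreSQL instances, and I made sure the parallel-query parameters in postgresql.conf were identical on all of them.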
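One way to double-check that the settings really match is to diff the parallel-query parameters of two postgresql.conf files. The sketch below does this on the file contents; the parameter list covers the PostgreSQL 10 settings that commonly affect parallel plans, the helper names are hypothetical, and the sample values are made up for illustration. It assumes the simple `name = value` form and ignores include directives.

```python
# PostgreSQL 10 settings that influence whether parallel plans are chosen.
PARALLEL_PARAMS = (
    "max_worker_processes",
    "max_parallel_workers",
    "max_parallel_workers_per_gather",
    "parallel_setup_cost",
    "parallel_tuple_cost",
    "min_parallel_table_scan_size",
    "min_parallel_index_scan_size",
)

def read_params(conf_text):
    """Extract the parallel-related settings from postgresql.conf text."""
    values = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip()
        if name in PARALLEL_PARAMS:
            values[name] = value.strip().strip("'")  # a later line overrides an earlier one
    return values

def diff_params(conf_a, conf_b):
    """Return {name: (value_a, value_b)} for every setting that differs."""
    a, b = read_params(conf_a), read_params(conf_b)
    return {n: (a.get(n), b.get(n)) for n in PARALLEL_PARAMS if a.get(n) != b.get(n)}

# Hypothetical file contents, not taken from the cluster in this post.
coord_conf = "max_parallel_workers_per_gather = 2  # per Gather\nmax_worker_processes = 8\n"
datanode_conf = "max_parallel_workers_per_gather = 0\nmax_worker_processes = 8\n"

print(diff_params(coord_conf, datanode_conf))
```

A value of None on one side means the parameter is not set in that file, so the node falls back to its built-in default.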

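Coordinator query plan
The Coordinator handles SQL statements from applications, determines which Datanodes should be involved, generates a local SQL statement for each Datanode, and then Remote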

postgres 32338 16450  0 11:23 ?        00:00:00 postgres: postgres postgres 1xx.16.1.3(61885) SELECT
postgres 32524 16558 12 11:38 ?        00:00:03 postgres: postgres postgres 1xx.16.1.10(54738) REMOTE SUBPLAN (coord1:32338) (C:coord1:32338)
postgres 32526 16557 11 11:38 ?        00:00:02 postgres: postgres postgres 1xx.16.1.10(55352) REMOTE SUBPLAN (coord1:32338) (C:coord1:32338)
postgres 32528 16450  0 11:38 ?        00:00:00 postgres: postgres postgres 1xx.16.1.3(62120) idle

But the datanodes did not start multiple worker processes for parallel query!
Could it be that Postgres-XL-10r1 still does not make good use of PostgreSQL's parallel query?

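Digging through the documentation turned up the following:
Postgres-XL 10r1 is the next major release after Postgres-XL 9.5r1. This release therefore includes most of the major enhancements that went into the PostgreSQL 9.6 and 10 releases. Below is a short list of such enhancements, but all other enhancements also apply unless otherwise stated.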

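E.2.3.1. Major Enhancements in PostgreSQL 10

Declarative table partitioning

Improved query parallelism

Significant general performance improvements

Improved monitoring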

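E.2.3.2. Major Enhancements in PostgreSQL 9.6

Parallel execution of sequential scans, joins and aggregates

Avoid scanning pages unnecessarily during vacuum freeze operations

Substantial performance improvements, especially in the area of scalability on multi-CPU-socket servers

Full-text search can now search for phrases (multiple adjacent words)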

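Trying a hash join (Hash Join), parallel query did indeed kick in.

The Gather runs on the Coordinator. The Coordinator opens additional Remote connections, which causes the datanodes to start additional postgres processes that run the query in parallel.

Standalone PostgreSQL, by contrast, runs the Gather locally and parallelizes by adding worker processes.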

postgres 14390  4176  0 09:33 ?        00:00:03 postgres: postgres postgres 1xx.16.1.3(55061) idle
postgres 14392  4284  0 09:33 ?        00:00:02 postgres: postgres postgres 1xx.16.1.1x(36192) idle
postgres 14394  4283  0 09:33 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(36806) idle
postgres 14448  4176  0 09:39 ?        00:00:00 postgres: postgres postgres 1xx.16.1.3(55281) idle
postgres 14514  4176  0 09:39 ?        00:00:00 postgres: postgres postgres 1xx.16.1.3(55283) idle
postgres 14697  4176  0 09:39 ?        00:00:00 postgres: postgres postgres 1xx.16.1.3(55285) idle
postgres 15245  4284  0 09:41 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(36904) idle
postgres 15247  4283  0 09:41 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(37518) idle
postgres 15249  4284  0 09:41 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(36912) idle
postgres 15251  4283  0 09:41 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(37526) idle
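To see how those extra backends are distributed, the listing above can be grouped by parent PID, since every backend is forked by the postmaster of the node it belongs to. This is a small sketch assuming `ps -ef` column order (UID, PID, PPID, ...); the helper name `backends_per_postmaster` is hypothetical, the sample is a subset of the lines above, and which PPID belongs to which node is not stated in the listing.

```python
from collections import Counter

def backends_per_postmaster(ps_ef_output):
    """Count postgres backends per parent PID in `ps -ef` style output."""
    counts = Counter()
    for line in ps_ef_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[2].isdigit():
            continue  # skip header lines and malformed rows
        if fields[7].startswith("postgres:"):
            counts[int(fields[2])] += 1
    return dict(counts)

# Subset of the process listing above.
sample = """postgres 14390  4176  0 09:33 ?        00:00:03 postgres: postgres postgres 1xx.16.1.3(55061) idle
postgres 14392  4284  0 09:33 ?        00:00:02 postgres: postgres postgres 1xx.16.1.1x(36192) idle
postgres 14394  4283  0 09:33 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(36806) idle
postgres 15245  4284  0 09:41 ?        00:00:01 postgres: postgres postgres 1xx.16.1.1x(36904) idle"""

print(backends_per_postmaster(sample))
```

Comparing the counts before and after running the query shows which postmasters gained backends from the Coordinator's extra connections.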

Then I tried an aggregate. Well, no parallelism there.

————————————————

Original link: https://blog.csdn.net/qq_42158942/article/details/109748140
