Hacking PostgreSQL 内核系列之三
1、一个分区表的问题
假如我们有一个分区表叫parent,里面只有一列数据,int类型的id。然后它是由三个分区表组成的--child1、child2和child3。他们分别是id从[0, 250),[250,500)以及[-10000,0)。
当我执行这个命令的时候:SELECT * FROM parent ORDER BY id;数据库会有这样的执行方式:
这里我child1和child2都加了索引,但是child3没有。所以是child3上面先进行一个排序,然后index扫描child1和child2,然后再用一个merge append节点来把它们的结果汇集起来。
但是不对啊。我们的分区表是分段分区的。child1里面所有的东西一定都是小于child2的。child3里面的东西一定都是小于其他两个分区的。本来里面的都是已经排好序了(index scan其实就是一种排序),为什么后需要用merge append对已经排好序的结果再排一次呢?merge append本身是使用priority tree来实现的,还是有点开销的。这里我们应该可以用一个简单的append节点来代替。
2、让我们来修改内核吧
首先我们要注意到这个问题是在排序的才会产生的。那么自然而然,我们就需要想办法在优化器中识别出这个状态。
什么样的情况可以增加我们想要的优化呢?我们要从一个分区表里面去排序地选择东西。这个分区表呢,是一个range partition table,而且它的每一个分区都没有子分区(因为那样实现起来太麻烦了)。
那么我们在优化器中该如何判断呢?首先应该想到的是PartitionBoundInfo这个结构体。它描述了分区表中每一个分区的上下限的关系,还有这个分区表的分区类型。
那么应该从哪儿下手呢?对源代码进行一点点搜索,你会发现这个函数:generate_mergeappend_paths。毕竟我们要把一个merge append的node变成一个append的node,从这个函数下手应该是合理的。
let's see some codes.
static void
generate_mergeappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels,
List *all_child_pathkeys,
List *partitioned_rels)
{
ListCell *lcp;
bool is_range_partitioned = false;
List *partition_keys = NIL;
List *partition_keys_desc = NIL;
PartitionBoundInfo bound_info = rel->boundinfo;
if (bound_info->strategy == PARTITION_STRATEGY_RANGE)
{
is_range_partitioned = true;
}
if (is_range_partitioned == true)
{
partition_keys = generate_partition_pathkeys(root, rel, ForwardScanDirection);
partition_keys_desc = generate_partition_pathkeys(root, rel, BackwardScanDirection);
}
foreach(lcp, all_child_pathkeys)
{
List *pathkeys = (List *) lfirst(lcp);
List *startup_subpaths = NIL;
List *total_subpaths = NIL;
bool startup_neq_total = false;
ListCell *lcr;
bool try_append = false;
bool try_append_desc = false;
if (is_range_partitioned)
{
if (pathkeys_contained_in(pathkeys, partition_keys))
{
try_append = true;
}
if (pathkeys_contained_in(pathkeys, partition_keys_desc))
{
try_append_desc = true;
}
}
/* Select the child paths for this ordering... */
foreach(lcr, live_childrels)
{
RelOptInfo *childrel = (RelOptInfo *) lfirst(lcr);
Path *cheapest_startup,
*cheapest_total;
/* Locate the right paths, if they are available. */
cheapest_startup =
get_cheapest_path_for_pathkeys(childrel->pathlist,
pathkeys,
NULL,
STARTUP_COST,
false);
cheapest_total =
get_cheapest_path_for_pathkeys(childrel->pathlist,
pathkeys,
NULL,
TOTAL_COST,
false);
/*
* If we can't find any paths with the right order just use the
* cheapest-total path; we'll have to sort it later.
*/
if (cheapest_startup == NULL || cheapest_total == NULL)
{
cheapest_startup = cheapest_total =
childrel->cheapest_total_path;
/* Assert we do have an unparameterized path for this child */
Assert(cheapest_total->param_info == NULL);
}
/*
* Notice whether we actually have different paths for the
* "cheapest" and "total" cases; frequently there will be no point
* in two create_merge_append_path() calls.
*/
if (cheapest_startup != cheapest_total)
startup_neq_total = true;
accumulate_append_subpath(cheapest_startup,
&startup_subpaths, NULL);
accumulate_append_subpath(cheapest_total,
&total_subpaths, NULL);
}
//这里先不管desc的情况了。先让程序能够跑起来。
if (try_append || try_append_desc)
{
add_path(rel, (Path *)create_append_path(root,
rel,
startup_subpaths,
NIL,
pathkeys,
NULL,
,
false,
partitioned_rels,
-1 ));
if (startup_neq_total)
{
add_path(rel, (Path *)create_append_path(root,
rel,
total_subpaths,
NIL,
pathkeys,
NULL,
,
false,
partitioned_rels,
-1));
}
} else
{
/* ... and build the MergeAppend paths */
add_path(rel, (Path *) create_merge_append_path(root,
rel,
startup_subpaths,
pathkeys,
NULL,
partitioned_rels));
if (startup_neq_total)
add_path(rel, (Path *) create_merge_append_path(root,
rel,
total_subpaths,
pathkeys,
NULL,
partitioned_rels));
}
}
}
相关文章