PostgreSQL 源码解读(34)- 查询语句#19(查询优化-表达式预处理#4)

本节简单介绍了PG查询优化表达式预处理中的生成子链接执行计划、使用Param替换上层Vars以及转换表达式为隐式AND格式(implicit-AND format)。

一、主函数

主函数preprocess_expression先前章节也介绍过,在此函数中调用了生成子链接执行计划、使用Param替换上层Vars以及转换表达式为隐式AND格式(implicit-AND format)等相关子函数。
preprocess_expression

 /*
  * preprocess_expression
  *      Do subquery_planner's preprocessing work for an expression,
  *      which can be a targetlist, a WHERE clause (including JOIN/ON
  *      conditions), a HAVING clause, or a few other things.
  */
 static Node *
 preprocess_expression(PlannerInfo *root, Node *expr, int kind)
 {
     //...
 
     /* Expand SubLinks to SubPlans */
     if (root->parse->hasSubLinks)//扩展子链接为子计划
         expr = SS_process_sublinks(root, expr, (kind == EXPRKIND_QUAL));
 
     /*
      * XXX do not insert anything here unless you have grokked the comments in
      * SS_replace_correlation_vars ...
      */
 
     /* Replace uplevel vars with Param nodes (this IS possible in VALUES) */
     if (root->query_level > 1)
         expr = SS_replace_correlation_vars(root, expr);//使用Param节点替换上层的Vars
 
     /*
      * If it's a qual or havingQual, convert it to implicit-AND format. (We
      * don't want to do this before eval_const_expressions, since the latter
      * would be unable to simplify a top-level AND correctly. Also,
      * SS_process_sublinks expects explicit-AND format.)
      */
     if (kind == EXPRKIND_QUAL)//转换为隐式AND格式
         expr = (Node *) make_ands_implicit((Expr *) expr);
 
     return expr;
 }

二、生成子链接执行计划

先前的章节已介绍了上拉子链接的相关处理过程,对于不能上拉的子链接,PG会生成子执行计划.对于会生成常量的子链接,则会把生成的常量记录在Param中,在需要的时候由父查询使用.
例1:以下的子链接,PG会生成子计划,并把子链接的结果物化(Materialize)提升整体性能.

testdb=# explain verbose select * from t_dwxx where dwbh > all (select b.dwbh from t_grxx b);
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Seq Scan on public.t_dwxx  (cost=0.00..1498.00 rows=80 width=474)
   Output: t_dwxx.dwmc, t_dwxx.dwbh, t_dwxx.dwdz
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Materialize  (cost=0.00..17.35 rows=490 width=38)
           Output: b.dwbh
           ->  Seq Scan on public.t_grxx b  (cost=0.00..14.90 rows=490 width=38)
                 Output: b.dwbh
(8 rows)

例2:以下的子链接,PG会把生成的常量记录在Param中(注意生成的参数:$0)

testdb=# explain verbose select * from t_dwxx a where exists (select  max(b.dwbh) from t_grxx b);
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Result  (cost=16.14..27.73 rows=160 width=474)
   Output: a.dwmc, a.dwbh, a.dwdz
   One-Time Filter: $0
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=16.12..16.14 rows=1 width=32)
           Output: max((b.dwbh)::text)
           ->  Seq Scan on public.t_grxx b  (cost=0.00..14.90 rows=490 width=38)
                 Output: b.dwbh, b.grbh, b.xm, b.nl
   ->  Seq Scan on public.t_dwxx a  (cost=16.14..27.73 rows=160 width=474)
         Output: a.dwmc, a.dwbh, a.dwdz
(10 rows)

源代码如下:
SS_process_sublinks

/*
  * Expand SubLinks to SubPlans in the given expression.
  *
  * The isQual argument tells whether or not this expression is a WHERE/HAVING
  * qualifier expression.  If it is, any sublinks appearing at top level need
  * not distinguish FALSE from UNKNOWN return values.
  */
 Node *
 SS_process_sublinks(PlannerInfo *root, Node *expr, bool isQual)
 {
     process_sublinks_context context;
 
     context.root = root;
     context.isTopQual = isQual;
     return process_sublinks_mutator(expr, &context);//调用XX_mutator函数遍历并处理
 }
 
 static Node *
 process_sublinks_mutator(Node *node, process_sublinks_context *context)
 {
     process_sublinks_context locContext;
 
     locContext.root = context->root;
 
     if (node == NULL)
         return NULL;
     if (IsA(node, SubLink))//子链接
     {
         SubLink    *sublink = (SubLink *) node;
         Node       *testexpr;
 
         /*
          * First, recursively process the lefthand-side expressions, if any.
          * They're not top-level anymore.
          */
         locContext.isTopQual = false;
         testexpr = process_sublinks_mutator(sublink->testexpr, &locContext);
 
         /*
          * Now build the SubPlan node and make the expr to return.
          */
         return make_subplan(context->root,
                             (Query *) sublink->subselect,
                             sublink->subLinkType,
                             sublink->subLinkId,
                             testexpr,
                             context->isTopQual);//生成子执行计划,与整体的执行计划类似
     }
 
     /*
      * Don't recurse into the arguments of an outer PHV or aggregate here. Any
      * SubLinks in the arguments have to be dealt with at the outer query
      * level; they'll be handled when build_subplan collects the PHV or Aggref
      * into the arguments to be passed down to the current subplan.
      */
     if (IsA(node, PlaceHolderVar))
     {
         if (((PlaceHolderVar *) node)->phlevelsup > 0)
             return node;
     }
     else if (IsA(node, Aggref))
     {
         if (((Aggref *) node)->agglevelsup > 0)
             return node;
     }
 
     /*
      * We should never see a SubPlan expression in the input (since this is
      * the very routine that creates 'em to begin with).  We shouldn't find
      * ourselves invoked directly on a Query, either.
      */
     Assert(!IsA(node, SubPlan));
     Assert(!IsA(node, AlternativeSubPlan));
     Assert(!IsA(node, Query));
 
     /*
      * Because make_subplan() could return an AND or OR clause, we have to
      * take steps to preserve AND/OR flatness of a qual.  We assume the input
      * has been AND/OR flattened and so we need no recursion here.
      *
      * (Due to the coding here, we will not get called on the List subnodes of
      * an AND; and the input is *not* yet in implicit-AND format.  So no check
      * is needed for a bare List.)
      *
      * Anywhere within the top-level AND/OR clause structure, we can tell
      * make_subplan() that NULL and FALSE are interchangeable.  So isTopQual
      * propagates down in both cases.  (Note that this is unlike the meaning
      * of "top level qual" used in most other places in Postgres.)
      */
     if (and_clause(node))//AND语句
     {
         List       *newargs = NIL;
         ListCell   *l;
 
         /* Still at qual top-level */
         locContext.isTopQual = context->isTopQual;
 
         foreach(l, ((BoolExpr *) node)->args)
         {
             Node       *newarg;
 
             newarg = process_sublinks_mutator(lfirst(l), &locContext);
             if (and_clause(newarg))
                 newargs = list_concat(newargs, ((BoolExpr *) newarg)->args);
             else
                 newargs = lappend(newargs, newarg);
         }
         return (Node *) make_andclause(newargs);
     }
 
     if (or_clause(node))//OR语句
     {
         List       *newargs = NIL;
         ListCell   *l;
 
         /* Still at qual top-level */
         locContext.isTopQual = context->isTopQual;
 
         foreach(l, ((BoolExpr *) node)->args)
         {
             Node       *newarg;
 
             newarg = process_sublinks_mutator(lfirst(l), &locContext);
             if (or_clause(newarg))
                 newargs = list_concat(newargs, ((BoolExpr *) newarg)->args);
             else
                 newargs = lappend(newargs, newarg);
         }
         return (Node *) make_orclause(newargs);
     }
 
     /*
      * If we recurse down through anything other than an AND or OR node, we
      * are definitely not at top qual level anymore.
      */
     locContext.isTopQual = false;
 
     return expression_tree_mutator(node,
                                    process_sublinks_mutator,
                                    (void *) &locContext);
 }

三、使用Param替换上层变量

SQL例子参考"二、生成子链接执行计划"中的例2,这也是使用Param替代Var的一个例子.
源代码如下:

 /*
  * Replace correlation vars (uplevel vars) with Params.
  *
  * Uplevel PlaceHolderVars and aggregates are replaced, too.
  *
  * Note: it is critical that this runs immediately after SS_process_sublinks.
  * Since we do not recurse into the arguments of uplevel PHVs and aggregates,
  * they will get copied to the appropriate subplan args list in the parent
  * query with uplevel vars not replaced by Params, but only adjusted in level
  * (see replace_outer_placeholdervar and replace_outer_agg).  That's exactly
  * what we want for the vars of the parent level --- but if a PHV's or
  * aggregate's argument contains any further-up variables, they have to be
  * replaced with Params in their turn. That will happen when the parent level
  * runs SS_replace_correlation_vars.  Therefore it must do so after expanding
  * its sublinks to subplans.  And we don't want any steps in between, else
  * those steps would never get applied to the argument expressions, either in
  * the parent or the child level.
  *
  * Another fairly tricky thing going on here is the handling of SubLinks in
  * the arguments of uplevel PHVs/aggregates.  Those are not touched inside the
  * intermediate query level, either.  Instead, SS_process_sublinks recurses on
  * them after copying the PHV or Aggref expression into the parent plan level
  * (this is actually taken care of in build_subplan).
  */
 Node *
 SS_replace_correlation_vars(PlannerInfo *root, Node *expr)
 {
     /* No setup needed for tree walk, so away we go */
     //调用XX_mutator遍历处理
     return replace_correlation_vars_mutator(expr, root);
 }
 
 static Node *
 replace_correlation_vars_mutator(Node *node, PlannerInfo *root)
 {
     if (node == NULL)
         return NULL;
     if (IsA(node, Var))//Var
     {
         if (((Var *) node)->varlevelsup > 0)
             return (Node *) replace_outer_var(root, (Var *) node);//使用Param替换
     }
     if (IsA(node, PlaceHolderVar))
     {
         if (((PlaceHolderVar *) node)->phlevelsup > 0)
             return (Node *) replace_outer_placeholdervar(root,
                                                          (PlaceHolderVar *) node);
     }
     if (IsA(node, Aggref))
     {
         if (((Aggref *) node)->agglevelsup > 0)
             return (Node *) replace_outer_agg(root, (Aggref *) node);
     }
     if (IsA(node, GroupingFunc))
     {
         if (((GroupingFunc *) node)->agglevelsup > 0)
             return (Node *) replace_outer_grouping(root, (GroupingFunc *) node);
     }
     return expression_tree_mutator(node,
                                    replace_correlation_vars_mutator,
                                    (void *) root);
 }

 /*
  * Generate a Param node to replace the given Var,
  * which is expected to have varlevelsup > 0 (ie, it is not local).
  */
 static Param *
 replace_outer_var(PlannerInfo *root, Var *var)//构造Param替换Var
 {
     Param      *retval;
     int         i;
 
     Assert(var->varlevelsup > 0 && var->varlevelsup < root->query_level);
 
     /* Find the Var in the appropriate plan_params, or add it if not present */
     i = assign_param_for_var(root, var);
 
     retval = makeNode(Param);
     retval->paramkind = PARAM_EXEC;
     retval->paramid = i;
     retval->paramtype = var->vartype;
     retval->paramtypmod = var->vartypmod;
     retval->paramcollid = var->varcollid;
     retval->location = var->location;
 
     return retval;
 }

四、转换表达式为隐式AND格式

源码如下:

 List *
 make_ands_implicit(Expr *clause)
 {
     /*
      * NB: because the parser sets the qual field to NULL in a query that has
      * no WHERE clause, we must consider a NULL input clause as TRUE, even
      * though one might more reasonably think it FALSE.  Grumble. If this
      * causes trouble, consider changing the parser's behavior.
      */
     if (clause == NULL)//如为NULL,返回空指针
         return NIL;             /* NULL -> NIL list == TRUE */
     else if (and_clause((Node *) clause))//AND语句,直接返回AND中的args参数
         return ((BoolExpr *) clause)->args;
     else if (IsA(clause, Const) &&
              !((Const *) clause)->constisnull &&
              DatumGetBool(((Const *) clause)->constvalue))
         return NIL;             /* 常量TRUE ,返回空指针constant TRUE input -> NIL list */
     else
         return list_make1(clause);//返回List
 }

五、参考资料

subselect.c
clauses.c

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 214,922评论 6 497
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 91,591评论 3 389
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 160,546评论 0 350
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 57,467评论 1 288
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,553评论 6 386
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,580评论 1 293
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,588评论 3 414
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 38,334评论 0 270
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,780评论 1 307
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 37,092评论 2 330
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,270评论 1 344
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,925评论 5 338
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,573评论 3 322
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,194评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,437评论 1 268
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 47,154评论 2 366
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 44,127评论 2 352

推荐阅读更多精彩内容