目录结构
注:提前言明 本文借鉴了以下博主、书籍或网站的内容,其列表如下:
1、参考书籍:《PostgreSQL数据库内核分析》
2、参考书籍:《数据库事务处理的艺术:事务管理与并发控制》
3、PostgreSQL数据库仓库链接,点击前往
4、日本著名PostgreSQL数据库专家 铃木启修 网站主页,点击前往
5、参考书籍:《PostgreSQL指南:内幕探索》,点击前往
6、参考书籍:《事务处理 概念与技术》
7、auto_explain pg官方手册在线文档,点击前往
8、auto_explain 中文手册在线文档,点击前往
1、本文内容全部来源于开源社区 GitHub和以上博主的贡献,本文也免费开源(可能会存在问题,评论区等待大佬们的指正)
2、本文目的:开源共享 抛砖引玉 一起学习
3、本文不提供任何资源 不存在任何交易 与任何组织和机构无关
4、大家可以根据需要自行 复制粘贴以及作为其他个人用途,但是不允许转载 不允许商用 (写作不易,还请见谅 💖)
5、本文内容基于PostgreSQL master源码开发而成
auto_explain — log execution plans of slow queries
- 文章快速说明索引
- 功能实现背景说明
- 功能实现源码解析
文章快速说明索引
学习目标:
做数据库内核开发久了就会有一种 少年得志,年少轻狂 的错觉,然鹅细细一品觉得自己其实不算特别优秀 远远没有达到自己想要的。也许光鲜的表面掩盖了空洞的内在,每每想到于此,皆有夜半临渊如履薄冰之感。为了睡上几个踏实觉,即日起 暂缓其他基于PostgreSQL数据库的兼容功能开发,近段时间 将着重于学习分享Postgres的基础知识和实践内幕。
学习内容:(详见目录)
1、auto_explain — log execution plans of slow queries
学习时间:
2024年10月20日 16:20:13
学习产出:
1、PostgreSQL数据库基础知识回顾 1个
2、CSDN 技术博客 1篇
3、PostgreSQL数据库内核深入学习
注:下面我们所有的学习环境是Centos8+PostgreSQL master +Oracle19C+MySQL8.0
postgres=# select version();version
------------------------------------------------------------------------------------------------------------PostgreSQL 18devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-21), 64-bit
(1 row)postgres=##-----------------------------------------------------------------------------#SQL> select * from v$version; BANNER Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
BANNER_FULL Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production Version 19.17.0.0.0
BANNER_LEGACY Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
CON_ID 0#-----------------------------------------------------------------------------#mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.27 |
+-----------+
1 row in set (0.06 sec)mysql>
功能实现背景说明
当我们向 PostgreSQL 发送一个 SQL 语句时,该语句的执行方式如下图所示:
当使用 EXPLAIN 命令时,PostgreSQL 仅返回估计执行计划,即优化器认为对提供的 SQL 语句最有效的执行计划 (该语句并未真正执行)。另一方面,如果我们运行 EXPLAIN ANALYZE (PostgreSQL 会运行该语句),因此我们将获得实际执行计划,其中还包含执行计划中每个操作的时间信息。但是在调查生产系统上的慢速查询时,我们可能会面临几个挑战:
- 出于安全原因,我们可能不被允许在生产系统上运行查询,因此,在这种情况下,我们不能简单地运行 EXPLAIN ANALYZE 命令来获取实际执行计划
- 即使我们有权运行 EXPLAIN ANALYZE 命令,我们也可能会观察到与客户抱怨的计划不同的计划
auto_explain模块提供了一种方式来自动记录慢速语句的执行计划,而不需要手工运行EXPLAIN。这在大型应用中追踪未被优化的查询时有用。
该模块没有提供 SQL 可访问的函数。要使用它,简单地将它载入服务器。你可以把它载入到一个单独的会话:
LOAD 'auto_explain';
你必须作为超级用户来这样做。更典型的用法是通过在postgresql.conf
的session_preload_libraries
或shared_preload_libraries
参数中包括auto_explain
将它预先载入到某些或者所有会话中。然后你就可以追踪那些出乎意料地慢的查询,而不管它们何时发生。当然为此会付出一些额外的负荷作为代价。
上面是官方文档的描述,但是通常情况下使用它有更明显的原因:
您想知道 PostgreSQL 查询为什么很慢吗?那么
EXPLAIN ANALYZE
是一个很好的起点。正如上面所说,查询可能依赖于其他服务器活动,可能需要一段时间才能运行,并且可能会随时间而变化。
因此,分析慢查询的更好解决方案是获取 PostgreSQL 在运行相关查询时使用的实际执行计划。
于是,如果您想查看最慢查询的实际执行计划,auto_explain就是您需要的工具。
在本人的数据库内核开发中auto_explain
的使用非常多 但是从来没有深入看一下其源码实现。接下来 我们将研究它的作用、如何配置它,如何使用这些日志来加快查询速度,以及它的源码解析。
它的使用和配置都非常简单,官方文档和其他人的博客都已经介绍地很详细,我这里不再赘述。直接上干货:
该插件代码非常简单,只有auto_explain.c
编出的动态库 因此只需加载即可使用!
功能实现源码解析
原理上分为两部分:GUC和HOOK 如下:
// contrib/auto_explain/auto_explain.c/** Module load callback*/
void
_PG_init(void)
{/* Define custom GUC variables. */...MarkGUCPrefixReserved("auto_explain");/* Install hooks. */prev_ExecutorStart = ExecutorStart_hook;ExecutorStart_hook = explain_ExecutorStart;prev_ExecutorRun = ExecutorRun_hook;ExecutorRun_hook = explain_ExecutorRun;prev_ExecutorFinish = ExecutorFinish_hook;ExecutorFinish_hook = explain_ExecutorFinish;prev_ExecutorEnd = ExecutorEnd_hook;ExecutorEnd_hook = explain_ExecutorEnd;
}
参数 | PostgreSQL 默认值 | 建议设置(根据需要) |
---|---|---|
auto_explain.log_min_duration | -1 | 100 |
auto_explain.log_parameter_max_length | -1 | -1 |
auto_explain.log_analyze | Off | On |
auto_explain.log_timing | On (with log_analyze) | On |
auto_explain.log_buffers | Off | On |
auto_explain.log_verbose | Off | On |
auto_explain.log_triggers | Off | Off |
auto_explain.log_nested_statements | Off | Off |
auto_explain.log_settings (v12) | Off | Off |
auto_explain.log_wal (v13) | Off | Off |
auto_explain.log_format | TEXT | JSON |
auto_explain.log_level | LOG | LOG |
auto_explain.sample_rate | 1 | 1 |
这些参数的作用不再赘述 根据需要自行设置即可。接下看一下相关HOOK,如下:
ExecutorStart_hook
A hook called at the beginning of any execution of any query plan
ExecutorStart_hook is a hook which is called at the beginning of any execution of any query plan.ExecutorStart_hook was added in PostgreSQL 8.4.
ExecutorRun_hook
A hook which is called at any plan execution, after ExecutorStart
ExecutorRun_hook is a hook which is called at any plan execution, after ExecutorStart.ExecutorRun_hook was added in PostgreSQL 8.4.
ExecutorFinish_hook
A hook called after the last ExecutorRun call
ExecutorFinish_hook is a hook which is called after the last ExecutorRun call.ExecutorFinish_hook was added in PostgreSQL 9.1.
ExecutorEnd_hook
A hook which is called at the end of execution of any query plan
ExecutorEnd_hook is a hook which is called at the end of execution of any query plan.ExecutorEnd_hook was added in PostgreSQL 8.4.
若是有小伙伴对其他HOOK感兴趣,可以看一下本人之前的博客:
- PostgreSQL的学习心得和知识总结(八十六)|深入理解PostgreSQL数据库HOOK技术及开源hooks文档介绍,点击前往
这段时间偶然看到了HashData的一篇博客,里面扼要地介绍了执行器的操作流程,如下:
在数据库内核层面,执行器的操作流程可以概括为四个关键阶段,它们依次是:
ExecutorStart
、ExecutorRun
、ExecutorFinish
和ExecutorEnd
。这四个阶段在执行器和算子之间建立了紧密的关联,并各自扮演着不同的角色。
注:对此感兴趣的小伙伴可以看一下这篇博客:
- PostgreSQL技术内幕(十六)如何写一个执行器算子?
一、
ExecutorStart
:主要负责初始化各个算子的状态。以SQL语句select * from table order by i limit 2;
为例:
- ExecutorStart会首先创建一个包含所有执行所需信息的执行器状态(Estate)
- 随后,通过InitPlan来初始化Plan State树,为接下来的执行做好准备。在这个过程中,ExecInitNode函数发挥着关键作用,它根据节点的类型(如limit、sort或scan)进行相应的初始化操作。这个过程是层层递进的,确保每个节点或算子的信息和私有状态都被正确设置。
// src/backend/executor/execMain.c/* ----------------------------------------------------------------* ExecutorStart** This routine must be called at the beginning of any execution of any* query plan* 任何查询计划执行开始时都必须调用此例程** Takes a QueryDesc previously created by CreateQueryDesc (which is separate* only because some places use QueryDescs for utility commands). The tupDesc* field of the QueryDesc is filled in to describe the tuples that will be* returned, and the internal fields (estate and planstate) are set up.* 采用先前由 CreateQueryDesc 创建的 QueryDesc(之所以单独创建,是因为有些地方使用 QueryDesc 作为实用程序命令)。* QueryDesc 的 tupDesc 字段被填充以描述将返回的元组,并且内部字段(estate 和 planstate)被设置。** eflags contains flag bits as described in executor.h.** NB: the CurrentMemoryContext when this is called will become the parent* of the per-query context used for this Executor invocation.* 注意:调用此方法时,CurrentMemoryContext 将成为此 Executor 调用所使用的每个查询上下文的父级。** We provide a function hook variable that lets loadable plugins* get control when ExecutorStart is called. Such a plugin would* normally call standard_ExecutorStart().* 我们提供了一个函数钩子变量,让可加载插件在调用 ExecutorStart 时获得控制权。* 这样的插件通常会调用 standard_ExecutorStart()。** ----------------------------------------------------------------*/
void
ExecutorStart(QueryDesc *queryDesc, int eflags)
{/** In some cases (e.g. an EXECUTE statement or an execute message with the* extended query protocol) the query_id won't be reported, so do it now.** Note that it's harmless to report the query_id multiple times, as the* call will be ignored if the top level query_id has already been* reported.*/pgstat_report_query_id(queryDesc->plannedstmt->queryId, false);if (ExecutorStart_hook)(*ExecutorStart_hook) (queryDesc, eflags);elsestandard_ExecutorStart(queryDesc, eflags);
}void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{.../** Build EState, switch into per-query memory context for startup.*/estate = CreateExecutorState();queryDesc->estate = estate;.../** Initialize the plan state tree*/InitPlan(queryDesc, eflags);...
}
详细如下:
然后看一下这里的钩子函数,如下:
/** ExecutorStart hook: start up logging if needed*/
static void
explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
{/** At the beginning of each top-level statement, decide whether we'll* sample this statement. If nested-statement explaining is enabled,* either all nested statements will be explained or none will.* 在每个顶级语句的开头,决定是否要对此语句进行采样。* 如果启用了嵌套语句解释,则将解释所有嵌套语句,或者不解释任何嵌套语句。** When in a parallel worker, we should do nothing, which we can implement* cheaply by pretending we decided not to sample the current statement.* If EXPLAIN is active in the parent session, data will be collected and* reported back to the parent, and it's no business of ours to interfere.* 在并行工作进程中,我们不应该做任何事情,我们可以通过假装决定不对当前语句进行采样来廉价地实现这一点。* 如果 EXPLAIN 在父会话中处于活动状态,则将收集数据并报告回父会话,我们无权干涉。*/if (nesting_level == 0){if (auto_explain_log_min_duration >= 0 && !IsParallelWorker())current_query_sampled = (pg_prng_double(&pg_global_prng_state) < auto_explain_sample_rate); // 这块决定是否要采样elsecurrent_query_sampled = false;}if (auto_explain_enabled()){/* Enable per-node instrumentation iff log_analyze is required. */// 当且仅当需要 log_analyze 时,才启用每个节点的检测。if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0){if (auto_explain_log_timing)queryDesc->instrument_options |= INSTRUMENT_TIMER;elsequeryDesc->instrument_options |= INSTRUMENT_ROWS;if (auto_explain_log_buffers)queryDesc->instrument_options |= INSTRUMENT_BUFFERS;if (auto_explain_log_wal)queryDesc->instrument_options |= INSTRUMENT_WAL;}}if (prev_ExecutorStart)prev_ExecutorStart(queryDesc, eflags);elsestandard_ExecutorStart(queryDesc, eflags);if (auto_explain_enabled()){/** Set up to track total elapsed time in ExecutorRun. Make sure the* space is allocated in the per-query context so it will go away at* ExecutorEnd.* 设置以跟踪 ExecutorRun 中的总耗时。* 确保在每个查询上下文中分配空间,以便它在 ExecutorEnd 时消失。*/if (queryDesc->totaltime == NULL){MemoryContext oldcxt;oldcxt = MemoryContextSwitchTo(queryDesc->estate->es_query_cxt);queryDesc->totaltime = InstrAlloc(1, INSTRUMENT_ALL, false);MemoryContextSwitchTo(oldcxt);}}
}#define auto_explain_enabled() \(auto_explain_log_min_duration >= 0 && \(nesting_level == 0 || auto_explain_log_nested_statements) && \current_query_sampled)
小结一下上面函数的处理逻辑:
- 决定是否要采样
- 我们这里默认 auto_explain_enabled 是开启的,那么当启用auto_explain_log_analyze时,才启用每个节点的检测
- 执行其他HOOK或标准ExecutorStart
- #3 完成之后,设置跟踪 ExecutorRun 中的总耗时
注意第4步中的INSTRUMENT_ALL
,如下:
// src/include/executor/instrument.h/* Flag bits included in InstrAlloc's instrument_options bitmask */
typedef enum InstrumentOption
{INSTRUMENT_TIMER = 1 << 0, /* needs timer (and row counts) */INSTRUMENT_BUFFERS = 1 << 1, /* needs buffer usage */INSTRUMENT_ROWS = 1 << 2, /* needs row count */INSTRUMENT_WAL = 1 << 3, /* needs WAL usage */INSTRUMENT_ALL = PG_INT32_MAX
} InstrumentOption;
二、
ExecutorRun
:初始化完成后,执行器进入运行阶段,通过ExecutorRun来实现算子的运行。此阶段类似于一个外循环,不断从下游获取数据,直到数据全部处理完毕。这个过程主要是通过调用不同的访问方法来执行的,每个访问方法都对应一个函数指针。在初始化阶段,这些函数指针已被设置好,并在运行阶段被调用。
/* ----------------------------------------------------------------* ExecutorRun** This is the main routine of the executor module. It accepts* the query descriptor from the traffic cop and executes the* query plan.** ExecutorStart must have been called already.** If direction is NoMovementScanDirection then nothing is done* except to start up/shut down the destination. Otherwise,* we retrieve up to 'count' tuples in the specified direction.* 如果方向为 NoMovementScanDirection,则除了启动/关闭目的地外不执行任何操作。* 否则,我们将在指定方向上检索最多“count”个元组。** Note: count = 0 is interpreted as no portal limit, i.e., run to* completion. Also note that the count limit is only applied to* retrieved tuples, not for instance to those inserted/updated/deleted* by a ModifyTable plan node.* 注意:count = 0 表示没有入口限制,即运行至完成。* 还请注意,计数限制仅适用于检索到的元组,而不适用于由修改表计划节点插入/更新/删除的元组。** There is no return value, but output tuples (if any) are sent to* the destination receiver specified in the QueryDesc; and the number* of tuples processed at the top level can be found in* estate->es_processed. The total number of tuples processed in all* the ExecutorRun calls can be found in estate->es_total_processed.* 没有返回值,但输出元组(如果有)会发送到 QueryDesc 中指定的目标接收器;* 并且可以在 estate->es_processed 中找到在顶层处理的元组数量。* 可以在 estate->es_total_processed 中找到在所有 ExecutorRun 调用中处理的元组总数。** We provide a function hook variable that lets loadable plugins* get control when ExecutorRun is called. Such a plugin would* normally call standard_ExecutorRun().* 我们提供了一个函数钩子变量,让可加载插件在调用 ExecutorRun 时获得控制权。* 这样的插件通常会调用 standard_ExecutorRun()。** ----------------------------------------------------------------*/
void
ExecutorRun(QueryDesc *queryDesc,ScanDirection direction, uint64 count,bool execute_once)
{if (ExecutorRun_hook)(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);elsestandard_ExecutorRun(queryDesc, direction, count, execute_once);
}void
standard_ExecutorRun(QueryDesc *queryDesc,ScanDirection direction, uint64 count, bool execute_once)
{.../* Allow instrumentation of Executor overall runtime */if (queryDesc->totaltime)InstrStartNode(queryDesc->totaltime); // 记录开始时刻的...sendTuples = (operation == CMD_SELECT ||queryDesc->plannedstmt->hasReturning);if (sendTuples)dest->rStartup(dest, operation, queryDesc->tupDesc);.../** run plan*/if (!ScanDirectionIsNoMovement(direction)){if (execute_once && queryDesc->already_executed)elog(ERROR, "can't re-execute query flagged for single execution");queryDesc->already_executed = true;ExecutePlan(estate,queryDesc->planstate,queryDesc->plannedstmt->parallelModeNeeded,operation,sendTuples,count,direction,dest,execute_once);}.../** shutdown tuple receiver, if we started it*/if (sendTuples)dest->rShutdown(dest);if (queryDesc->totaltime)InstrStopNode(queryDesc->totaltime, estate->es_processed); // diff上面 记录差值...
}
注1:关于上面sendTuples
这块的这里不再赘述,有兴趣的小伙伴可以去去查看一下本人之前的博客:
- PostgreSQL的学习心得和知识总结(一百四十四)|深入理解PostgreSQL数据库之sendTuples的实现原理及功能修改,点击前往
注2:上面真正的核心ExecutePlan
函数的处理(循环直到我们处理完计划中适当数量的元组。),如下所示:
接下来看一下此处的钩子函数,如下:
/** ExecutorRun hook: all we need do is track nesting depth*/
static void
explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction,uint64 count, bool execute_once)
{nesting_level++;PG_TRY();{if (prev_ExecutorRun)prev_ExecutorRun(queryDesc, direction, count, execute_once);elsestandard_ExecutorRun(queryDesc, direction, count, execute_once);}PG_FINALLY();{nesting_level--;}PG_END_TRY();
}
这里比较简单:仅需要做的就是跟踪嵌套深度。
三、
ExecutorFinish
:为确保信息的完整性和后续分析的便利性,在ExecutorRun和ExecutorEnd之间,特别引入了ExecutorFinish阶段。在ExecutorFinish阶段,执行器会进行一些统计信息的收集、时间的记录以及相关的清理工作。
/* ----------------------------------------------------------------* ExecutorFinish** This routine must be called after the last ExecutorRun call.* It performs cleanup such as firing AFTER triggers. It is* separate from ExecutorEnd because EXPLAIN ANALYZE needs to* include these actions in the total runtime.* 此例程必须在最后一次 ExecutorRun 调用之后调用。* 它执行清理工作,例如触发 AFTER 触发器。* 它与 ExecutorEnd 是分开的,因为 EXPLAIN ANALYZE 需要将这些操作包含在总运行时中。** We provide a function hook variable that lets loadable plugins* get control when ExecutorFinish is called. Such a plugin would* normally call standard_ExecutorFinish().** ----------------------------------------------------------------*/
void
ExecutorFinish(QueryDesc *queryDesc)
{if (ExecutorFinish_hook)(*ExecutorFinish_hook) (queryDesc);elsestandard_ExecutorFinish(queryDesc);
}void
standard_ExecutorFinish(QueryDesc *queryDesc)
{EState *estate;MemoryContext oldcontext;/* sanity checks */Assert(queryDesc != NULL);estate = queryDesc->estate;Assert(estate != NULL);Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));/* This should be run once and only once per Executor instance */Assert(!estate->es_finished);/* Switch into per-query memory context */oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);/* Allow instrumentation of Executor overall runtime */if (queryDesc->totaltime)InstrStartNode(queryDesc->totaltime);/* Run ModifyTable nodes to completion */// 运行完修改表节点ExecPostprocessPlan(estate);/* Execute queued AFTER triggers, unless told not to */// 执行排队的 AFTER 触发器,除非被告知不要执行if (!(estate->es_top_eflags & EXEC_FLAG_SKIP_TRIGGERS))AfterTriggerEndQuery(estate);if (queryDesc->totaltime)InstrStopNode(queryDesc->totaltime, 0);MemoryContextSwitchTo(oldcontext);estate->es_finished = true;
}
和上面一样,ExecutorFinish 钩子:同样仅需要做的就是跟踪嵌套深度。
四、
ExecutorEnd
:执行器ExecutorEnd阶段,负责逐层结束下游节点的执行。这个过程是通过调用每个节点的结束函数(endplan)来实现的,该函数会识别到具体的节点类型,并调用相应的结束方法。在结束过程中,执行器会销毁每个节点的状态信息,释放资源。
/* ----------------------------------------------------------------* ExecutorEnd** This routine must be called at the end of execution of any* query plan* 必须在任何查询计划执行结束时调用此例程** We provide a function hook variable that lets loadable plugins* get control when ExecutorEnd is called. Such a plugin would* normally call standard_ExecutorEnd().** ----------------------------------------------------------------*/
void
ExecutorEnd(QueryDesc *queryDesc)
{if (ExecutorEnd_hook)(*ExecutorEnd_hook) (queryDesc);elsestandard_ExecutorEnd(queryDesc);
}void
standard_ExecutorEnd(QueryDesc *queryDesc)
{EState *estate;MemoryContext oldcontext;/* sanity checks */Assert(queryDesc != NULL);estate = queryDesc->estate;Assert(estate != NULL);/** Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This* Assert is needed because ExecutorFinish is new as of 9.1, and callers* might forget to call it.*/Assert(estate->es_finished ||(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));/** Switch into per-query memory context to run ExecEndPlan*/oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);ExecEndPlan(queryDesc->planstate, estate);/* do away with our snapshots */UnregisterSnapshot(estate->es_snapshot);UnregisterSnapshot(estate->es_crosscheck_snapshot);/** Must switch out of context before destroying it*/MemoryContextSwitchTo(oldcontext);/** Release EState and per-query memory context. This should release* everything the executor has allocated.*/FreeExecutorState(estate);/* Reset queryDesc fields that no longer point to anything */queryDesc->tupDesc = NULL;queryDesc->estate = NULL;queryDesc->planstate = NULL;queryDesc->totaltime = NULL;
}
以上面SQL为例,我们调试一下这里:
postgres=# \d+ t1Table "public.t1"Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------id | integer | | | | plain | | |
Access method: heappostgres=#
postgres=# explain (analyze, verbose)select * from t1 order by id limit 2;QUERY PLAN
----------------------------------------------------------------------------------------------------------------------Limit (cost=25.00..25.01 rows=2 width=4) (actual time=0.200..0.201 rows=2 loops=1)Output: id-> Sort (cost=25.00..27.50 rows=1000 width=4) (actual time=0.199..0.200 rows=2 loops=1)Output: idSort Key: t1.idSort Method: top-N heapsort Memory: 25kB-> Seq Scan on public.t1 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.012..0.105 rows=1000 loops=1)Output: idPlanning Time: 0.057 msExecution Time: 0.213 ms
(10 rows)postgres=#
下面是ExecutorStart
阶段:
ExecInitSeqScan(SeqScan * node, EState * estate, int eflags)
ExecInitNode(Plan * node, EState * estate, int eflags)
ExecInitSort(Sort * node, EState * estate, int eflags)
ExecInitNode(Plan * node, EState * estate, int eflags)
ExecInitLimit(Limit * node, EState * estate, int eflags)
ExecInitNode(Plan * node, EState * estate, int eflags)
InitPlan(QueryDesc * queryDesc, int eflags)
standard_ExecutorStart(QueryDesc * queryDesc, int eflags)
ExecutorStart(QueryDesc * queryDesc, int eflags)
PortalStart(Portal portal, ParamListInfo params, int eflags, Snapshot snapshot)
exec_simple_query(const char * query_string)
...
下面是ExecutorRun
阶段:
ExecSeqScan(PlanState * pstate)
ExecProcNodeFirst(PlanState * node)
ExecProcNode(PlanState * node)
ExecSort(PlanState * pstate)
ExecProcNodeFirst(PlanState * node)
ExecProcNode(PlanState * node)
ExecLimit(PlanState * pstate)
ExecProcNodeFirst(PlanState * node)
ExecProcNode(PlanState * node)
ExecutePlan(EState * estate, PlanState * planstate, _Bool use_parallel_mode, CmdType operation, _Bool sendTuples, uint64 numberTuples, ScanDirection direction, DestReceiver * dest, _Bool execute_once)
standard_ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, uint64 count, _Bool execute_once)
ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, uint64 count, _Bool execute_once)
PortalRunSelect(Portal portal, _Bool forward, long count, DestReceiver * dest)
PortalRun(Portal portal, long count, _Bool isTopLevel, _Bool run_once, DestReceiver * dest, DestReceiver * altdest, QueryCompletion * qc)
exec_simple_query(const char * query_string)
...
如上SQL是一个比较简单的查询,ExecutorFinish
阶段并未做什么。
下面是ExecutorEnd
阶段:
ExecEndSeqScan(SeqScanState * node)
ExecEndNode(PlanState * node)
ExecEndSort(SortState * node)
ExecEndNode(PlanState * node)
ExecEndLimit(LimitState * node)
ExecEndNode(PlanState * node)
ExecEndPlan(PlanState * planstate, EState * estate)
standard_ExecutorEnd(QueryDesc * queryDesc)
ExecutorEnd(QueryDesc * queryDesc)
PortalCleanup(Portal portal)
PortalDrop(Portal portal, _Bool isTopCommit)
exec_simple_query(const char * query_string)
...
接下来看一下此处的钩子函数,如下:
/** ExecutorEnd hook: log results if needed*/
static void
explain_ExecutorEnd(QueryDesc *queryDesc)
{if (queryDesc->totaltime && auto_explain_enabled()){MemoryContext oldcxt;double msec;/** Make sure we operate in the per-query context, so any cruft will be* discarded later during ExecutorEnd.* 确保我们在每个查询上下文中进行操作,因此任何多余的内容都会在 ExecutorEnd 期间被丢弃。*/oldcxt = MemoryContextSwitchTo(queryDesc->estate->es_query_cxt);/** Make sure stats accumulation is done. (Note: it's okay if several* levels of hook all do this.)* 确保统计数据累积已完成。(注意:如果几级钩子都这样做也没关系。)*/InstrEndLoop(queryDesc->totaltime);/* Log plan if duration is exceeded. */msec = queryDesc->totaltime->total * 1000.0;if (msec >= auto_explain_log_min_duration){ExplainState *es = NewExplainState();es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);es->verbose = auto_explain_log_verbose;es->buffers = (es->analyze && auto_explain_log_buffers);es->wal = (es->analyze && auto_explain_log_wal);es->timing = (es->analyze && auto_explain_log_timing);es->summary = es->analyze;/* No support for MEMORY *//* es->memory = false; */es->format = auto_explain_log_format;es->settings = auto_explain_log_settings;ExplainBeginOutput(es);ExplainQueryText(es, queryDesc);ExplainQueryParameters(es, queryDesc->params, auto_explain_log_parameter_max_length);ExplainPrintPlan(es, queryDesc);if (es->analyze && auto_explain_log_triggers)ExplainPrintTriggers(es, queryDesc);if (es->costs)ExplainPrintJITSummary(es, queryDesc);ExplainEndOutput(es);/* Remove last line break */if (es->str->len > 0 && es->str->data[es->str->len - 1] == '\n')es->str->data[--es->str->len] = '\0';/* Fix JSON to output an object */if (auto_explain_log_format == EXPLAIN_FORMAT_JSON){es->str->data[0] = '{';es->str->data[es->str->len - 1] = '}';}/** Note: we rely on the existing logging of context or* debug_query_string to identify just which statement is being* reported. This isn't ideal but trying to do it here would* often result in duplication.*/ereport(auto_explain_log_level,(errmsg("duration: %.3f ms plan:\n%s",msec, es->str->data),errhidestmt(true)));}MemoryContextSwitchTo(oldcxt);}if (prev_ExecutorEnd)prev_ExecutorEnd(queryDesc);elsestandard_ExecutorEnd(queryDesc);
}
小结一下上面函数逻辑,如下:
- 结束此次总统计:将每个周期的统计数据累加为总数
- 如果超出持续时间则记录该计划
- 开辟一个NewExplainState对象,并根据指定进行填充关键bool值;然后构造计划文本
- 根据指定级别进行ereport
- 之后才走ExecutorEnd逻辑