Reading SQL Query Plans from ClickHouse Logs

As of this writing, no stable release of ClickHouse offers a native EXPLAIN statement for inspecting query plans. The corresponding Pull Request on GitHub has been implemented but not yet released (update: it has just shipped in version 20.6.3.28, what a coincidence), so until recently you had to build from source to get it. We can, however, read each SQL statement's query plan indirectly from ClickHouse's logs with the help of clickhouse-client, using a command of the following form:

clickhouse-client -h <host> --port <port> --password <pass> --send_logs_level=trace <<< "
-- SQL statement here
" > /dev/null

Here, the send_logs_level setting raises the log level to trace, the here-string operator <<< feeds the SQL statement into clickhouse-client, and > /dev/null discards the query result so that only the logs remain visible.
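
If you are already on 20.6.3.28 or later, the native syntax is available instead; a minimal example (the exact plan nodes in the output will vary by version):

clickhouse-client -h <host> --port <port> --password <pass> <<< "
EXPLAIN
SELECT event_type, category_id, merchandise_id
FROM ods.analytics_access_log
WHERE ts_date = '2020-08-09' AND site_id = 10037
"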

Enough preamble; let's look at a simple example.

~ clickhouse-client -h localhost --port 9070 --password ck2020 --send_logs_level=trace <<< "
SELECT event_type,category_id,merchandise_id FROM ods.analytics_access_log
WHERE ts_date = '2020-08-09' AND site_id = 10037
" > /dev/null

[bigdata-ck-test001] 2020.08.09 21:45:06.872889 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> executeQuery: (from 127.0.0.1:55270) SELECT event_type, category_id, merchandise_id FROM ods.analytics_access_log WHERE (ts_date = '2020-08-09') AND (site_id = 10037)
[bigdata-ck-test001] 2020.08.09 21:45:06.873151 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "(site_id = 10037) AND (ts_date = '2020-08-09')" moved to PREWHERE
[bigdata-ck-test001] 2020.08.09 21:45:06.873577 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> ods.analytics_access_log (SelectExecutor): Key condition: (column 1 in [10037, 10037]), unknown, and
[bigdata-ck-test001] 2020.08.09 21:45:06.873596 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> ods.analytics_access_log (SelectExecutor): MinMax index condition: unknown, (column 0 in [18483, 18483]), and
[bigdata-ck-test001] 2020.08.09 21:45:06.873705 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> ods.analytics_access_log (SelectExecutor): Selected 8 parts by date, 8 parts by key, 49 marks to read from 39 ranges
[bigdata-ck-test001] 2020.08.09 21:45:06.873923 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Trace> ods.analytics_access_log (SelectExecutor): Reading approx. 345563 rows with 8 streams
[bigdata-ck-test001] 2020.08.09 21:45:06.874091 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
[bigdata-ck-test001] 2020.08.09 21:45:06.874172 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> executeQuery: Query pipeline:
Union
 Expression × 8
  Expression
   MergeTreeThread

[bigdata-ck-test001] 2020.08.09 21:45:06.879051 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
[bigdata-ck-test001] 2020.08.09 21:45:06.879070 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Trace> UnionBlockInputStream: Waited for threads to finish
[bigdata-ck-test001] 2020.08.09 21:45:06.879174 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Information> executeQuery: Read 332795 rows, 8.67 MiB in 0.006 sec., 53309149 rows/sec., 1.36 GiB/sec.
[bigdata-ck-test001] 2020.08.09 21:45:06.879199 {40825efb-8afe-4e67-b443-585e77da49d6} [ 141 ] <Debug> MemoryTracker: Peak memory usage (for query): 7.41 MiB.

Here, ts_date is the partitioning (PARTITION BY) column and site_id is a sorting-key (ORDER BY) column. For orientation, a hypothetical DDL is sketched below; after that, let's walk through the log line by line.
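
This sketch is for illustration only: the column types are assumptions, the real table certainly has more columns, and the "column 1" in the key condition above suggests site_id is actually the second column of the real sorting key.

-- Hypothetical DDL, not the actual schema
CREATE TABLE ods.analytics_access_log
(
    ts_date        Date,
    site_id        Int64,
    event_type     Int64,
    category_id    Int64,
    merchandise_id Int64
)
ENGINE = MergeTree()
PARTITION BY ts_date
ORDER BY site_id;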

  • condition "(site_id = 10037) AND (ts_date = '2020-08-09')" moved to PREWHERE
    This line shows that both WHERE predicates were optimized into PREWHERE. PREWHERE is an optimization specific to the MergeTree engine family: data is first filtered on the specified columns, and only afterwards are the queried columns read, which can improve efficiency considerably. You normally don't need to write PREWHERE by hand, because the optimize_move_to_prewhere setting is enabled by default and ClickHouse applies the optimization automatically (the manual form is shown in the snippet after this list).

  • Key condition: (column 1 in [10037, 10037]), unknown
    This line shows that the query used the sorting-key column site_id, restricted to the range [10037, 10037].

  • MinMax index condition: unknown, (column 0 in [18483, 18483])
    This line shows that the query used the partition column ts_date (note that the partition column automatically carries a minmax index), restricted to the range [18483, 18483]. Why this number? Because ClickHouse stores the Date type as the number of days since January 1, 1970; the snippet after this list verifies it.

  • Selected 8 parts by date, 8 parts by key, 49 marks to read from 39 ranges
    This line shows how many data parts, how many sparse-index marks, and how many index ranges the query must scan.

  • Reading approx. 345563 rows with 8 streams
    This line shows approximately how many rows must be scanned, and how many threads are used.

  • Query pipeline:
    The lines below are the query plan's pipeline; × 8 means eight threads working in parallel. The two Expression nodes may look uninformative, but the lower Expression corresponds to the WHERE clause and the upper one to the SELECT clause.

Union
 Expression × 8
  Expression
   MergeTreeThread
  • Read 332795 rows, 8.67 MiB in 0.006 sec., 53309149 rows/sec., 1.36 GiB/sec.
  • Peak memory usage (for query): 7.41 MiB.
    These two lines report the number of rows actually read, the data volume and throughput, and the query's peak memory usage.
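
Two quick checks of the observations above: the manual PREWHERE form, and the day-number representation of Date. Both statements use only columns that appear in the example.

-- Manual PREWHERE, equivalent to what optimize_move_to_prewhere produced:
SELECT event_type, category_id, merchandise_id
FROM ods.analytics_access_log
PREWHERE ts_date = '2020-08-09' AND site_id = 10037;

-- Date is stored as days since 1970-01-01; this returns 18483:
SELECT toUInt16(toDate('2020-08-09'));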

As you can see, the query plans in ClickHouse's logs can give us real hints for optimizing SQL; the index-usage and scan-volume statistics in particular are very direct.

Now for a second example, this time joining two tables.

~ clickhouse-client -h localhost --port 9070 --password ck2020 --send_logs_level=trace <<< "
SELECT a.column_type,a.category_id,a.merchandise_id,b.price
FROM ods.analytics_access_log a
LEFT JOIN ods.ms_order_done b
ON a.ts_date = b.ts_date AND a.merchandise_id = b.merchandise_id
WHERE a.ts_date = '2020-08-09' AND a.site_id = 10037 AND b.rebate_amount > 0
" > /dev/null

[bigdata-ck-test001] 2020.08.09 22:07:25.906404 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> executeQuery: (from 127.0.0.1:55676) SELECT a.column_type, a.category_id, a.merchandise_id, b.price FROM ods.analytics_access_log AS a LEFT JOIN ods.ms_order_done AS b ON (a.ts_date = b.ts_date) AND (a.merchandise_id = b.merchandise_id) WHERE (a.ts_date = '2020-08-09') AND (a.site_id = 10037) AND (b.rebate_amount > 0)
[bigdata-ck-test001] 2020.08.09 22:07:25.906737 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "(site_id = 10037) AND (ts_date = '2020-08-09')" moved to PREWHERE
[bigdata-ck-test001] 2020.08.09 22:07:25.907057 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> Join: setSampleBlock: rebate_amount Int64 Int64(size = 0), price Int64 Int64(size = 0), b.merchandise_id Int64 Int64(size = 0), b.ts_date Date UInt16(size = 0)
[bigdata-ck-test001] 2020.08.09 22:07:25.907641 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> Join: setSampleBlock: rebate_amount Int64 Int64(size = 0), price Int64 Int64(size = 0), b.merchandise_id Int64 Int64(size = 0), b.ts_date Date UInt16(size = 0)
[bigdata-ck-test001] 2020.08.09 22:07:25.908164 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> ods.analytics_access_log (SelectExecutor): Key condition: (column 1 in [10037, 10037]), unknown, and, unknown, and, (column 1 in [10037, 10037]), unknown, and, and
[bigdata-ck-test001] 2020.08.09 22:07:25.908182 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> ods.analytics_access_log (SelectExecutor): MinMax index condition: unknown, (column 0 in [18483, 18483]), and, unknown, and, unknown, (column 0 in [18483, 18483]), and, and
[bigdata-ck-test001] 2020.08.09 22:07:25.908328 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> ods.analytics_access_log (SelectExecutor): Selected 6 parts by date, 6 parts by key, 51 marks to read from 40 ranges
[bigdata-ck-test001] 2020.08.09 22:07:25.908551 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Trace> ods.analytics_access_log (SelectExecutor): Reading approx. 359443 rows with 6 streams
[bigdata-ck-test001] 2020.08.09 22:07:25.908711 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
[bigdata-ck-test001] 2020.08.09 22:07:25.908920 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> executeQuery: Query pipeline:
Expression
 CreatingSets
  Lazy
  Union
   Expression × 6
    Filter
     Expression
      MergeTreeThread

[bigdata-ck-test001] 2020.08.09 22:07:25.909040 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Trace> CreatingSetsBlockInputStream: Creating join.
[bigdata-ck-test001] 2020.08.09 22:07:25.909162 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Debug> ods.ms_order_done (SelectExecutor): Key condition: unknown
[bigdata-ck-test001] 2020.08.09 22:07:25.909172 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Debug> ods.ms_order_done (SelectExecutor): MinMax index condition: unknown
[bigdata-ck-test001] 2020.08.09 22:07:25.909314 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Debug> ods.ms_order_done (SelectExecutor): Selected 86 parts by date, 86 parts by key, 843 marks to read from 86 ranges
[bigdata-ck-test001] 2020.08.09 22:07:25.910543 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Trace> ods.ms_order_done (SelectExecutor): Reading approx. 6395624 rows with 20 streams
[bigdata-ck-test001] 2020.08.09 22:07:25.911586 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
[bigdata-ck-test001] 2020.08.09 22:07:26.403970 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 94 ] <Debug> CreatingSetsBlockInputStream: Created. Join with 519299 entries from 6118166 rows. In 0.495 sec.
[bigdata-ck-test001] 2020.08.09 22:07:26.441105 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
[bigdata-ck-test001] 2020.08.09 22:07:26.441143 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Trace> UnionBlockInputStream: Waited for threads to finish
[bigdata-ck-test001] 2020.08.09 22:07:26.441161 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
[bigdata-ck-test001] 2020.08.09 22:07:26.441176 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Trace> UnionBlockInputStream: Waited for threads to finish
[bigdata-ck-test001] 2020.08.09 22:07:26.441308 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Information> executeQuery: Read 6474417 rows, 160.76 MiB in 0.535 sec., 12104900 rows/sec., 300.56 MiB/sec.
[bigdata-ck-test001] 2020.08.09 22:07:26.441329 {c9dd6fb6-6173-455c-a7c2-5839169a7003} [ 164 ] <Debug> MemoryTracker: Peak memory usage (for query): 410.84 MiB.

We won't re-analyze what has already been covered. Instead, focus on the lines ods.ms_order_done (SelectExecutor): Key condition: unknown and ods.ms_order_done (SelectExecutor): MinMax index condition: unknown, together with the lines that follow them: with the SQL written this way, the JOIN predicates are not pushed down, so the right table's date-partition filter and its WHERE condition are applied only after a full table scan, which is very wasteful.

Next, let's manually push the predicates down by writing them inside subqueries, and look at the query plan again.

~ clickhouse-client -h localhost --port 9070 --password ck2020 --send_logs_level=trace <<< "
SELECT a.column_type,a.category_id,a.merchandise_id,b.price
FROM (
  SELECT column_type,category_id,merchandise_id
  FROM ods.analytics_access_log
  WHERE ts_date = '2020-08-09' AND site_id = 10037
) a
LEFT JOIN (
  SELECT merchandise_id,price
  FROM ods.ms_order_done
  WHERE ts_date = '2020-08-09' AND rebate_amount > 0
) b
ON a.merchandise_id = b.merchandise_id
" > /dev/null

[bigdata-ck-test001] 2020.08.09 22:15:49.269429 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Debug> executeQuery: (from 127.0.0.1:55686) SELECT a.column_type, a.category_id, a.merchandise_id, b.price FROM (SELECT column_type, category_id, merchandise_id FROM ods.analytics_access_log WHERE (ts_date = '2020-08-09') AND (site_id = 10037)) AS a LEFT JOIN (SELECT merchandise_id, price FROM ods.ms_order_done WHERE (ts_date = '2020-08-09') AND (rebate_amount > 0)) AS b ON a.merchandise_id = b.merchandise_id
## ...omitted...
[bigdata-ck-test001] 2020.08.09 22:15:49.272530 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Debug> executeQuery: Query pipeline:
Expression
 CreatingSets
  Lazy
  Union
   Expression × 8
    Expression
     Expression
      Expression
       MergeTreeThread

[bigdata-ck-test001] 2020.08.09 22:15:49.272668 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Trace> CreatingSetsBlockInputStream: Creating join.
[bigdata-ck-test001] 2020.08.09 22:15:49.272942 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Debug> ods.ms_order_done (SelectExecutor): Key condition: unknown, unknown, and, unknown, and
[bigdata-ck-test001] 2020.08.09 22:15:49.272965 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Debug> ods.ms_order_done (SelectExecutor): MinMax index condition: (column 0 in [18483, 18483]), unknown, and, (column 0 in [18483, 18483]), and
[bigdata-ck-test001] 2020.08.09 22:15:49.272990 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Debug> ods.ms_order_done (SelectExecutor): Selected 5 parts by date, 5 parts by key, 23 marks to read from 5 ranges
[bigdata-ck-test001] 2020.08.09 22:15:49.273175 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Trace> ods.ms_order_done (SelectExecutor): Reading approx. 163922 rows with 5 streams
[bigdata-ck-test001] 2020.08.09 22:15:49.273335 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Trace> InterpreterSelectQuery: FetchColumns -> Complete
[bigdata-ck-test001] 2020.08.09 22:15:49.280033 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 51 ] <Debug> CreatingSetsBlockInputStream: Created. Join with 10788 entries from 143210 rows. In 0.007 sec.
[bigdata-ck-test001] 2020.08.09 22:15:49.305967 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
[bigdata-ck-test001] 2020.08.09 22:15:49.305990 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Trace> UnionBlockInputStream: Waited for threads to finish
[bigdata-ck-test001] 2020.08.09 22:15:49.306014 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Trace> UnionBlockInputStream: Waiting for threads to finish
[bigdata-ck-test001] 2020.08.09 22:15:49.306023 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Trace> UnionBlockInputStream: Waited for threads to finish
[bigdata-ck-test001] 2020.08.09 22:15:49.306143 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Information> executeQuery: Read 510445 rows, 12.52 MiB in 0.037 sec., 13918334 rows/sec., 341.43 MiB/sec.
[bigdata-ck-test001] 2020.08.09 22:15:49.306167 {3ebb7bc0-0a5b-49b7-87a2-7972a566a22e} [ 127 ] <Debug> MemoryTracker: Peak memory usage (for query): 58.70 MiB.

As you can see, simply by rewriting the SQL, the right table's data is filtered by partition and by predicate before the join takes place, and both the volume of data processed and the memory footprint drop dramatically. This also shows that ClickHouse's SQL optimizer is rather weak, so we have to be careful ourselves.

Finally, since the query must actually be executed for these logs to appear, add a LIMIT clause when the expected result set is large, to cap the amount of data returned.
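
For example, the first query above could be capped like this (LIMIT 100 is an arbitrary cap for illustration):

clickhouse-client -h <host> --port <port> --password <pass> --send_logs_level=trace <<< "
SELECT event_type,category_id,merchandise_id FROM ods.analytics_access_log
WHERE ts_date = '2020-08-09' AND site_id = 10037
LIMIT 100
" > /dev/null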

Early start tomorrow morning; good night, everyone.
