近期收到服务器报警,发现有条sql在执行的时候,耗时很长竟然能有30s多。
数据库版本5.6.28 。sql 如下:
-- focus news 库
SELECT*FROM news_basic WHERE id !=18762279 AND
(publish_time< '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id< 18762279))
AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380
AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time DESC,id DESC LIMIT 1)
UNION ALL (
SELECT*FROM news_basic WHERE id !=18762279 AND
(publish_time> '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id> 18762279))
AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380
AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time ASC,id ASC LIMIT 1);
这条sql 功能是找到id=18762279的文章的相邻的上下篇文章各一篇。其中2017-06-02 09:31:25是id=18762279文章的发布时间。
这条sql看起来很复杂,怕出现慢查询所以explain看了下执行计划。
如图可见,索引走的是城市+类别(index_city_id_category)的联合索引。
那根据索引坐下group by,看下每个小类别有多少条数据。
SELECT city_id ,count(*) from news_basic WHERE category IN (43,52,1201,1202,1203,1204) GROUP BY city_id
可以看出,热门城市文章数较多,检索可能会较慢,但万条数据还算ok。city_id=380的城市应该是个冷门城市,小分类下只有一百条。那为什么sql为什么执行了那么长时间呢??
30s上下的执行时间应该是扫描了百万级数据。
我开始怀疑mysql 的explain在“骗人”。
mysql根本没走指定的索引!
那我对union的两个子句坐下explain怎么样。
EXPLAIN
SELECT*FROM news_basic WHERE id !=18762279 AND (publish_time< '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id< 18762279)) AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380 AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time DESC,id DESC LIMIT 1;
EXPLAIN
SELECT*FROM news_basic WHERE id !=18762279 AND (publish_time> '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id> 18762279)) AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380 AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time ASC,id ASC LIMIT 1;
好嘛,索引都走publish_time,扫描行数到底291万行……
优化方法
优化方法比较常规,可以force index,也可以ignore index(publish_time)。
(
SELECT*FROM news_basic force index(index_city_id_category) WHERE id !=18762279 AND (publish_time< '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id< 18762279)) AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380 AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time DESC,id DESC LIMIT 1) UNION ALL (
SELECT*FROM news_basic force index(index_city_id_category) WHERE id !=18762279 AND (publish_time> '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id> 18762279)) AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380 AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time ASC,id ASC LIMIT 1);
(
SELECT*FROM news_basic IGNORE INDEX (publish_time) WHERE id !=18762279 AND (publish_time< '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id< 18762279)) AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380 AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time DESC,id DESC LIMIT 1) UNION ALL (
SELECT*FROM news_basic IGNORE INDEX (publish_time) WHERE id !=18762279 AND (publish_time> '2017-06-02 09:31:25' OR (publish_time='2017-06-02 09:31:25' AND id> 18762279)) AND STATUS IN (1,7) AND source IN (11,12,13) AND city_id=380 AND category IN (43,52,1201,1202,1203,1204) ORDER BY publish_time ASC,id ASC LIMIT 1);
两种优化的结果都很明显,查询时间达到0.089s和 0.015s。
explain 计划显示都使用了索引index_city_id_category。
最后
目前的困惑在于如果explain给出的结果是有错误的,那之后优化sql语句时可以相信的工具是什么呢……
查了许久,没有google到mysql做union操作时explain结果不正确的相关案例。
但是在git上发现了类似例子:有人分享了索引失效选择time索引的案例,也是扫描了百万行数据。很可惜的是该例子没有做union操作,也无法对比union的explain结果。