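Preface

High Performance MySQL notes that IN() can effectively replace certain range queries and improve query efficiency, because within a single index the columns after a range column cannot be used (note: index condition pushdown, ICP, has to be taken into account). The MySQL optimizer expands IN() lists into n*m combinations, queries each of them, and merges the results. This is somewhat like UNION, but more efficient.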
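The n*m expansion can be pictured with a small Python sketch. It only models the counting; the column names `a` and `b` and the helper `expand_in_lists` are hypothetical, and this is not how MySQL's range optimizer is implemented internally.

```python
from itertools import product


def expand_in_lists(*in_lists):
    """Return every equality combination for IN() lists on consecutive index columns."""
    return list(product(*in_lists))


# Hypothetical predicate: a IN (1, 2, 3) AND b IN ('x', 'y')
combos = expand_in_lists([1, 2, 3], ["x", "y"])

print(len(combos))  # n * m = 3 * 2 combinations
for a, b in combos:
    # Each combination is one equality range the optimizer has to estimate.
    print(f"a = {a} AND b = {b!r}")
```

Every extra IN() list multiplies the number of combinations, which is why long lists make row estimation expensive.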

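MySQL runs into a number of problems when IN() lists produce too many combinations. Query optimization can take a long time and consume a lot of memory. Newer MySQL versions skip plan evaluation once the number of combinations exceeds a certain threshold, which can prevent MySQL from making good use of indexes.

In MySQL 5.6.5 and later, this threshold is controlled by the eq_range_index_dive_limit parameter. The default is 10 in 5.6 and was changed to 200 in 5.7; it can also be set manually. The 5.6 manual describes it as follows: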

The eq_range_index_dive_limit system variable enables you to configure the number of values at which the optimizer switches from one row estimation strategy to the other. To disable use of statistics and always use index dives, set eq_range_index_dive_limit to 0. To permit use of index dives for comparisons of up to N equality ranges, set eq_range_index_dive_limit to N + 1. eq_range_index_dive_limit is available as of MySQL 5.6.5. Before 5.6.5, the optimizer uses index dives, which is equivalent to eq_range_index_dive_limit=0.

In other words, where N is the number of equality ranges in the condition:

eq_range_index_dive_limit = 0: index dives are always used

0 < eq_range_index_dive_limit <= N: index statistics are used

eq_range_index_dive_limit > N: index dives are used
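The three cases above can be written as one small function. This is a minimal sketch of the rule quoted from the manual only; the function name `row_estimation_strategy` is hypothetical, and the real optimizer has further conditions that are not modeled here.

```python
def row_estimation_strategy(limit: int, n_ranges: int) -> str:
    """Pick the estimation method for n_ranges equality ranges.

    limit stands for eq_range_index_dive_limit:
      limit == 0         -> always index dives
      0 < limit <= N     -> index statistics
      limit > N          -> index dives
    """
    if limit == 0 or n_ranges < limit:
        return "index dive"
    return "index statistics"


print(row_estimation_strategy(0, 500))   # statistics disabled, dives always used
print(row_estimation_strategy(10, 9))    # 9 ranges stay under the 5.6 default of 10
print(row_estimation_strategy(10, 10))   # 10 ranges reach the limit
```

Note that the switch happens as soon as the number of ranges reaches the limit, not only when it exceeds it, which is why the manual suggests setting the variable to N + 1 to allow dives for up to N ranges.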

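MySQL 5.7 raised the default from 10 to 200 so that execution plans for equality-range comparisons (IN()) stay as accurate as possible, since IN() lists very often contain more than 10 values.

The official MySQL manual contains this sentence: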

the optimizer can estimate the row count for each range using dives into the index or index statistics.

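In short:

The optimizer estimates the number of rows contained in each range. For example, "a IN (10, 20, 30)" is treated as equality comparisons covering 3 ranges, which reduce to the 3 single values 10, 20 and 30. They are described as ranges because MySQL's "range" access method mostly performs range scans, and a single value can be seen as a special case of a range.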

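There are two estimation methods:

  1. index dive: dive into the index, that is, use the index itself to estimate the row count;
  2. index statistics: estimate the row count from the index's statistics.

Comparing the two:

  1. index dive: slow, but yields accurate values (MySQL's implementation counts the index entries that fall in the range, hence the accuracy)
  2. index statistics: fast, but the values are not necessarily accurate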
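The difference between the two estimates shows up most clearly on skewed data. The sketch below is a toy model under stated assumptions: a sorted Python list stands in for the index, `bisect` stands in for the dive at each end of a range, and the statistics estimate is simply rows divided by distinct keys. The function names and the data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def estimate_by_dive(sorted_keys, value):
    """Locate both ends of the equality range and count the entries in between."""
    return bisect_right(sorted_keys, value) - bisect_left(sorted_keys, value)


def estimate_by_statistics(sorted_keys, n_values):
    """Assume every distinct key matches the same number of rows."""
    rows_per_key = len(sorted_keys) / len(set(sorted_keys))
    return rows_per_key * n_values


# Hypothetical skewed index: key 7 is very common, keys 1, 2, 3 are rare.
index = sorted([7] * 97 + [1, 2, 3])
in_list = [7]

dive = sum(estimate_by_dive(index, v) for v in in_list)  # one lookup pair per value
stats = estimate_by_statistics(index, len(in_list))      # one multiplication in total

print(dive, stats)
```

The dive cost grows with the length of the IN() list, while the statistics estimate stays constant-time but misses the skew entirely.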

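Put simply, **eq_range_index_dive_limit sets an upper limit on the number of conditions in an IN list; once the list reaches or exceeds that limit, row estimation for the execution plan switches from index dive to index statistics**.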

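Why distinguish between these two methods?

  1. The query optimizer uses a cost model to compute the cost of each plan and picks the cheapest one
  2. A single-table scan needs a cost, so a single-table index scan needs one as well
  3. The usual single-table formula is: cost = number of rows * average I/O cost
  4. So whichever scan method is used, the number of rows has to be computed
  5. When the optimizer meets an expression such as "a IN (10, 20, 30)" and finds an index on column a, it has to work out how many rows that index would scan in order to compute the index scan cost, which is where the two methods discussed here, "index dive" and "index statistics", come in.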
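The role of the row estimate in the formula above can be sketched as follows. The plan names, row counts and the unit I/O cost are made-up illustration values, and MySQL's actual cost model has more terms than this single product.

```python
def scan_cost(estimated_rows: float, avg_io_cost: float = 1.0) -> float:
    """cost = number of rows * average I/O cost, as in the simplified formula."""
    return estimated_rows * avg_io_cost


def cheapest_plan(row_estimates: dict, avg_io_cost: float = 1.0) -> str:
    """Return the plan name with the lowest estimated cost."""
    return min(row_estimates, key=lambda name: scan_cost(row_estimates[name], avg_io_cost))


# Hypothetical estimates for the same query under two estimation methods.
with_accurate_rows = {"full_scan": 1000, "index_on_a": 1500}
with_rough_rows = {"full_scan": 1000, "index_on_a": 30}

print(cheapest_plan(with_accurate_rows))
print(cheapest_plan(with_rough_rows))
```

Because the row count is the only input that differs, an inaccurate estimate is enough to flip the chosen plan.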

Topics

  1. Range queries and index usage
  2. About eq_range_index_dive_limit

Range queries and index usage

The SQL is as follows:

SELECT * FROM pre_forum_post WHERE tid=7932552 AND invisible IN('0','-2') ORDER BY dateline DESC LIMIT 10;
The table has the following indexes:

    PRIMARY(tid,position)
    pid(pid)
    fid(tid)
    displayorder(tid,invisible,dateline)
    first(tid,first)
    new_auth(authorid,invisible,tid)
    idx_dt(dateline)
    mul_test(tid,invisible,dateline,pid)
    root@localhost 16:08:27 [ultrax]> explain SELECT * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2')
        -> ORDER BY dateline DESC LIMIT 10;
    +----+-------------+----------------+-------+-------------------------------------------+--------------+---------+------+------+---------------------------------------+
    | id | select_type | table          | type  | possible_keys                             | key          | key_len | ref  | rows | Extra                                 |
    +----+-------------+----------------+-------+-------------------------------------------+--------------+---------+------+------+---------------------------------------+
    |  1 | SIMPLE      | pre_forum_post | range | PRIMARY,displayorder,first,mul_test,idx_1 | displayorder | 4       | NULL |   54 | Using index condition; Using filesort |
    +----+-------------+----------------+-------+-------------------------------------------+--------------+---------+------+------+---------------------------------------+
    1 row in set (0.00 sec)

    root@localhost 16:09:06 [ultrax]> alter table pre_forum_post add index idx_1 (tid,dateline);
    Query OK, 20374596 rows affected, 0 warning (600.23 sec)
    Records: 0 Duplicates: 0 Warnings: 0

    root@localhost 16:20:22 [ultrax]> explain SELECT * FROM pre_forum_post force index (idx_1) WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
    +----+-------------+----------------+------+---------------+-------+---------+-------+--------+-------------+
    | id | select_type | table          | type | possible_keys | key   | key_len | ref   | rows   | Extra       |
    +----+-------------+----------------+------+---------------+-------+---------+-------+--------+-------------+
    |  1 | SIMPLE      | pre_forum_post | ref  | idx_1         | idx_1 | 3       | const | 120646 | Using where |
    +----+-------------+----------------+------+---------------+-------+---------+-------+--------+-------------+
    1 row in set (0.00 sec)

    root@localhost 16:22:06 [ultrax]> SELECT sql_no_cache * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
    ...
    10 rows in set (0.40 sec)

    root@localhost 16:23:55 [ultrax]> SELECT sql_no_cache * FROM pre_forum_post force index (idx_1) WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
    ...
    10 rows in set (0.00 sec)

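To sum up: when using IN() in MySQL queries, besides watching the size of the IN() list and the value of eq_range_index_dive_limit (details below), also pay attention to index usage whenever the SQL includes sorting, grouping, deduplication and the like.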
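Why an index on (tid, dateline) can answer this ORDER BY ... LIMIT 10 query so much faster than one on (tid, invisible, dateline) can be sketched in Python. This is a toy model, not MySQL's executor: sorted lists stand in for the two indexes, the rows are synthetic, and both function names are hypothetical.

```python
from bisect import bisect_left


def top_n_via_tid_dateline(rows, tid, invisibles, limit):
    """Walk a (tid, dateline) index backwards, filter on invisible, stop after limit hits."""
    index = sorted((t, d, inv) for t, inv, d in rows)
    lo, hi = bisect_left(index, (tid,)), bisect_left(index, (tid + 1,))
    picked, examined = [], 0
    for _, dateline, inv in reversed(index[lo:hi]):
        examined += 1
        if inv in invisibles:
            picked.append(dateline)
            if len(picked) == limit:
                break  # rows already arrive in dateline DESC order
    return picked, examined


def top_n_via_tid_invisible_dateline(rows, tid, invisibles, limit):
    """Read one range per invisible value, then sort everything (the filesort step)."""
    index = sorted(rows)  # rows are already (tid, invisible, dateline)
    matched = []
    for inv in invisibles:
        lo, hi = bisect_left(index, (tid, inv)), bisect_left(index, (tid, inv + 1))
        matched.extend(d for _, _, d in index[lo:hi])
    matched.sort(reverse=True)
    return matched[:limit], len(matched)


# Synthetic rows: (tid, invisible, dateline); invisible cycles through -2, -1, 0.
rows = [(1, i % 3 - 2, 1000 + i) for i in range(300)]
rows += [(2, 0, 5000 + i) for i in range(50)]

fast, fast_examined = top_n_via_tid_dateline(rows, 1, {0, -2}, 10)
slow, slow_examined = top_n_via_tid_invisible_dateline(rows, 1, [0, -2], 10)
print(fast == slow, fast_examined, slow_examined)
```

Both paths return the same ten rows, but the ordered walk stops early while the other path has to collect and sort every matching row first, which mirrors the 0.00 sec versus 0.40 sec timings above.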

About eq_range_index_dive_limit

Staying with the case above: why is idx_1 not picked on its own, so that a hint is needed to force this index? Let's first look at the value of eq_range_index_dive_limit.

    root@localhost 22:38:05 [ultrax]> show variables like 'eq_range_index_dive_limit';
    +---------------------------+-------+
    | Variable_name             | Value |
    +---------------------------+-------+
    | eq_range_index_dive_limit | 2     |
    +---------------------------+-------+
    1 row in set (0.00 sec)

The optimizer trace at this setting:

    {
      "index": "displayorder",
      "ranges": [
        "7932552 <= tid <= 7932552 AND -2 <= invisible <= -2",
        "7932552 <= tid <= 7932552 AND 0 <= invisible <= 0"
      ],
      "index_dives_for_eq_ranges": false,
      "rowid_ordered": false,
      "using_mrr": false,
      "index_only": false,
      "rows": 54,
      "cost": 66.81,
      "chosen": true
    }
    // index dives are not used (false), and this index ends up chosen (true)
    ...
    {
      "index": "idx_1",
      "ranges": [
        "7932552 <= tid <= 7932552"
      ],
      "index_dives_for_eq_ranges": true,
      "rowid_ordered": false,
      "using_mrr": false,
      "index_only": false,
      "rows": 120646,
      "cost": 144776,
      "chosen": false,
      "cause": "cost"
    }

After raising the limit to 3:

    root@localhost 22:52:52 [ultrax]> set eq_range_index_dive_limit = 3;
    Query OK, 0 rows affected (0.00 sec)

    root@localhost 22:55:38 [ultrax]> explain SELECT * FROM pre_forum_post WHERE tid=7932552 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 10;
    +----+-------------+----------------+------+-------------------------------------------+-------+---------+-------+--------+-------------+
    | id | select_type | table          | type | possible_keys                             | key   | key_len | ref   | rows   | Extra       |
    +----+-------------+----------------+------+-------------------------------------------+-------+---------+-------+--------+-------------+
    |  1 | SIMPLE      | pre_forum_post | ref  | PRIMARY,displayorder,first,mul_test,idx_1 | idx_1 | 3       | const | 120646 | Using where |
    +----+-------------+----------------+------+-------------------------------------------+-------+---------+-------+--------+-------------+
    1 row in set (0.00 sec)

The optimizer trace at this setting:

    {
      "index": "displayorder",
      "ranges": [
        "7932552 <= tid <= 7932552 AND -2 <= invisible <= -2",
        "7932552 <= tid <= 7932552 AND 0 <= invisible <= 0"
      ],
      "index_dives_for_eq_ranges": true,
      "rowid_ordered": false,
      "using_mrr": false,
      "index_only": false,
      "rows": 188193,
      "cost": 225834,
      "chosen": true
    }
    ...
    {
      "index": "idx_1",
      "ranges": [
        "7932552 <= tid <= 7932552"
      ],
      "index_dives_for_eq_ranges": true,
      "rowid_ordered": false,
      "using_mrr": false,
      "index_only": false,
      "rows": 120646,
      "cost": 144776,
      "chosen": true
    }
    ...
    "cost_for_plan": 144775,
    "rows_for_plan": 120646,
    "chosen": true
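The effect of the two traces can be replayed in a few lines of Python. The JSON below keeps only the `index`, `index_dives_for_eq_ranges`, `rows` and `cost` fields copied from the traces above; wrapping them in a list and the helper `cheapest_index` are assumptions made for this sketch, and the real optimizer weighs more alternatives than these two range scans.

```python
import json

# Trimmed copies of the trace entries shown above (eq_range_index_dive_limit = 2).
TRACE_LIMIT_2 = """[
  {"index": "displayorder", "index_dives_for_eq_ranges": false, "rows": 54, "cost": 66.81},
  {"index": "idx_1", "index_dives_for_eq_ranges": true, "rows": 120646, "cost": 144776}
]"""

# Same query after SET eq_range_index_dive_limit = 3.
TRACE_LIMIT_3 = """[
  {"index": "displayorder", "index_dives_for_eq_ranges": true, "rows": 188193, "cost": 225834},
  {"index": "idx_1", "index_dives_for_eq_ranges": true, "rows": 120646, "cost": 144776}
]"""


def cheapest_index(trace_json: str) -> str:
    """Return the index name with the lowest cost among the listed alternatives."""
    candidates = json.loads(trace_json)
    return min(candidates, key=lambda c: c["cost"])["index"]


print(cheapest_index(TRACE_LIMIT_2))
print(cheapest_index(TRACE_LIMIT_3))
```

With index statistics, displayorder is estimated at 54 rows and wins on cost; with index dives the estimate becomes 188193 rows, and idx_1 is the cheaper choice.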

index dive

    +----------------------+----------+
    | Status               | Duration |
    +----------------------+----------+
    | starting             | 0.000048 |
    | checking permissions | 0.000004 |
    | Opening tables       | 0.000015 |
    | init                 | 0.000044 |
    | System lock          | 0.000009 |
    | optimizing           | 0.000014 |
    | statistics           | 0.032089 |
    | preparing            | 0.000022 |
    | Sorting result       | 0.000003 |
    | executing            | 0.000003 |
    | Sending data         | 0.000101 |
    | end                  | 0.000004 |
    | query end            | 0.000002 |
    | closing tables       | 0.000009 |
    | freeing items        | 0.000013 |
    | cleaning up          | 0.000012 |
    +----------------------+----------+

The second profile has a negligible statistics stage but spends about 0.41 sec in "Creating sort index", which matches the index statistics plan (displayorder with Using filesort):

    +----------------------+----------+
    | Status               | Duration |
    +----------------------+----------+
    | starting             | 0.000045 |
    | checking permissions | 0.000003 |
    | Opening tables       | 0.000014 |
    | init                 | 0.000040 |
    | System lock          | 0.000008 |
    | optimizing           | 0.000014 |
    | statistics           | 0.000086 |
    | preparing            | 0.000016 |
    | Sorting result       | 0.000002 |
    | executing            | 0.000002 |
    | Sending data         | 0.000016 |
    | Creating sort index  | 0.412123 |
    | end                  | 0.000012 |
    | query end            | 0.000004 |
    | closing tables       | 0.000013 |
    | freeing items        | 0.000023 |
    | cleaning up          | 0.000015 |
    +----------------------+----------+

Appendix:

How to use optimizer_trace

    set optimizer_trace='enabled=on';
    -- run the SQL statement to be traced
    select * from information_schema.optimizer_trace\G

How to use profile

    set profiling=ON;
    -- run the SQL statement
    show profiles;
    show profile for query 2;
    show profile block io,cpu for query 2;

