MySQL索引优化实战

索引的种类

众所周知，索引类似于字典的目录，可以提高查询的效率。
索引从物理上可以分为：聚集索引，非聚集索引
从逻辑上可以分为：普通索引，唯一索引，主键索引，联合索引，全文索引

索引优化策略

不要在索引列上进行运算或使用函数

在列上进行运算或使用函数会使索引失效，从而进行全表扫描。如下面例子在publish_time，id列上分别加上索引，publish_time为datetime类型，id为int类型

-- 全表扫描
select * from article where year(publish_time) < 2019

-- 走索引
select * from article where publish_time < '2019-01-01'

-- 全表扫描
select * from article where id  + 1 = 5

-- 走索引
select * from article where id = 4

小心隐式类型转换

假设id为varchar类型

-- 全表扫描
select * from article where id = 100

-- 走索引
select * from article where id = '100'

为什么呢？

select * from article where id = 100
-- 等价于
select * from article where CAST(id AS signed int) = 100

上一条规则说过，不要在索引列上使用函数，隐式类型转换在索引字段上做了函数操作，因此会全表扫描

那么如果id是int，执行下面这个语句是否会导致全表扫描呢？

select * from article where id = '100'

答案是会用到索引

前导模糊查询不会使用索引

-- 全表扫描
select * from article where author like '%李'

%李,%李%都会导致全表扫描，非前导模糊查询可以使用索引

-- 走索引
select * from article where author like '李%'

联合索引最左前缀原则

mysql会一直向右匹配直到遇到范围查询(>、<、between、like)就停止匹配，比如a = 1 and b = 2 and c > 3 and d = 4 如果建立(a,b,c,d)顺序的索引，d是用不到索引的，如果建立(a,b,d,c)的索引则都可以用到，a,b,d的顺序可以任意调整

1.将区分度最高的字段放在最左边

当不需要考虑排序和分组时，将区分度最高的列放在前面通常是很好的。这时候索引的作用只是用于优化WHERE条件的查找

如果在a b列上建立联合索引，该如何建立，才能使查询效率最高

select count(distinct a) / count(*), count(distinct b) / count(*), count(*) from table

执行如下语句，假设3个输出依次为0.0001,0.373,16049，可以看到b列的选择性最高，因此将其作为联合索引的第一列，即建立(b, a)的联合索引

2.查询时=可以乱序
如果建立了联合索引（a, b）。例如下面的2个写法是等价的，因为MySQL会将查询的顺序优化成和联合索引的顺序一致

select * from table where a = '1' and b = '1'

select * from table where b = '1' and a = '1'

放几个例子

-- 使用了a列
where a = 3

-- 使用了a b列
where a = 3 and b = 5

-- 使用了a b c列
where a = 3 and c = 4 and b = 5
 
-- 没有使用索引
where b = 3

-- 使用了a列 
where a = 3 and c = 4

-- 使用了a b列 
where a = 3 and b > 10 and c = 7
 
-- 使用了a b 列
where a = 3 and b like 'xx%' and c = 7

union，or，in都能命中索引，建议使用in

select * from article where id = 1
union all
select * from article where id = 2

select * from article where id in (1 , 2)

新版MySQL的or可以命中索引

select * from article where id = 1 or id = 2

效率从高到低为union，in，or。in和union的效率差别可以忽略不计，建议使用in

负向条件索引不会使用索引，建议用in

负向条件有：!=、<>、not in、not exists、not like 等

-- 全表扫描
select * from article where id != 1 and id != 2

知道id的所有取值范围，可以改为类似如下形式

-- 走索引
select * from article where id in (0, 3, 4)

建立覆盖索引

众所周知，表数据是放在一个聚集索引上的，而建立的索引为非聚集索引，非聚集索引的叶子节点存放索引键值，以及该索引键指向的主键。一般查找的过程是从非聚集索引上找到数据的主键，然后根据该主键到聚集索引上查找记录，这个过程称为回表，不清楚的看推荐阅读。

如有下面这个sql

select uid, login_time from user where username = ? and passwd = ?

可以建立(username, passwd, login_time)的联合索引，由于 login_time的值可以直接从索引中拿到，不用再回表查询，提高了查询效率

经常更改，区分度不高的列上不宜加索引

更新会变更 B+ 树，更新频繁的字段建立索引会大大降低数据库性能。

“性别”这种区分度不大的属性，建立索引是没有什么意义的，不能有效过滤数据，性能与全表扫描类似。

一般区分度在80%以上的时候就可以建立索引，区分度可以使用 count(distinct(列名))/count(*) 来计算

明确知道只会返回一条记录，可以加limit1

当查询确定只有一条记录时，可以加liimit1，让MySQL停止游标移动，提高查询效率

select uid from user where username = ? and passwd = ?

可改为

select uid from user where username = ? and passwd = ? limit 1

参考博客

[1]https://blog.csdn.net/qq_21987433/article/details/79753551