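Performance measurement methods
MySQL
explain: shows the execution plan MySQL chooses for a SQL statement, which is the starting point for analyzing that statement's performance;
mysqldumpslow: summarizes the slow query log for slow-query analysis;
For details, see: https://blog.csdn.net/zhiyuan411/article/details/6892164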
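As a minimal sketch of how these two tools are typically driven from the SQL side: EXPLAIN is prefixed to the statement under study, and the slow query log that mysqldumpslow reads has to be enabled first. The `orders` table, its columns, and the 1-second threshold are hypothetical values chosen only for illustration.

```sql
-- Hypothetical table and filter, used only to illustrate the syntax.
EXPLAIN
SELECT id, status
FROM orders
WHERE customer_id = 42;

-- Turn on the slow query log so that there is a log file to analyze.
SET GLOBAL slow_query_log = 'ON';

-- Log statements that run longer than 1 second (illustrative threshold).
SET GLOBAL long_query_time = 1;

-- Show where the slow query log is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```

In the EXPLAIN output, the `type`, `key`, and `rows` columns show the access method, the index actually chosen, and the estimated number of rows examined. The file reported by the last statement is the one to pass to mysqldumpslow.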
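SQL performance optimization examples
Case study: join operations in PostgreSQL (PG)
Query the status of the 50%-off promotion for the valid group-buy deals of shop "zhengzhou_4591".
The following two SQL statements return exactly the same result, but according to their query execution plans SQL1 takes about 3100 ms while SQL2 takes about 300 ms, a roughly 10x difference in execution efficiency.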
SQL1:
SELECT
    t.id, t2.seq, f.status
FROM (
    SELECT DISTINCT id, partner_id
    FROM team
    WHERE publish = 'Y'
      AND expire_time >= extract(epoch FROM date 'today' AT TIME ZONE '0')::BIGINT
      AND begin_time < extract(epoch FROM CURRENT_TIMESTAMP)::BIGINT
      AND end_time >= extract(epoch FROM date 'today' AT TIME ZONE '0')::BIGINT
      AND string_to_array(new_type, ',') && ARRAY['hms', 'normal', 'purchase_option']
      AND group_id = 4
      AND p_team_id <= 0
      AND new_type = 'normal'
      AND delivery = 'through_coupon'
      AND close_time = 0
) AS t
JOIN through_shop ts ON t.id = ts.team_id
JOIN (SELECT * FROM partner_shop ps WHERE ps.seq = 'zhengzhou_4591') t2 ON t2.id = ts.shop_id
JOIN five_discount_hotel f ON t2.seq = f.seq
LIMIT 100 OFFSET 0
SQL2:
SELECT
    t.id, t2.seq, f.status
FROM (SELECT * FROM partner_shop ps WHERE ps.seq = 'zhengzhou_4591') t2
JOIN through_shop ts ON t2.id = ts.shop_id
JOIN team t ON t.id = ts.team_id
JOIN five_discount_hotel f ON t2.seq = f.seq
WHERE publish = 'Y'
  AND expire_time >= extract(epoch FROM date 'today' AT TIME ZONE '0')::BIGINT
  AND begin_time < extract(epoch FROM CURRENT_TIMESTAMP)::BIGINT
  AND end_time >= extract(epoch FROM date 'today' AT TIME ZONE '0')::BIGINT
  AND string_to_array(new_type, ',') && ARRAY['hms', 'normal', 'purchase_option']
  AND group_id = 4
  AND p_team_id <= 0
  AND new_type = 'normal'
  AND delivery = 'through_coupon'
  AND close_time = 0
LIMIT 100
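The timings quoted for SQL1 and SQL2 are read from the query execution plan. The text does not say exactly how the plans were obtained; one common way in PostgreSQL, assumed here, is EXPLAIN with the ANALYZE option. The statement below is a shortened, illustrative variant of SQL2 built from the same tables, not the original measurement.

```sql
-- EXPLAIN ANALYZE actually executes the statement and reports real timings,
-- so run it against a database where executing the query is acceptable.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.id, ps.seq
FROM partner_shop ps
JOIN through_shop ts ON ps.id = ts.shop_id
JOIN team t ON t.id = ts.team_id
WHERE ps.seq = 'zhengzhou_4591';
```

Each plan node in the output shows both the estimated and the actual row count, and the total execution time appears at the end. Comparing the actual row counts of the first nodes executed in each statement is what reveals a difference in join order.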
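Difference analysis
The filter conditions are the same in both statements and all of them hit indexes effectively, so the time difference comes mainly from the join order.
Detailed cause analysis
In SQL1, we first fetch most of the team_id and partner_id values stored in the team table, 115220 records in total;
In SQL2, we start from the known hotel_seq field (the seq filter on partner_shop) and obtain only 8 rows.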
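The two row counts above can be checked directly by counting the result of the first step of each statement. This is a sketch using the tables and filters from SQL1 and SQL2; the text does not state at which exact step the 8 rows were counted, so counting partner_shop alone is an assumption.

```sql
-- Size of the intermediate result that SQL1 starts from:
-- the filtered, de-duplicated team subquery.
SELECT count(*)
FROM (
    SELECT DISTINCT id, partner_id
    FROM team
    WHERE publish = 'Y'
      AND expire_time >= extract(epoch FROM date 'today' AT TIME ZONE '0')::BIGINT
      AND begin_time < extract(epoch FROM CURRENT_TIMESTAMP)::BIGINT
      AND end_time >= extract(epoch FROM date 'today' AT TIME ZONE '0')::BIGINT
      AND group_id = 4
      AND p_team_id <= 0
      AND new_type = 'normal'
      AND delivery = 'through_coupon'
      AND close_time = 0
) AS t;

-- Size of the intermediate result that SQL2 starts from:
-- the shops matching the known seq value (assumed counting step).
SELECT count(*)
FROM partner_shop ps
WHERE ps.seq = 'zhengzhou_4591';
```

Every row in the starting set has to be matched against through_shop in the next join, so a starting set of a few rows leaves far less work for the later joins than one of over a hundred thousand rows.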
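Summary
Reduce the number of rows in intermediate result sets as early as possible, so that the subsequent joins run more efficiently.
Every SQL statement should cut the number of scanned rows as much as possible right at the start. Whether the access is an index scan or a full table scan, make sure the first condition the database evaluates filters out as many irrelevant rows as possible, which reduces the cost of all later scanning and computation.
Note: in practice, PG applies some optimizations to join operations and also optimizes the order in which conditions are evaluated. Even so, the measured results above show that the gap in execution efficiency can still be huge.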
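To see how much the planner's own join reordering is involved, PostgreSQL exposes the `join_collapse_limit` setting; when it is set to 1, explicit JOINs are executed in the order they are written. The sketch below is an assumption about how one might investigate this, not something the original measurement did, and it uses a shortened variant of SQL2.

```sql
-- Current value; the PostgreSQL default is 8.
SHOW join_collapse_limit;

BEGIN;
-- Limit the change to this transaction only.
SET LOCAL join_collapse_limit = 1;

-- With the limit at 1, the plan follows the written join order:
-- partner_shop first, then through_shop, then team.
EXPLAIN
SELECT t.id, ps.seq
FROM partner_shop ps
JOIN through_shop ts ON ps.id = ts.shop_id
JOIN team t ON t.id = ts.team_id
WHERE ps.seq = 'zhengzhou_4591';

ROLLBACK;
```

Comparing this plan with the one produced under the default setting shows whether the planner would have chosen the same order on its own. A subquery that uses DISTINCT, as in SQL1, is also not merged into the outer query by the planner, which may be one reason the written order mattered so much in this case.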