performance_schema 在实际MySQL管理中有很多用途,第一篇先来看如何利用等待事件排查MySQL性能问题,以及如何查看多线程复制报错详情。
一、 利用等待事件排查MySQL性能问题
以下我们以sysbench压测MySQL数据库为例,介绍如何使用performance_schema的等待事件来排查数据库性能瓶颈。
首先,使用performance_schema配置表启用等待事件的采集与记录
-- 启用所有的等待事件的instruments
use performance_schema
Database changed
-- 修改setup_instruments 表的enabled和timed字段为yes,表示启用对应的instruments
update setup_instruments set enabled='yes',timed='yes' where name like 'wait/%';
Query OK, 269 rows affected (0.00 sec)
Rows matched: 323 Changed: 269 Warnings: 0
-- 查看修改结果,enabled和timed字段为YES即表示当前instruments已经启用,但此时采集器并不会立即采集事件数据,需要对应consumers启用后才会开始采集
select * from setup_instruments where name like 'wait/%';
+--------------------------------------------------------------------+---------+-------+
| NAME | ENABLED | TIMED |
+--------------------------------------------------------------------+---------+-------+
| wait/synch/mutex/sql/TC_LOG_MMAP::LOCK_tc | YES | YES |
| wait/synch/mutex/sql/LOCK_des_key_file | YES | YES |
............
| wait/io/socket/sql/server_tcpip_socket | YES | YES |
| wait/io/socket/sql/server_unix_socket | YES | YES |
| wait/io/socket/sql/client_connection | YES | YES |
| wait/lock/metadata/sql/mdl | YES | YES |
+--------------------------------------------------------------------+---------+-------+
323 rows in set (0.01 sec)
-- 启用等待事件的consumers
update setup_consumers set enabled='yes' where name like '%wait%';
Query OK, 3 rows affected (0.00 sec)
Rows matched: 3 Changed: 3 Warnings: 0
select * from setup_consumers where name like '%wait%';
+---------------------------+---------+
| NAME | ENABLED |
+---------------------------+---------+
| events_waits_current | YES |
| events_waits_history | YES |
| events_waits_history_long | YES |
+---------------------------+---------+
3 rows in set (0.00 sec)
然后,使用sysbench对数据库执行加压,并逐渐增加并发线程数,直到tps、qps不再随着线程数的增加而增加为止。
sysbench --test=oltp --db-driver=mysql --mysql-table-engine=innodb --mysql-host=10.10.10.10 --mysql-port=3306 --mysql-db=sbtest --mysql-user='qbench' --mysql-password='qbench' --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-table-size=5000000 --oltp-tables-count=8 --num-threads=16 --max-time=1800 --max-requests=0 --report-interval=1 run
............
[ 111s] threads: 16, tps: 52.99, reads/s: 668.93, writes/s: 171.98, response time: 629.52ms (95%)
[ 112s] threads: 16, tps: 42.00, reads/s: 650.93, writes/s: 202.98, response time: 688.46ms (95%)
............
从sysbench的输出结果中,我们可以看到在16个并发线程oltp压力下,tps只能跑到100不到,且延迟在600ms+,说明存在严重的性能瓶颈(MySQL内部发生了严重的互斥等待或者硬件设备严重的性能不足)。
先使用操作系统命令查看硬件负载情况。
# top命令查看到CPU资源绝大部分都消耗在了%wa上,说明IO设备性能出现严重不足
top - 18:59:03 up 7:02, 3 users, load average: 4.28, 5.82, 4.22
Tasks: 186 total, 1 running, 185 sleeping, 0 stopped, 0 zombie
Cpu0 : 4.1%us, 8.5%sy, 0.0%ni, 11.9%id, 75.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 4.0%us, 13.1%sy, 0.0%ni, 17.5%id, 65.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu2 : 9.4%us, 32.1%sy, 0.0%ni, 2.3%id, 55.5%wa, 0.0%hi, 0.7%si, 0.0%st
Cpu3 : 3.0%us, 5.3%sy, 0.0%ni, 31.0%id, 60.0%wa, 0.0%hi, 0.7%si, 0.0%st
Mem: 8053664k total, 1684236k used, 6369428k free, 87868k buffers
Swap: 2031612k total, 0k used, 2031612k free, 150680k cached
# iostat命令查看磁盘负载,通过%util列可以看到,磁盘处于100%满负载状态
avg-cpu: %user %nice %system %iowait %steal %idle
1.77 0.00 2.28 95.70 0.00 0.25
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-2 0.00 0.00 277.00 160.00 8864.00 2774.00 26.63 47.84 112.98 2.29 100.10
avg-cpu: %user %nice %system %iowait %steal %idle
5.05 0.00 11.62 64.14 0.00 19.19
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-2 0.00 0.00 267.00 244.00 8544.00 4643.00 25.81 28.20 40.29 1.96 100.00
通过查询系统负载,一眼就可以看出来是由于磁盘性能严重不足导致的,但是,在数据库内部的事件信息是如何体现的呢(注意:如果你没有足够的performance_schema使用经验,此时是绝好的学习积累的机会,不要错过,也许哪一天操作系统负载并不能看出来端倪的时候,这些事件信息能帮上大忙)
-- 如果故障当前还在发生,查询events_waits_current表,可以看到下面top等待事件均为IO相关
select THREAD_ID,EVENT_NAME,sys.format_time(TIMER_WAIT),INDEX_NAME,NESTING_EVENT_TYPE,OPERATION,NUMBER_OF_BYTES from events_waits_current where EVENT_NAME!='idle' order by TIMER_WAIT desc;
+-----------+------------------------------------------------+-----------------------------+------------+--------------------+------------+-----------------+
| THREAD_ID | EVENT_NAME | sys.format_time(TIMER_WAIT) | INDEX_NAME | NESTING_EVENT_TYPE | OPERATION | NUMBER_OF_BYTES |
+-----------+------------------------------------------------+-----------------------------+------------+--------------------+------------+-----------------+
| 115 | wait/io/table/sql/handler | 169.48 ms | PRIMARY | STATEMENT | fetch | 39 |
| 115 | wait/io/file/innodb/innodb_data_file | 169.48 ms | NULL | WAIT | read | 16384 |
| 101 | wait/io/table/sql/handler | 93.76 ms | PRIMARY | STATEMENT | fetch | 39 |
| 101 | wait/io/file/innodb/innodb_data_file | 93.76 ms | NULL | WAIT | read | 16384 |
| 111 | wait/io/file/innodb/innodb_data_file | 73.08 ms | NULL | STATEMENT | read | 16384 |
| 103 | wait/io/file/innodb/innodb_data_file | 63.13 ms | NULL | STATEMENT | read | 16384 |
| 106 | wait/io/file/innodb/innodb_data_file | 53.24 ms | NULL | STATEMENT | read | 16384 |
| 113 | wait/io/table/sql/handler | 51.90 ms | PRIMARY | STATEMENT | fetch | 39 |
| 113 | wait/io/file/innodb/innodb_data_file | 51.90 ms | NULL | WAIT | read | 16384 |
| 49 | wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond | 27.48 ms | NULL | STATEMENT | timed_wait | NULL |
............
57 rows in set (0.00 sec)
也可以查询汇总值
-- 为方便查询等待事件统计,可以先创建一个视图,用于实时统计当前等待事件(非历史数据)
create view sys.test_waits as select sum(TIMER_WAIT) as TIMER_WAIT,sum(NUMBER_OF_BYTES) as NUMBER_OF_BYTES, EVENT_NAME,OPERATION from events_waits_current where EVENT_NAME!='idle' group by EVENT_NAME,OPERATION;
Query OK, 0 rows affected (0.04 sec)
-- 使用前面创建的视图进行查询,可以看到时间开销排名前5的有4个都是与IO相关的等待,剩下1个是binlog相关的互斥等待
select sys.format_time(TIMER_WAIT),sys.format_bytes(NUMBER_OF_BYTES),EVENT_NAME,OPERATION from sys.test_waits where sys.format_time(TIMER_WAIT) not regexp 'ns|us' order by TIMER_WAIT desc;
+-----------------------------+-----------------------------------+------------------------------------------------+------------+
| sys.format_time(TIMER_WAIT) | sys.format_bytes(NUMBER_OF_BYTES) | EVENT_NAME | OPERATION |
+-----------------------------+-----------------------------------+------------------------------------------------+------------+
| 16.60 s | 224.00 KiB | wait/io/file/innodb/innodb_data_file | read |
| 16.05 s | 553 bytes | wait/io/table/sql/handler | fetch |
| 1.96 s | NULL | wait/io/file/sql/binlog | sync |
| 1.96 s | NULL | wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond | timed_wait |
| 1.85 s | 1.34 KiB | wait/io/file/sql/binlog | write |
| 56.66 ms | NULL | wait/io/file/innodb/innodb_log_file | sync |
+-----------------------------+-----------------------------------+------------------------------------------------+------------+
6 rows in set (0.01 sec)
从上述等待事件的查询结果中,可以非常清晰地看到,事务大多数的延迟时间花在了等待IO上,说明IO设备可能出现了严重的性能瓶颈,这里与操作系统命令查看到的磁盘性能严重不足相对应。
通过以上测试数据表明,MySQL的性能严重低下的原因是磁盘性能严重不足成为了瓶颈(一般情况下,4c的cpu在内存和磁盘不构成瓶颈的情况下可以达到800+ tps才可能会构成瓶颈)
针对IO性能不足,建议优化策略:
- 更换IO性能更好的设备
- 把MySQL 中的redo log、binlog、其他data file分别放在3个独立的IO设备上,以便数据库中的随机IO和顺序IO不会因为争抢资源而导致IO等待
二、 查看多线程复制报错详情
官方MySQL 从5.6版本开始支持基于库级别的并行复制,在MySQL 5.7版本中支持基于事务的并行复制。在启用了并行复制之后,如果发生复制报错,通过show slave status语句无法查看到报错详情(show slave status语句只能查看到SQL线程的报错信息,而在多线程复制下,SQL线程的报错信息是根据worker线程的报错信息的一个汇总),类似:
admin@localhost : (none) 12:45:19> show slave status\G;
............
Last_Errno: 1062
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '23fb5832-e4bc-11e7-8ea4-525400a4b2e1:2553990' at master log mysql-bin.000034, end_log_pos 98797. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
............
Last_SQL_Errno: 1062
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '23fb5832-e4bc-11e7-8ea4-525400a4b2e1:2553990' at master log mysql-bin.000034, end_log_pos 98797. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
............
1 row in set (0.00 sec)
根据报错提示查看performance_schema.replication_applier_status_by_worker表,该表中详细记录了每一个worker线程的详细信息,可以找到发生报错的worker线程具体的报错原因:
select * from performance_schema.replication_applier_status_by_worker where LAST_ERROR_MESSAGE!=''\G;
*************************** 1. row ***************************
CHANNEL_NAME:
WORKER_ID: 2
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: 23fb5832-e4bc-11e7-8ea4-525400a4b2e1:2553991
LAST_ERROR_NUMBER: 1062
LAST_ERROR_MESSAGE: Worker 2 failed executing transaction '23fb5832-e4bc-11e7-8ea4-525400a4b2e1:2553991' at master log mysql-bin.000034, end_log_pos 99514; Could not execute Write_rows event on table sbtest.sbtest4; Duplicate entry '833353' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 99514
LAST_ERROR_TIMESTAMP: 2018-01-02 14:08:58
1 row in set (0.00 sec)
这里我们可以看到真正的报错是,Duplicate entry '833353' for key 'PRIMARY', Error_code: 1062; 即主键冲突。
由于历史原因,performance_schema中的复制信息记录表只记录跟GTID相关的信息,而在mysql系统字典库下的slave_master_info、slave_relay_log_info、slave_worker_info表记录binlog position相关信息。另外,如果选择相关的复制信息记录到文件,那么磁盘上还存在着master.info、relay_log.info等文件记录binlog position相关的信息。
参考
《MySQL 性能优化金字塔法则》