1. Check the error information on the replica: show slave status \G
(hxbmysqladmin@localhost) [(none)]> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 103.160.109.66
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 185303708
Relay_Log_File: kfcsfatsdb2-relay-bin.000006
Relay_Log_Pos: 7243050
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'caf16e55-32f0-11e8-8f98-0050560060d9:1332563' at master log mysql-bin.000003, end_log_pos 7243787. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 7243355
Relay_Log_Space: 185304169
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'caf16e55-32f0-11e8-8f98-0050560060d9:1332563' at master log mysql-bin.000003, end_log_pos 7243787. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10966
Master_UUID: caf16e55-32f0-11e8-8f98-0050560060d9
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 180806 08:50:59
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: caf16e55-32f0-11e8-8f98-0050560060d9:126-1658796
Executed_Gtid_Set: 01a12827-32f1-11e8-974f-0050562ae720:1-99,
caf16e55-32f0-11e8-8f98-0050560060d9:1-1332562
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
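The three values needed for the next step (the failing GTID, the master binlog file, and the end_log_pos) are all embedded in the Last_SQL_Error text. As a minimal sketch, they can be pulled out with sed. The helper name parse_sql_error is hypothetical, and the pattern assumes the message wording shown above.

```shell
# Sketch: extract "GTID binlog_file end_log_pos" from SHOW SLAVE STATUS\G output.
# parse_sql_error is a hypothetical helper name, not part of MySQL.
parse_sql_error() {
  # Reads the status output on stdin and prints one line for Last_SQL_Error.
  sed -n "s/.*Last_SQL_Error:.*transaction '\([^']*\)' at master log \([^,]*\), end_log_pos \([0-9]*\).*/\1 \2 \3/p"
}

# Example usage (assumes credentials come from an option file):
#   mysql -e 'show slave status\G' | parse_sql_error
```

For the output above this would print the GTID ending in :1332563, the file mysql-bin.000003 and the position 7243787.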
2. Parse the master's binlog to see the details of the failing transaction: mysqlbinlog --no-defaults -v -v --base64-output=decode-rows /mysql/mydata/3306/binlog/mysql-bin.000003 | grep -A '10' 7243787
[mysql@kfcsfatsdb1 binlog]$ mysqlbinlog --no-defaults -v -v --base64-output=decode-rows /mysql/mydata/3306/binlog/mysql-bin.000003 | grep -A '10' 7243787
#180730 14:16:50 server id 10966 end_log_pos 7243787 CRC32 0xebf1a3b1 Delete_rows: table id 109 flags: STMT_END_F
### DELETE FROM `fatsys`.`qrtz_scheduler_state`
### WHERE
### @1='UniEAP_Scheduler' /* VARSTRING(360) meta=360 nullable=0 is_null=0 */
### @2='kfcsfatszk21532566880231' /* VARSTRING(600) meta=600 nullable=0 is_null=0 */
# at 7243787
#180730 14:16:50 server id 10966 end_log_pos 7243818 CRC32 0x3c797654 Xid = 299591
COMMIT/*!*/;
# at 7243818
#180730 14:16:54 server id 10966 end_log_pos 7243883 CRC32 0xbb09e745 GTID last_committed=14568 sequence_number=14569 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'caf16e55-32f0-11e8-8f98-0050560060d9:1332564'/*!*/;
# at 7243883
#180730 14:16:54 server id 10966 end_log_pos 7243957 CRC32 0xa11d8870 Query thread_id=30 exec_time=0 error_code=0
SET TIMESTAMP=1532931414/*!*/;
BEGIN
[mysql@kfcsfatsdb1 binlog]$
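Grepping for a bare position number also matches any other line that happens to contain those digits, and it cuts the transaction off after ten lines. As an alternative sketch, mysqlbinlog's --include-gtids option limits the decoded output to the GTID named in the error. The helper name show_gtid_txn and the MYSQLBINLOG override variable are assumptions made for illustration.

```shell
# Sketch: decode only the transaction with the given GTID.
# show_gtid_txn is a hypothetical helper; MYSQLBINLOG optionally overrides the binary.
show_gtid_txn() {
  # usage: show_gtid_txn BINLOG_FILE GTID
  "${MYSQLBINLOG:-mysqlbinlog}" --no-defaults -v -v \
    --base64-output=decode-rows --include-gtids="$2" "$1"
}

# Example usage on the master:
#   show_gtid_txn /mysql/mydata/3306/binlog/mysql-bin.000003 \
#     'caf16e55-32f0-11e8-8f98-0050560060d9:1332563'
```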
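The log above shows that the failure was caused by a DELETE statement: when the replica applies it, the row to be deleted cannot be found (error 1032).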
(hxbmysqladmin@localhost) [fatsys]> DESC qrtz_scheduler_state;
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| sched_name | varchar(120) | NO | PRI | NULL | |
| instance_name | varchar(200) | NO | PRI | NULL | |
| last_checkin_time | bigint(13) | NO | | NULL | |
| checkin_interval | bigint(13) | NO | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
SELECT * FROM `fatsys`.`qrtz_scheduler_state` WHERE sched_name= 'UniEAP_Scheduler' AND instance_name='kfcsfatszk21532566880231'
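Checking on the replica confirms that the row is indeed missing. Insert the row manually on the replica, then restart the slave threads.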
INSERT INTO qrtz_scheduler_state VALUES ('UniEAP_Scheduler','kfcsfatszk21532566880231',1532931402891,20000);
(hxbmysqladmin@localhost) [fatsys]> SELECT * FROM `fatsys`.`qrtz_scheduler_state` WHERE sched_name= 'UniEAP_Scheduler' AND instance_name='kfcsfatszk21532566880231';
+------------------+--------------------------+-------------------+------------------+
| sched_name | instance_name | last_checkin_time | checkin_interval |
+------------------+--------------------------+-------------------+------------------+
| UniEAP_Scheduler | kfcsfatszk21532566880231 | 1532931402891 | 20000 |
+------------------+--------------------------+-------------------+------------------+
1 row in set (0.00 sec)
stop slave;
start slave;
(hxbmysqladmin@localhost) [fatsys]> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 103.160.109.66
Master_User: sync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 185319370
Relay_Log_File: kfcsfatsdb2-relay-bin.000006
Relay_Log_Pos: 9278084
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 9278389
Relay_Log_Space: 185320344
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 576785
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 10966
Master_UUID: caf16e55-32f0-11e8-8f98-0050560060d9
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Waiting for dependent transaction to commit
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: caf16e55-32f0-11e8-8f98-0050560060d9:126-1658823
Executed_Gtid_Set: 01a12827-32f1-11e8-974f-0050562ae720:1-99,
caf16e55-32f0-11e8-8f98-0050560060d9:1-1336537
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
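Only a handful of fields in this output matter when checking whether replication has recovered. The sketch below filters the status output down to those fields. The helper name slave_health is hypothetical. The large Seconds_Behind_Master value above (576785 seconds, roughly 6.7 days) is expected while the replica works through the backlog and should shrink over time.

```shell
# Sketch: keep only the fields that show whether replication is healthy.
# slave_health is a hypothetical helper name.
slave_health() {
  # Reads SHOW SLAVE STATUS\G output on stdin.
  grep -E '^ *(Slave_IO_Running|Slave_SQL_Running|Last_IO_Errno|Last_SQL_Errno|Seconds_Behind_Master|Exec_Master_Log_Pos):'
}

# Example usage (assumes credentials come from an option file):
#   mysql -e 'show slave status\G' | slave_health
```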
(hxbmysqladmin@localhost) [fatsys]> SELECT * FROM `fatsys`.`qrtz_scheduler_state` WHERE sched_name= 'UniEAP_Scheduler' AND instance_name='kfcsfatszk21532566880231';
Empty set (0.00 sec)
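The row has now been deleted on the replica, so the failing transaction was applied and replication moved past it. After replicating for a while, however, another error appeared. Parsing the binlog gives the following result: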
mysqlbinlog --no-defaults --base64-output=decode-rows -v /mysql/mydata/3306/binlog/mysql-bin.000003 | grep -A '10' 18899630
[mysql@kfcsfatsdb1 binlog]$ mysqlbinlog --no-defaults --base64-output=decode-rows -v /mysql/mydata/3306/binlog/mysql-bin.000003 | grep -A '10' 18899630
#180731 10:35:11 server id 10966 end_log_pos 18899630 CRC32 0x821bb81e Write_rows: table id 115 flags: STMT_END_F
### INSERT INTO `fatsys`.`t_trans_purchase_record`
### SET
### @1='20291109104309003288'
### @2='20291109100411003287'
### @4='1'
### @5='1'
### @6='6226300119173549'
### @8='623002'
### @9=6700.00
### @10=6600.00
--
# at 18899630
# at 18899755
#180731 10:35:11 server id 10966 end_log_pos 18899929 CRC32 0xaf7f42f6 Table_map: `fatsys`.`t_trans_purchase_record` mapped to number 115
# at 18899929
#180731 10:35:11 server id 10966 end_log_pos 18899998 CRC32 0xcb566154 Update_rows: table id 115 flags: STMT_END_F
### UPDATE `fatsys`.`t_trans_purchase_record`
### WHERE
### @1='20291109104309003288'
### SET
### @25='5'
# at 18899998
[mysql@kfcsfatsdb1 binlog]$
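This time an INSERT statement is the cause. Because the error message in show slave status is not detailed enough, look in the related performance_schema tables for more useful information.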
(hxbmysqladmin@localhost) [performance_schema]> show tables like '%replica%';
+-------------------------------------------+
| Tables_in_performance_schema (%replica%) |
+-------------------------------------------+
| replication_applier_configuration |
| replication_applier_status |
| replication_applier_status_by_coordinator |
| replication_applier_status_by_worker |
| replication_connection_configuration |
| replication_connection_status |
| replication_group_member_stats |
| replication_group_members |
+-------------------------------------------+
8 rows in set (0.00 sec)
(hxbmysqladmin@localhost) [performance_schema]> select * from replication_applier_status_by_worker \G;
*************************** 1. row ***************************
CHANNEL_NAME:
WORKER_ID: 1
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: caf16e55-32f0-11e8-8f98-0050560060d9:1355657
LAST_ERROR_NUMBER: 1677
LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'caf16e55-32f0-11e8-8f98-0050560060d9:1355657' at master log mysql-bin.000003, end_log_pos 18899630; Column 17 of table 'fatsys.t_trans_purchase_record' cannot be converted from type 'decimal(16,2)' to type 'decimal(16,4)'
LAST_ERROR_TIMESTAMP: 2018-08-06 10:52:16
*************************** 2. row ***************************
CHANNEL_NAME:
WORKER_ID: 2
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION:
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 3. row ***************************
CHANNEL_NAME:
WORKER_ID: 3
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION:
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
*************************** 4. row ***************************
CHANNEL_NAME:
WORKER_ID: 4
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION:
LAST_ERROR_NUMBER: 0
LAST_ERROR_MESSAGE:
LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
4 rows in set (0.00 sec)
ERROR:
No query specified
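Most worker rows are empty, so it can help to list only the workers that recorded an error. The following is a minimal sketch of such a query wrapped in a shell function. The helper name failed_workers and the MYSQL override variable are assumptions, and connection credentials are assumed to come from an option file.

```shell
# Sketch: show only applier workers that recorded an error.
# failed_workers is a hypothetical helper; MYSQL optionally overrides the client binary.
failed_workers() {
  "${MYSQL:-mysql}" -e "SELECT worker_id, last_seen_transaction, last_error_number, last_error_message FROM performance_schema.replication_applier_status_by_worker WHERE last_error_number <> 0\G"
}

# Example usage on the replica:
#   failed_workers
```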
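The error is very clear: Column 17 of table 'fatsys.t_trans_purchase_record' cannot be converted from type 'decimal(16,2)' to type 'decimal(16,4)'
Checking the table structure on the replica shows nothing obviously wrong. (Assuming the column number in the message is zero-based, column 17 is annual_rate, which the replica defines as decimal(16,4).)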
(hxbmysqladmin@localhost) [fatsys]> desc `fatsys`.`t_trans_purchase_record`
-> ;
+------------------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+---------------+------+-----+---------+-------+
| sys_serialno | varchar(24) | NO | PRI | NULL | |
| order_serialno | varchar(24) | YES | | NULL | |
| thd_serialno | varchar(24) | YES | | NULL | |
| wag_flg | varchar(1) | NO | | NULL | |
| account_type | varchar(1) | YES | | NULL | |
| account_no | varchar(32) | NO | | NULL | |
| product_name | varchar(40) | YES | | NULL | |
| product_code | varchar(32) | NO | | NULL | |
| trans_share | decimal(16,2) | NO | | NULL | |
| trans_amt | decimal(16,2) | NO | | NULL | |
| busi_type | varchar(1) | YES | | NULL | |
| sell_charge | decimal(16,2) | YES | | 0.00 | |
| buy_charge | decimal(16,2) | YES | | 0.00 | |
| trans_time | varchar(6) | YES | | NULL | |
| trans_date | varchar(8) | YES | | NULL | |
| share_account_date | varchar(8) | YES | | NULL | |
| product_end_date | varchar(8) | YES | | NULL | |
| annual_rate | decimal(16,4) | YES | | 0.0000 | |
| cust_name | varchar(60) | YES | | NULL | |
| remain_days | varchar(6) | YES | | NULL | |
| trans_channl | varchar(4) | YES | | NULL | |
| channl_date | varchar(8) | YES | | NULL | |
| channl_time | varchar(9) | YES | | NULL | |
| channl_serialno | varchar(32) | YES | | NULL | |
| trans_status | varchar(1) | NO | | NULL | |
| curr_type | varchar(3) | YES | | NULL | |
| check_status | varchar(1) | YES | | NULL | |
| ret_code | varchar(6) | YES | | NULL | |
| ret_msg | varchar(255) | YES | | NULL | |
| pre_code | varchar(6) | YES | | NULL | |
| pre_msg | varchar(255) | YES | | NULL | |
| remark | varchar(32) | YES | | NULL | |
| phone_no | varchar(11) | YES | | NULL | |
| trans_code | varchar(6) | YES | | NULL | |
| lc_unusual_query_count | int(11) | YES | | 0 | |
+------------------------+---------------+------+-----+---------+-------+
35 rows in set (0.00 sec)
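This situation usually means someone has altered the table structure by hand. The developers confirmed that the column precision had indeed been changed.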
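A mismatch like this can also be confirmed without asking anyone, by comparing the column definitions on both servers. The sketch below dumps column names and types from information_schema.columns on each host and diffs them. The helper names dump_columns and compare_columns are hypothetical, and the host arguments and option-file credentials are assumptions.

```shell
# Sketch: diff one table's column definitions between two servers.
# dump_columns and compare_columns are hypothetical helpers.
dump_columns() {
  # usage: dump_columns HOST SCHEMA TABLE
  "${MYSQL:-mysql}" -h "$1" -N -B -e "SELECT ordinal_position, column_name, column_type FROM information_schema.columns WHERE table_schema='$2' AND table_name='$3' ORDER BY ordinal_position"
}

compare_columns() {
  # usage: compare_columns MASTER_HOST REPLICA_HOST SCHEMA TABLE
  tmp_master=$(mktemp) || return 2
  tmp_replica=$(mktemp) || return 2
  dump_columns "$1" "$3" "$4" > "$tmp_master"
  dump_columns "$2" "$3" "$4" > "$tmp_replica"
  rc=0
  diff "$tmp_master" "$tmp_replica" || rc=$?   # 0 = identical, 1 = differences
  rm -f "$tmp_master" "$tmp_replica"
  return "$rc"
}

# Example usage (replica host name is a placeholder):
#   compare_columns 103.160.109.66 REPLICA_HOST fatsys t_trans_purchase_record
```

Lines prefixed with < come from the master and lines prefixed with > from the replica.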
stop slave;
set global slave_type_conversions=ALL_NON_LOSSY;
start slave;
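Possible values of slave_type_conversions:
ALL_LOSSY: permits conversions that may truncate data.
ALL_NON_LOSSY: permits only conversions that do not truncate data. Replication proceeds when the replica's column type is wider than the master's; in the reverse case the replica reports a replication error and replication stops.
ALL_LOSSY,ALL_NON_LOSSY: all supported conversions are performed, whether or not data is lost.
Empty (not set): the column types on the master and the replica must match exactly, otherwise an error is reported.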
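The stop, set and start sequence can be wrapped in one call that also reads the variable back to confirm it took effect. This is a minimal sketch: the helper name set_type_conversions and the MYSQL override variable are assumptions, and credentials are assumed to come from an option file.

```shell
# Sketch: change slave_type_conversions and restart the slave threads.
# set_type_conversions is a hypothetical helper; MYSQL optionally overrides the client binary.
set_type_conversions() {
  # usage: set_type_conversions VALUE   (for example ALL_NON_LOSSY)
  "${MYSQL:-mysql}" -e "STOP SLAVE; SET GLOBAL slave_type_conversions='$1'; START SLAVE; SHOW GLOBAL VARIABLES LIKE 'slave_type_conversions';"
}

# Example usage on the replica:
#   set_type_conversions ALL_NON_LOSSY
```

SET GLOBAL does not survive a server restart. To keep the setting, also add slave_type_conversions=ALL_NON_LOSSY under the [mysqld] section of the replica's option file.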