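I. Background:
The replica (slave) reports an error, or data was deleted by mistake, and it can no longer replicate from the master. For example: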
MariaDB [test]> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.135
Master_User: jfedu-cong2
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: jfedu-zhu2.000005
Read_Master_Log_Pos: 446
Relay_Log_File: mariadb-relay-bin.000022
Relay_Log_Pos: 747
Relay_Master_Log_File: jfedu-zhu2.000004
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'zabbix'. Query: 'INSERT INTO auditlog (userid,clock,ip,action,resourcetype,details,auditid) VALUES ('1','1619014990','192.168.10.1','3','0','Login failed.','1')'
Skip_Counter: 0
Exec_Master_Log_Pos: 462
Relay_Log_Space: 18224867
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'zabbix'. Query: 'INSERT INTO auditlog (userid,clock,ip,action,resourcetype,details,auditid) VALUES ('1','1619014990','192.168.10.1','3','0','Login failed.','1')'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)
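The fields that matter in the output above are Slave_IO_Running, Slave_SQL_Running and Last_SQL_Errno. As a minimal sketch, they can be pulled out of the status text with awk. The function name slave_health is hypothetical, and the sketch assumes the text of show slave status \G arrives on standard input (for example piped from the mysql client).

```shell
#!/bin/sh
# slave_health: hypothetical helper, not part of MariaDB.
# Reads "show slave status \G" text on stdin. Prints OK when both
# replication threads report Yes; otherwise prints the thread states
# and the last SQL error number and returns non-zero.
slave_health() {
    awk -F': ' '
        # Strip leading spaces so padded field names also match.
        { sub(/^[ \t]+/, "", $1) }
        $1 == "Slave_IO_Running"  { io = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        $1 == "Last_SQL_Errno"    { errno = $2 }
        END {
            if (io == "Yes" && sql == "Yes") { print "OK"; exit 0 }
            printf "BROKEN io=%s sql=%s errno=%s\n", io, sql, errno
            exit 1
        }
    '
}
```

Fed the output shown above, this would report the IO thread as running, the SQL thread as stopped, and error 1062.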
II. Recovery and troubleshooting
1. On the master, dump the full data set to a file
mysqldump -uroot -p -A --master-data=2 >/data/mysql_all.sql
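The --master-data option writes the binary log coordinates into the output file, here mysql_all.sql. With this option the dump contains a CHANGE MASTER TO MASTER_LOG_FILE='xxxx.xxx', MASTER_LOG_POS=xx record (with the value 2 it is written as an SQL comment), which tells the later incremental replication where to start.
- In addition, run an insert on the master at this point to simulate a master that keeps receiving updates: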
MariaDB [(none)]> use jfedu_tb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [jfedu_tb]> select * from jfedu;
+------+----------+
| id | name |
+------+----------+
| 3 | xiaoming |
| 12 | |
| 23 | |
| 11 | xiaohong |
| 11 | xiaohong |
| 33 | ming |
| 232 | sds |
+------+----------+
7 rows in set (0.00 sec)
MariaDB [jfedu_tb]> insert into jfedu values(232,"sds");
Query OK, 1 row affected (0.00 sec)
2. Copy the file to the replica host and import it
[root@node2 ~]# scp /data/mysql_all.sql node1:/data/
root@node1's password:
mysql_all.sql 100% 506KB 63.2MB/s 00:00
[root@node2 ~]#
Import the full dump on the replica:
MariaDB [test]> source /data/mysql_all.sql
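3. On the replica, set the master log file and position so that replication starts from the point where the full backup was taken
- 1) On the master, find the position recorded at the time of the full backup. First confirm the binlog file name in the MySQL data directory: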
[root@node2 mysql]# ls -ltr
总用量 93328
drwx------ 2 mysql mysql 6 4月 3 00:53 test
drwx------ 2 mysql mysql 4096 4月 3 00:53 performance_schema
-rw-rw---- 1 mysql mysql 2224 4月 3 04:32 jfedu-zhu2.000001
-rw-rw---- 1 mysql mysql 52 4月 3 04:32 aria_log_control
-rw-rw---- 1 mysql mysql 16384 4月 3 04:32 aria_log.00000001
-rw-rw---- 1 mysql mysql 245 4月 3 04:33 jfedu-zhu2.000002
-rw-rw---- 1 mysql mysql 21719487 4月 21 00:18 jfedu-zhu2.000003
-rw-rw---- 1 mysql mysql 5242880 4月 21 21:47 ib_logfile1
-rw-rw---- 1 mysql mysql 18221476 4月 21 23:37 jfedu-zhu2.000004
srwxrwxrwx 1 mysql mysql 0 4月 22 2021 mysql.sock
-rw-rw---- 1 mysql mysql 100 4月 22 2021 jfedu-zhu2.index
-rw-rw---- 1 mysql mysql 580 4月 22 2021 mariadb-relay-bin.000024
-rw-rw---- 1 mysql mysql 54 4月 22 2021 mariadb-relay-bin.index
-rw-rw---- 1 mysql mysql 446 4月 22 2021 jfedu-zhu2.000005
- 2) Together with the MySQL configuration file, this confirms that the binlog file is jfedu-zhu2.000005
- 3) Use grep to find the binlog position:
[root@node2 ~]# grep MASTER_LOG_FILE /data/mysql_all.sql |grep jfedu-zhu2.000005
-- CHANGE MASTER TO MASTER_LOG_FILE='jfedu-zhu2.000005', MASTER_LOG_POS=245;
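The grep above can be taken one step further. The following is a minimal sketch that extracts the file name and position from the dump and prints the coordinate part of the statement used in step 5. The function names dump_coords and coords_to_sql are hypothetical, and the sketch assumes the dump contains the line in exactly the format shown above.

```shell
#!/bin/sh
# dump_coords: hypothetical helper. Prints "<binlog file> <position>" from
# the first CHANGE MASTER TO line found in the dump file given as $1.
dump_coords() {
    sed -n "s/.*CHANGE MASTER TO MASTER_LOG_FILE='\([^']*\)', MASTER_LOG_POS=\([0-9][0-9]*\);.*/\1 \2/p" "$1" | head -n 1
}

# coords_to_sql: hypothetical helper. Prints a change master statement that
# sets only the binlog coordinates; host, user and password are left out.
coords_to_sql() {
    dump_coords "$1" | while read -r file pos; do
        printf "change master to master_log_file='%s', master_log_pos=%s;\n" "$file" "$pos"
    done
}
```

For the dump from step 1 it would be called as coords_to_sql /data/mysql_all.sql. The printed statement should be reviewed by hand before it is run on the replica.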
- 4) Stop the slave threads on the replica
MariaDB [test]> stop slave
-> ;
Query OK, 0 rows affected (0.00 sec)
- 5) Update the binlog position on the replica:
MariaDB [test]> change master to master_host="192.168.10.135"
-> ,master_user="jfedu-cong2",
-> master_password="cui0116",
-> master_log_file="jfedu-zhu2.000005",
-> master_log_pos=245;
- 6) Start the slave and check that its status is back to normal:
MariaDB [test]> start slave;
Query OK, 0 rows affected (0.00 sec)
MariaDB [test]> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.135
Master_User: jfedu-cong2
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: jfedu-zhu2.000005
Read_Master_Log_Pos: 446
Relay_Log_File: mariadb-relay-bin.000002
Relay_Log_Pos: 731
Relay_Master_Log_File: jfedu-zhu2.000005
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 446
Relay_Log_Space: 1027
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
1 row in set (0.00 sec)
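- 7) Confirm that incremental replication ran after the full import, i.e. that the updates made on the master after the dump were replicated as well:
On the replica: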
MariaDB [jfedu_tb]> select * from jfedu;
+------+----------+
| id | name |
+------+----------+
| 3 | xiaoming |
| 12 | |
| 23 | |
| 11 | xiaohong |
| 11 | xiaohong |
| 33 | ming |
| 232 | sds |
| 232 | sds |
+------+----------+
The second 232, sds row is the update made on the master after the dump, which shows that master-slave replication is working normally again.
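Besides comparing rows by hand, the status output from step 6 can be checked mechanically. This is a minimal sketch: it compares the coordinates the IO thread has read (Master_Log_File, Read_Master_Log_Pos) with those the SQL thread has executed (Relay_Master_Log_File, Exec_Master_Log_Pos). The function name replica_caught_up is hypothetical. A match only means the replica has applied everything it has fetched so far, not that the master has no newer events.

```shell
#!/bin/sh
# replica_caught_up: hypothetical helper, not part of MariaDB.
# Reads "show slave status \G" text on stdin and succeeds only when the
# executed binlog coordinates equal the coordinates already read.
replica_caught_up() {
    awk -F': ' '
        # Strip leading spaces so padded field names also match.
        { sub(/^[ \t]+/, "", $1) }
        $1 == "Master_Log_File"       { read_file = $2 }
        $1 == "Relay_Master_Log_File" { exec_file = $2 }
        $1 == "Read_Master_Log_Pos"   { read_pos = $2 }
        $1 == "Exec_Master_Log_Pos"   { exec_pos = $2 }
        END {
            if (read_file != "" && read_file == exec_file && read_pos == exec_pos) {
                print "caught up at " exec_file ":" exec_pos
                exit 0
            }
            print "behind: read " read_file ":" read_pos ", executed " exec_file ":" exec_pos
            exit 1
        }
    '
}
```

With the broken status from the Background section the two file names differ (000005 read, 000004 executed), so the check fails. With the status from step 6 both sides are jfedu-zhu2.000005 at position 446.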