目录
环境介绍
一,主库执行delete,从库没有该数据
模拟故障
修复故障
二,主库执行insert,从库已存在该数据
模拟故障
故障恢复
三,主库执行update,从库没有该数据
模拟故障
故障恢复
四,从库relay log文件损坏/缺失
故障模拟
故障恢复
环境介绍
数据库1主2从,mysql版本是v5.19
表结构
一,主库执行delete,从库没有该数据
模拟故障
1.1, 在slave1中奖id = 1的数据delete from hero2 where id = 1
1.2, 在master上执行该数据的delete from hero2 where id = 1
1.3, slave1 上show slave status\G查看
delete from hero2 where id = 1;
master上执行后
mysql> show slave status\G
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.10.129Master_User: replMaster_Port: 3358Connect_Retry: 60Master_Log_File: mysql-bin.000038Read_Master_Log_Pos: 463Relay_Log_File: mha2-relay-bin.000020Relay_Log_Pos: 313Relay_Master_Log_File: mysql-bin.000038Slave_IO_Running: YesSlave_SQL_Running: NoReplicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1032Last_Error: Could not execute Delete_rows event on table test.hero2; Can't find record in 'hero2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000038, end_log_pos 432
修复故障
由于master上删一条记录,而slave上找不到,那么slave上直接跳过即可;
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)mysql> set global sql_slave_skip_counter = 1;
Query OK, 0 rows affected (0.00 sec)mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
或者在my.cnf文件中添加slave-skip-errors = 1032 并重启生效,可以跳过所有这种情况的同步错误
二,主库执行insert,从库已存在该数据
模拟故障
2.1 slave 上insert id = 5的数据
2.2 master 上 insert id = 5的数据
2.3 从库上执行show slave status查看同步失败
主从同步异常报错信息
ysql> show slave status\G
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.10.129Master_User: replMaster_Port: 3358Connect_Retry: 60Master_Log_File: mysql-bin.000039Read_Master_Log_Pos: 469Relay_Log_File: mha2-relay-bin.000023Relay_Log_Pos: 360Relay_Master_Log_File: mysql-bin.000039Slave_IO_Running: YesSlave_SQL_Running: NoReplicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1062Last_Error: Could not execute Write_rows event on table test.hero2; Duplicate entry '5' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000039, end_log_pos 438
故障恢复
从库已经有该记录,导致这条记录无法顺利插入,从而导致主从同步异常;这种情况就需要在从库上将多余的记录给delete。
报错信息:Write_rows event on table test.hero2; Duplicate entry '5' for key 'PRIMARY'
报错日志中显示了表名和主键id,所以在slave上执行下面语句:
delete from hero2 where id = 5;
stop slave; start slave;
三,主库执行update,从库没有该数据
模拟故障
3.1 slave上将id = 2的数据delete
mysql> delete from hero2 where id = 2;
Query OK, 1 row affected (0.00 sec)
3.2 master上执行update hero2 set age = 200 where id = 2;
mysql> update hero2 set age = 200 where id = 2;
Query OK, 1 row affected (0.04 sec)
Rows matched: 1 Changed: 1 Warnings: 0
3.3 slave上查看主从同步状态
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.10.129Master_User: replMaster_Port: 3358Connect_Retry: 60Master_Log_File: mysql-bin.000039Read_Master_Log_Pos: 757Relay_Log_File: mha2-relay-bin.000024Relay_Log_Pos: 360Relay_Master_Log_File: mysql-bin.000039Slave_IO_Running: YesSlave_SQL_Running: NoReplicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1032Last_Error: Could not execute Update_rows event on table test.hero2; Can't find record in 'hero2', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000039, end_log_pos 726
故障恢复
根据报错信息中的mysql-bin.000039, end_log_pos 726,在主库上查看binlog
[root@mha1 mysql]# mysqlbinlog --no-defaults -vv --base64-output=decode-rows mysql-bin.000039 | grep -A 20 'end_log_pos 726'
#241221 11:27:47 server id 10129 end_log_pos 726 CRC32 0xb8c2e5cc Update_rows: table id 221 flags: STMT_END_F
### UPDATE `test`.`hero2`
### WHERE
### @1=2 /* INT meta=0 nullable=0 is_null=0 */
### @2='dragon' /* VARSTRING(60) meta=60 nullable=0 is_null=0 */
### @3=1001 /* INT meta=0 nullable=0 is_null=0 */
### SET
### @1=2 /* INT meta=0 nullable=0 is_null=0 */
### @2='dragon' /* VARSTRING(60) meta=60 nullable=0 is_null=0 */
### @3=200 /* INT meta=0 nullable=0 is_null=0 */
# at 726
#241221 11:27:47 server id 10129 end_log_pos 757 CRC32 0x3db3f01e Xid = 60
COMMIT/*!*/;
将变更前的数据整理为insert,并插入slave中:然后stop slave; start slave即可;
mysql> insert into `test`.`hero2` values(2,'dragon',1001);
Query OK, 1 row affected (0.05 sec)mysql> stop slave; start slave;
Query OK, 0 rows affected (0.00 sec)Query OK, 0 rows affected (0.00 sec)mysql> show slave status\G
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.10.129Master_User: replMaster_Port: 3358Connect_Retry: 60Master_Log_File: mysql-bin.000039Read_Master_Log_Pos: 757Relay_Log_File: mha2-relay-bin.000025Relay_Log_Pos: 360Relay_Master_Log_File: mysql-bin.000039Slave_IO_Running: YesSlave_SQL_Running: Yes
四,从库relay log文件损坏/缺失
故障模拟
4.1 slave上执行stop slave sql_thread;
4.2 master上执行dml操作,从而产生binlog
4.3 slave上删除relay-log
[root@mha2 mysql]# ls mha2-relay-bin.*
mha2-relay-bin.000030 mha2-relay-bin.000032 mha2-relay-bin.000034 mha2-relay-bin.000036 mha2-relay-bin.000038
mha2-relay-bin.000031 mha2-relay-bin.000033 mha2-relay-bin.000035 mha2-relay-bin.000037 mha2-relay-bin.index
[root@mha2 mysql]# \rm mha2-relay-bin.*
[root@mha2 mysql]# ls mha2-relay-bin.*
4.4 stat slave sql_thread; 并查看主从状态,报错1593
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************Slave_IO_State: Waiting for master to send eventMaster_Host: 192.168.10.129Master_User: replMaster_Port: 3358Connect_Retry: 60Master_Log_File: mysql-bin.000042Read_Master_Log_Pos: 194Relay_Log_File: mha2-relay-bin.000031Relay_Log_Pos: 360Relay_Master_Log_File: mysql-bin.000039Slave_IO_Running: YesSlave_SQL_Running: NoReplicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1593Last_Error: Error initializing relay log position: Could not open log fileSkip_Counter: 0Exec_Master_Log_Pos: 1563Relay_Log_Space: 32126695Until_Condition: None
4.5 查看主库数据
mysql> select count(1) from hero2;
+----------+
| count(1) |
+----------+
| 1835008 |
+----------+
1 row in set (0.59 sec)
查看从库数据
mysql> select count(1) from hero2;
+----------+
| count(1) |
+----------+
| 11 |
+----------+
1 row in set (0.00 sec)
此时许多数据在relay-log里面丢失了
故障恢复
根据Relay_Master_Log_File: mysql-bin.000039 和Exec_Master_Log_Pos: 1563
来重新搭建主从关系
[root@mha2 server]# sh install_mysqlv57.sh -m 192.168.10.129 mysql-bin.000039 1563
Shutting down MySQL.... SUCCESS!
Starting MySQL. SUCCESS! slave info:
*************************** 1. row ***************************Slave_IO_State: Queueing master event to the relay logMaster_Host: 192.168.10.129Master_User: replMaster_Port: 3358Connect_Retry: 60Master_Log_File: mysql-bin.000039Read_Master_Log_Pos: 17543128Relay_Log_File: mha2-relay-bin.000002Relay_Log_Pos: 10045Relay_Master_Log_File: mysql-bin.000039Slave_IO_Running: YesSlave_SQL_Running: YesReplicate_Do_DB:
待Seconds_Behind_Master = 0后查看表的数据量确认和主库保持一致,恢复完成
mysql> select count(1) from hero2;
+----------+
| count(1) |
+----------+
| 1835008 |
+----------+
1 row in set (0.59 sec)mysql>