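Database architecture: one master, two slaves
master: 192.168.0.150
slave1: 192.168.0.104
slave2: 192.168.0.188
manager: 192.168.0.132
Turn off the firewall on all four servers (stop turns it off now, disable keeps it off after a reboot)
[root@localhost ~]# systemctl stop firewalld && systemctl disable firewalld
MHA package download: download link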
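The Manager toolkit mainly includes the following tools:
masterha_check_ssh checks the MHA SSH configuration
masterha_check_repl checks the MySQL replication status
masterha_manager starts MHA
masterha_check_status reports the current MHA running state
masterha_master_monitor detects whether the master is down
masterha_master_switch controls failover (automatic or manual)
masterha_conf_host adds or removes configured server entries
The Node toolkit (these tools are normally triggered by MHA Manager scripts and need no manual operation) mainly includes the following tools:
save_binary_logs saves and copies the master's binary logs
apply_diff_relay_logs identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog strips unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs purges relay logs (without blocking the SQL thread)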
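I. Configure MySQL 5.7 master-slave replication based on GTID
1. Add the following to the configuration file. It is the same on all three servers; only server-id needs to change.
Note: the binlog-do-db and replicate-ignore-db settings must be identical on all servers. MHA checks the filtering rules at startup; if they differ, MHA does not start monitoring or failover. No filters are set here.
server-id=266
log-bin=mysql-bin
gtid_mode = on
# Enable GTID; it must be on for the master and all slaves
enforce_gtid_consistency = 1
log_slave_updates = 1
# Enable semi-synchronous replication; otherwise a primary key error is reported during automatic master/slave switchover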
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
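Since server-id is the only value that differs between the three servers, one way to keep it unique is to derive it from the host's IP address. The sketch below is only an illustration: the function names are hypothetical, and it assumes every server sits in the same /24 subnet so that the last octet is unique (the article itself simply uses 266).

```shell
# Hypothetical helper: derive a server-id from the last octet of an IPv4 address.
# Assumption: all servers share one /24 subnet, so the last octets never collide.
server_id_from_ip() {
    # Strip everything up to and including the last dot.
    printf '%s\n' "${1##*.}"
}

# Print the per-server line to paste into the MySQL configuration file.
print_server_id_line() {
    printf 'server-id=%s\n' "$(server_id_from_ip "$1")"
}

print_server_id_line 192.168.0.150   # prints: server-id=150
```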
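2. Create the replication user and the monitoring user; both must be added on all three servers
Add the replication user
grant replication slave on *.* to 'repl'@'192.168.0.%' identified by '123456';
// creates the repl user
Add the monitoring user
grant all privileges on *.* to 'root'@'192.168.0.%' identified by '123456';
// creates a root user that MHA uses for monitoring
Run on slave1 and slave2
change master to master_host='192.168.0.150',master_user='repl',master_password='123456',master_auto_position=1;
start slave;
# Make the slaves read-only. Set it at runtime rather than in the configuration file. This is important: if data is accidentally written to a slave, you will regret it.
set global read_only=1;
On slave1 and slave2, run in mysql: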
mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.150
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000017
Read_Master_Log_Pos: 1225
Relay_Log_File: localhost-relay-bin.000002
Relay_Log_Pos: 679
Relay_Master_Log_File: mysql-bin.000017
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
When Slave_IO_Running and Slave_SQL_Running both show Yes, master-slave replication is configured successfully
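The same check can be scripted instead of read by eye. The following is a minimal sketch, not part of MHA: replication_ok is a hypothetical helper that reads the status output on stdin, and the usage line assumes the mysql client can log in without prompting (for example through ~/.my.cnf).

```shell
# Hypothetical helper: read "show slave status \G" output on stdin and
# succeed only if both replication threads report Yes.
replication_ok() {
    slave_status=$(cat)
    printf '%s\n' "$slave_status" |
        grep -Eq '^[[:space:]]*Slave_IO_Running:[[:space:]]*Yes' &&
    printf '%s\n' "$slave_status" |
        grep -Eq '^[[:space:]]*Slave_SQL_Running:[[:space:]]*Yes'
}

# Usage (assumes password-less client login on the slave):
#   mysql -e 'show slave status\G' | replication_ok && echo "replication OK"
```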
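II. Install MHA
1. To avoid obscure errors, install the dependency packages on all four virtual machines
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN
Install node (on the three MySQL servers; the manager host also needs the node package before the manager package is installed)
tar -zxvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58
perl Makefile.PL
make && make install
perl Makefile.PL may fail with an error like this: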
[root@localhost mha4mysql-node-0.58]# perl Makefile.PL;make;make install
Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Makefile.pm line 4.
BEGIN failed--compilation aborted at inc/Module/Install/Makefile.pm line 4.
Compilation failed in require at inc/Module/Install.pm line 307.
Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Can.pm line 6.
BEGIN failed--compilation aborted at inc/Module/Install/Can.pm line 6.
Compilation failed in require at inc/Module/Install.pm line 307.
Can't locate ExtUtils/MM_Unix.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Metadata.pm line 316.
make: *** No targets specified and no makefile found.  Stop.
make: *** No rule to make target 'install'.  Stop.
If you hit this error, the fix is:
yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
Then run the installation again
After installation, the corresponding scripts are generated under /usr/local/bin
[root@localhost bin]# ll
total 48
-r-xr-xr-x. 1 root root 17639 Aug 30 10:36 apply_diff_relay_logs
-r-xr-xr-x. 1 root root 4807 Aug 30 10:36 filter_mysqlbinlog
-r-xr-xr-x. 1 root root 8337 Aug 30 10:36 purge_relay_logs
-r-xr-xr-x. 1 root root 7525 Aug 30 10:36 save_binary_logs
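Rather than comparing the listing by eye on every server, a small check can report which of the four node tools are missing. This is a sketch under assumptions: missing_node_tools is a hypothetical helper, and /usr/local/bin is taken as the install directory because that is where the listing above shows the scripts.

```shell
# Hypothetical helper: print the MHA Node tools that are missing (or not
# executable) in a directory. The tool names are the four from the listing above.
missing_node_tools() {
    dir=${1:-/usr/local/bin}
    for tool in apply_diff_relay_logs filter_mysqlbinlog purge_relay_logs save_binary_logs; do
        [ -x "$dir/$tool" ] || printf '%s\n' "$tool"
    done
}

# Prints nothing when every tool is present and executable.
missing_node_tools /usr/local/bin
```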
Install the MHA manager package on the manager virtual machine only
tar xf mha4mysql-manager-0.58.tar.gz
cd mha4mysql-manager-0.58
perl Makefile.PL
make && make install
After installation, the corresponding scripts are generated under /usr/local/bin
[root@localhost bin]# ll
total 88
-r-xr-xr-x. 1 root root 17639 Aug 30 10:56 apply_diff_relay_logs
-r-xr-xr-x. 1 root root 4807 Aug 30 10:56 filter_mysqlbinlog
-r-xr-xr-x. 1 root root 1995 Aug 30 11:02 masterha_check_repl
-r-xr-xr-x. 1 root root 1779 Aug 30 11:02 masterha_check_ssh
-r-xr-xr-x. 1 root root 1865 Aug 30 11:02 masterha_check_status
-r-xr-xr-x. 1 root root 3201 Aug 30 11:02 masterha_conf_host
-r-xr-xr-x. 1 root root 2517 Aug 30 11:02 masterha_manager
-r-xr-xr-x. 1 root root 2165 Aug 30 11:02 masterha_master_monitor
-r-xr-xr-x. 1 root root 2373 Aug 30 11:02 masterha_master_switch
-r-xr-xr-x. 1 root root 5172 Aug 30 11:02 masterha_secondary_check
-r-xr-xr-x. 1 root root 1739 Aug 30 11:02 masterha_stop
-r-xr-xr-x. 1 root root 8337 Aug 30 10:56 purge_relay_logs
-r-xr-xr-x. 1 root root 7525 Aug 30 10:56 save_binary_logs
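Copy the related scripts to /usr/local/bin (they are included in the unpacked package). This is optional: the scripts are incomplete and must be edited, because the developers left them for us to adapt. If you enable any of the parameters below without editing the matching script, MHA throws an error. I was bitten badly by this.
[root@localhost scripts]# cp * /usr/local/bin/
master_ip_failover # manages the VIP during automatic failover. Optional: with keepalived you can write your own script to manage the VIP, for example monitor mysql and stop keepalived when mysql fails, so that the VIP floats automatically
master_ip_online_change # manages the VIP during online switchover. Optional: a simple shell script of your own also works
power_manager # shuts down the host after a failure. Optional
send_report # sends an alert after a failover. Optional: a simple shell script of your own also works.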
3. Configure password-less SSH login
Set up key-based authentication on all four servers
ssh-keygen
ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
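Because the key has to be copied from every server to every other server, it can help to generate the commands from one host list. The sketch below only prints the commands (a dry run). It assumes the root login user and the four addresses listed at the top of this article; print_copy_id_commands is a hypothetical name.

```shell
# Assumption: the four hosts from the architecture list, all reached as root.
HOSTS="192.168.0.150 192.168.0.104 192.168.0.188 192.168.0.132"

# Hypothetical helper: print one ssh-copy-id command per host.
# It only prints; pipe the output to sh to actually run the commands.
print_copy_id_commands() {
    for host in $HOSTS; do
        printf 'ssh-copy-id -i /root/.ssh/id_rsa.pub root@%s\n' "$host"
    done
}

print_copy_id_commands
```

Running it on each server after ssh-keygen gives the full set of key exchanges; copying the key to the local host as well is harmless.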
4. Add and edit the MHA configuration file
Do this on the manager virtual machine
Create the configuration file
mkdir -p /etc/masterha
cp mha4mysql-manager-0.58/samples/conf/app1.cnf /etc/masterha/
Edit app1.cnf
[root@localhost bin]# vi /etc/masterha/app1.cnf
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1.log
# Where the master stores its binlogs, so that MHA can find them; MySQL 5.7 installed via yum keeps its data directory at /var/lib/mysql
master_binlog_dir=/var/lib/mysql
# Script invoked on automatic failover (the master_ip_failover script mentioned above)
master_ip_failover_script=/usr/local/bin/master_ip_failover
# Script invoked on manual (online) switchover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
# Monitoring user
user=root
# Password of the MySQL root user, i.e. the monitoring user created earlier
password=123456
# Interval in seconds between pings to the master (default 3); failover starts automatically after three pings with no response
ping_interval=1
remote_workdir=/tmp
# Replication user password
repl_password=123456
# Replication user
repl_user=repl
# Script that sends an alert after a switchover
report_script=/usr/local/bin/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.0.104 -s 192.168.0.188 -s 192.168.0.150
shutdown_script=""
# SSH login user
ssh_user=root
[server1]
candidate_master=1
hostname=192.168.0.150
port=3306
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.0.104
port=3306
[server3]
hostname=192.168.0.188
port=3306
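The per-server sections follow a fixed pattern, so they can be generated from a host list. This is a minimal sketch with a hypothetical function name; it assumes port 3306 everywhere, as in the config above, and leaves optional lines such as candidate_master and check_repl_delay to be added by hand.

```shell
# Hypothetical helper: print numbered [serverN] sections for the given hosts.
# Assumption: every server listens on port 3306.
print_server_sections() {
    n=1
    for host in "$@"; do
        printf '[server%s]\nhostname=%s\nport=3306\n' "$n" "$host"
        n=$((n + 1))
    done
}

print_server_sections 192.168.0.150 192.168.0.104 192.168.0.188
```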
5. Test connectivity
Test the SSH connections
[root@localhost bin]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Wed Aug 29 16:12:44 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Aug 29 16:12:44 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed Aug 29 16:12:44 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed Aug 29 16:12:44 2018 - [info] Starting SSH connection tests..
Wed Aug 29 16:12:45 2018 - [debug]
Wed Aug 29 16:12:44 2018 - [debug] Connecting via SSH from root@192.168.0.150(192.168.0.150:22) to root@192.168.0.104(192.168.0.104:22)..
Wed Aug 29 16:12:45 2018 - [debug] ok.
Wed Aug 29 16:12:45 2018 - [debug] Connecting via SSH from root@192.168.0.150(192.168.0.150:22) to root@192.168.0.188(192.168.0.188:22)..
Wed Aug 29 16:12:45 2018 - [debug] ok.
Wed Aug 29 16:12:46 2018 - [debug]
Wed Aug 29 16:12:45 2018 - [debug] Connecting via SSH from root@192.168.0.104(192.168.0.104:22) to root@192.168.0.150(192.168.0.150:22)..
Wed Aug 29 16:12:45 2018 - [debug] ok.
Wed Aug 29 16:12:45 2018 - [debug] Connecting via SSH from root@192.168.0.104(192.168.0.104:22) to root@192.168.0.188(192.168.0.188:22)..
Wed Aug 29 16:12:46 2018 - [debug] ok.
Wed Aug 29 16:12:47 2018 - [debug]
Wed Aug 29 16:12:45 2018 - [debug] Connecting via SSH from root@192.168.0.188(192.168.0.188:22) to root@192.168.0.150(192.168.0.150:22)..
Wed Aug 29 16:12:46 2018 - [debug] ok.
Wed Aug 29 16:12:46 2018 - [debug] Connecting via SSH from root@192.168.0.188(192.168.0.188:22) to root@192.168.0.104(192.168.0.104:22)..
Wed Aug 29 16:12:46 2018 - [debug] ok.
Wed Aug 29 16:12:47 2018 - [info] All SSH connection tests passed successfully.
When "successfully" appears, password-less SSH login has been verified
Possible problem
[root@localhost bin]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Aug 30 13:46:55 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
/etc/masterha/app1.cnf:
at /usr/local/share/perl5/MHA/SSHCheck.pm line 148.
Solution
In the end it turned out that the configuration file was written incorrectly
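A quick lint pass over the file can catch this kind of mistake before running the MHA checks. The sketch below is an assumption-laden illustration, not an MHA tool: lint_mha_conf is a hypothetical helper, and it assumes that only whole lines starting with # or ; count as comments, so a trailing // note after a value is reported as suspicious.

```shell
# Hypothetical lint: print every line that is not blank, not a whole-line
# comment (# or ;), not a [section] header and not a clean key=value pair.
# Lines containing "//" are also flagged, on the assumption that such text
# is a stray note that the parser would read as part of the value.
lint_mha_conf() {
    awk '
        /^[ \t]*$/             { next }   # blank line
        /^[ \t]*[#;]/          { next }   # whole-line comment
        /^[ \t]*\[.*\][ \t]*$/ { next }   # [section] header
        /\/\// || !/^[ \t]*[A-Za-z_0-9]+[ \t]*=/ {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Usage: lint_mha_conf /etc/masterha/app1.cnf && echo "config looks well-formed"
```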
Test the MySQL cluster connections
[root@localhost bin]# masterha_check_repl --conf=/etc/masterha/app1.cnf
...........
...........
192.168.0.150(192.168.0.150:3306) (current master)
+--192.168.0.104(192.168.0.104:3306)
+--192.168.0.188(192.168.0.188:3306)
Wed Aug 29 16:14:23 2018 - [info] Checking replication health on 192.168.0.104..
Wed Aug 29 16:14:23 2018 - [info] ok.
Wed Aug 29 16:14:23 2018 - [info] Checking replication health on 192.168.0.188..
Wed Aug 29 16:14:23 2018 - [info] ok.
Wed Aug 29 16:14:23 2018 - [info] Checking master_ip_failover_script status:
Wed Aug 29 16:14:23 2018 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.0.150 --orig_master_ip=192.168.0.150 --orig_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig enp0s3:1 down==/sbin/ifconfig enp0s3:1 192.168.0.88/24===
...........
The MySQL cluster connection check succeeded
Possible problem
Thu Aug 30 14:27:08 2018 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln492] Server 192.168.0.180(192.168.0.180:3306) is dead, but must be alive! Check server settings.
Thu Aug 30 14:27:08 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/local/share/perl5/MHA/MasterMonitor.pm line 402.
Thu Aug 30 14:27:08 2018 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Aug 30 14:27:08 2018 - [info] Got exit code 1 (Not master dead).
Solution
// Turn off the firewall
Start MHA
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
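--remove_dead_master_conf: after a master/slave switchover, the old master's entry is removed from the configuration file.
--manager_log: where the log is stored.
--ignore_last_failover: by default, if MHA detects consecutive outages less than 8 hours apart, it does not fail over; this restriction avoids a ping-pong effect. The flag tells MHA to ignore the file left behind by the previous failover. By default, after a switchover MHA creates app1.failover.complete in the manager working directory (manager_workdir set above); if that file still exists the next time a switchover is needed, the switchover is refused unless the file was deleted manually after the first one. For convenience, --ignore_last_failover is set here.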
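If you would rather not pass --ignore_last_failover, you can check for a recent marker file yourself before starting the manager. The following is a sketch only: recent_failover_marker is a hypothetical helper, and it assumes the marker's modification time reflects when the last failover finished. It relies on the -maxdepth and -mmin options of GNU or BSD find.

```shell
# Hypothetical helper: succeed if a *.failover.complete marker younger than
# 8 hours (480 minutes) exists directly in the given working directory.
recent_failover_marker() {
    [ -n "$(find "$1" -maxdepth 1 -name '*.failover.complete' -mmin -480 2>/dev/null)" ]
}

# Usage (the directory is the manager_workdir value from app1.cnf):
#   recent_failover_marker /var/log/masterha/app1.log && echo "failover marker present"
```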
6. Check the MHA running state
masterha_check_status --conf=/etc/masterha/app1.cnf
[root@localhost bin]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:3650) is running(0:PING_OK), master:192.168.0.150
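For scripts that need to know which server is currently the master, the address can be pulled out of this status line. This is a minimal sketch: current_master is a hypothetical helper, and it assumes the output keeps the "master:<address>" form shown above.

```shell
# Hypothetical helper: read masterha_check_status output on stdin and print
# the address that follows "master:". Prints nothing if MHA is not running.
current_master() {
    sed -n 's/.*master:\([0-9.]*\).*/\1/p'
}

# Usage:
#   masterha_check_status --conf=/etc/masterha/app1.cnf | current_master
```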
Stop MHA
masterha_stop --conf=/etc/masterha/app1.cnf
7. Configure MHA
Edit /usr/local/bin/master_ip_failover; here a script manages the VIP
Copy all of the following code into it and adjust it to your own environment
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);
my $vip = '192.168.0.88/24';    # the virtual IP you want to use
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig enp0s3:$key $vip";    # change enp0s3 to your NIC name
my $ssh_stop_vip  = "/sbin/ifconfig enp0s3:$key down";
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);
exit &main();
sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
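On systems where ifconfig is not installed, the two VIP commands in the script can be expressed with iproute2 instead. The sketch below only prints the equivalent commands so they can be reviewed first. vip_command is a hypothetical helper, and the VIP, interface name and label key are the values used in the script above, which you would change to match your environment.

```shell
# Assumptions: same VIP, NIC name and label key as in master_ip_failover.
VIP="192.168.0.88/24"
IFACE="enp0s3"
KEY="1"

# Hypothetical helper: print the iproute2 command that adds or removes the VIP.
vip_command() {
    case "$1" in
        start) printf 'ip addr add %s dev %s label %s:%s\n' "$VIP" "$IFACE" "$IFACE" "$KEY" ;;
        stop)  printf 'ip addr del %s dev %s\n' "$VIP" "$IFACE" ;;
        *)     printf 'usage: vip_command start|stop\n' >&2; return 1 ;;
    esac
}

vip_command start   # prints: ip addr add 192.168.0.88/24 dev enp0s3 label enp0s3:1
```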
Bind the VIP manually on the current master the first time
[root@localhost]# ifconfig enp0s3:1 192.168.0.88/24
Check that the VIP was added successfully
[root@localhost ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:d4:a5:47 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.150/16 brd 192.168.255.255 scope global enp0s3
valid_lft forever preferred_lft forever
inet 192.168.0.88/24 brd 192.168.3.255 scope global enp0s3:1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fed4:a547/64 scope link
valid_lft forever preferred_lft forever
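The same verification can be done in a script, for example after a failover. This is a minimal sketch with a hypothetical helper name; it assumes the address appears in the "inet <address>/<prefix>" form shown in the output above.

```shell
# Hypothetical helper: read "ip addr" output on stdin and succeed if the
# given address is bound. -F matches the text literally, so dots are not wildcards.
has_vip() {
    grep -Fq "inet ${1}/"
}

# Usage:
#   ip addr | has_vip 192.168.0.88 && echo "VIP is up"
```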
8. Test MHA
Stop mysql on the master
[root@localhost ~]# systemctl stop mysqld
On the manager, run
[root@localhost app1]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
[1]+ Done nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1
// Monitoring has stopped (masterha_manager exits once the failover completes), so start it again
[root@localhost app1]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 7332
[root@localhost app1]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:7332) is running(0:PING_OK), master:192.168.0.104
// The master role has now moved to the former slave (192.168.0.104)
Testing reference: https://www.cnblogs.com/rayment/p/7355093.html
Reference: https://blog.csdn.net/qq_34605594/article/details/77387872?locationNum=4&fps=1