MySQL5.7 MHA+MaxScale2.0构建高可用环境

MySQL读写分离与负载均衡--MHA与MaxScale
环境介绍
Part1:写在最前
看了某大牛的文章,讲述了一下MaxScale比LVS的好处多多,那您倒是放出来配置文件啊~~大牛说:
需要的单独找我吧,太长了配置文件……
看到这我心中久久不能平静啊。。。联系不上您呐 = =,于是各种资料各种找啊~各种坑各种血崩啊!~~~
由于不知道大牛的配置文件是什么样子,本文仅以随笔的形式,记录下实施过程。也欢迎您和我探讨您在实施MaxScale时遇到的各种问题和心得体会。

Part2:环境
MySQL5.7 MHA + MaxScale2.0
192.168.1.248 HE1 slave1
192.168.1.249 HE2 slave2
192.168.1.250 HE3 master
192.168.1.251 HE4 MHA-manager

Part3:MHA
MHA的优点不作赘述,看下原理图吧

从宕机崩溃的Master保存二进制日志事件(binlogevent)
识别含有最新更新的Slave
应用差异的中继日志(relaylog)到其他Slave
应用从Master保存的二进制日志事件
提升一个Slave为新的Master
使其他的Slave连接新的Master进行复制

构建MySQL5.7MHA
Part1:写在最前
MHA的部署不是本文的叙述重点,网上比比皆是。这里只记录下MySQL5.7的MHA搭建时的一些坑

Part2:坑
①mha4mysql-manager-0.57.tar.gz
②mha4mysql-node-0.57.tar.gz
Warning:警告这两个包首先你要搞到,虽然说0.56什么的不代表支持的mysql版本,但经过测试,想要在MySQL5.7上部署MHA,少走坑,请用0.57的。

Part3:安装包的位置
请在所有的节点包括Manager节点安装好你的mha4mysql-node-0.57.tar.gz,以保证后期在管理节点执行perl Makefile.PL的时候,你能如愿以偿的看到如下信息
[root@HE4 mha4mysql-manager-0.57]# perl Makefile.PL 
*** Module::AutoInstall version 1.06
*** Checking for Perl dependencies...
[Core Features]
- DBI                  ...loaded. (1.609)
- DBD::mysql            ...loaded. (4.013)
- Time::HiRes          ...loaded. (1.9721)
- Config::Tiny          ...loaded. (2.12)
- Log::Dispatch        ...loaded. (2.26)
- Parallel::ForkManager ...loaded. (0.7.5)
- MHA::NodeConst        ...loaded. (0.57)
*** Module::AutoInstall configuration finished.
Writing Makefile for mha4mysql::manager

Part4:一些常见错误记录
如果遇到
①这样
[root@HE2 bin]# masterha_check_repl --conf=/etc/mha/mha.conf 
Tue Apr  5 22:09:32 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr  5 22:09:32 2016 - [info] Reading application default configuration from /etc/mha/mha.conf..
Tue Apr  5 22:09:32 2016 - [info] Reading server configuration from /etc/mha/mha.conf..
Tue Apr  5 22:09:32 2016 - [info] MHA::MasterMonitor version 0.57.
Tue Apr  5 22:09:32 2016 - [error][/usr/local/lib64/perl5/MHA/ServerManager.pm, ln188] There is no alive server. We can't do failover
Tue Apr  5 22:09:32 2016 - [error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/local/lib64/perl5/MHA/MasterMonitor.pm line 326
Tue Apr  5 22:09:32 2016 - [error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Tue Apr  5 22:09:32 2016 - [info] Got exit code 1 (Not master dead).

解决方案
用的不是默认端口3306,请修改你的配置文件

②这样
[root@HE4 ~]#
masterha_check_repl --conf=/etc/mha/mha.conf
Tue Apr  5 22:36:33 2016 - [warning] Global
configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr  5 22:36:33 2016 - [info] Reading application
default configuration from /etc/mha/mha.conf..
Tue Apr  5 22:36:33 2016 - [info] Reading server
configuration from /etc/mha/mha.conf..
Tue Apr  5 22:36:33 2016 - [info] MHA::MasterMonitor
version 0.57.
Tue Apr  5 22:36:34 2016 - [info] GTID failover mode =
0
Tue Apr  5 22:36:34 2016 - [info] Dead Servers:
Tue Apr  5 22:36:34 2016 - [info] Alive Servers:
Tue Apr  5 22:36:34 2016 - [info]  192.168.1.250(192.168.1.250:4008)
Tue Apr  5 22:36:34 2016 - [info]  192.168.1.248(192.168.1.248:4008)
Tue Apr  5 22:36:34 2016 - [info]  192.168.1.249(192.168.1.249:4008)
Tue Apr  5 22:36:34 2016 - [info] Alive Slaves:
Tue Apr  5 22:36:34 2016 - [info]  192.168.1.248(192.168.1.248:4008)  Version=5.6.16-log (oldest major version
between slaves) log-bin:enabled
Tue Apr  5 22:36:34 2016 - [info]    Replicating from
192.168.1.250(192.168.1.250:4008)
Tue Apr  5 22:36:34 2016 - [info]    Primary candidate for the new Master
(candidate_master is set)
Tue Apr  5 22:36:34 2016 - [info]  192.168.1.249(192.168.1.249:4008)  Version=5.6.16-log (oldest major version
between slaves) log-bin:enabled
Tue Apr  5 22:36:34 2016 - [info]    Replicating from
192.168.1.250(192.168.1.250:4008)
Tue Apr  5 22:36:34 2016 - [info]    Not candidate for the new Master
(no_master is set)
Tue Apr  5 22:36:34 2016 - [info] Current Alive
Master: 192.168.1.250(192.168.1.250:4008)
Tue Apr  5 22:36:34 2016 - [info] Checking slave
configurations..
Tue Apr  5 22:36:34 2016 - [warning]  relay_log_purge=0 is not set on slave
192.168.1.248(192.168.1.248:4008).
Tue Apr  5 22:36:34 2016 - [warning]  relay_log_purge=0 is not set on slave
192.168.1.249(192.168.1.249:4008).
Tue Apr  5 22:36:34 2016 - [info] Checking replication
filtering settings..
Tue Apr  5 22:36:34 2016 - [info]  binlog_do_db= , binlog_ignore_db=
Tue Apr  5 22:36:34 2016 - [info]  Replication filtering check ok.
Tue Apr  5 22:36:34 2016 - [info] GTID (with auto-pos)
is not supported
Tue Apr  5 22:36:34 2016 - [info] Starting SSH
connection tests..
Tue Apr  5 22:36:35 2016 - [info] All SSH connection
tests passed successfully.
Tue Apr  5 22:36:35 2016 - [info] Checking MHA Node
version..
Tue Apr  5 22:36:36 2016 - [info]  Version check ok.
Tue Apr  5 22:36:36 2016 - [info] Checking SSH
publickey authentication settings on the current master..
Tue Apr  5 22:36:36 2016 - [info] HealthCheck: SSH to
192.168.1.250 is reachable.
Tue Apr  5 22:36:36 2016 - [info] Master MHA Node
version is 0.57.
Tue Apr  5 22:36:36 2016 - [info] Checking recovery
script configurations on 192.168.1.250(192.168.1.250:4008)..
Tue Apr  5 22:36:36 2016 - [info]  Executing command: save_binary_logs
--command=test --start_pos=4 --binlog_dir=/log/mysql
--output_file=/usr/local/mha/save_binary_logs_test --manager_version=0.57
--start_file=mysql-bin.000009
Tue Apr  5 22:36:36 2016 - [info]  Connecting to
[email protected](192.168.1.250:22)..
  Creating /usr/local/mha if not exists..
Creating directory /usr/local/mha.. done.
  ok.
  Checking output directory is accessible or
not..
  ok.
  Binlog found at /log/mysql, up to
mysql-bin.000009
Tue Apr  5 22:36:36 2016 - [info] Binlog setting check
done.
Tue Apr  5 22:36:36 2016 - [info] Checking SSH
publickey authentication and checking recovery script configurations on all
alive slave servers..
Tue Apr  5 22:36:36 2016 - [info]  Executing command : apply_diff_relay_logs
--command=test --slave_user='root' --slave_host=192.168.1.248
--slave_ip=192.168.1.248 --slave_port=4008 --workdir=/usr/local/mha
--target_version=5.6.16-log --manager_version=0.57
--relay_log_info=/data/mysql/relay-log.info 
--relay_dir=/data/mysql/
--slave_pass=xxx
Tue Apr  5 22:36:36 2016 - [info]  Connecting to
[email protected](192.168.1.248:22)..
Can't exec
"mysqlbinlog": No such file or directory at
/usr/local/lib64/perl5/MHA/BinlogManager.pm line 106.
mysqlbinlog version
command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client
options
 at /usr/local/bin/apply_diff_relay_logs line
493
Tue Apr  5 22:36:36 2016 -
[error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln205] Slaves settings
check failed!
Tue Apr  5 22:36:36 2016 -
[error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln413] Slave configuration
failed.
Tue Apr  5 22:36:36 2016 -
[error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln424] Error happened on
checking configurations.  at
/usr/local/bin/masterha_check_repl line 48
Tue Apr  5 22:36:36 2016 -
[error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln523] Error happened on
monitoring servers.
Tue Apr  5 22:36:36 2016 - [info] Got exit code 1 (Not
master dead).
 
MySQL Replication
Health is NOT OK!

解决方案
[root@HE1 MHA]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog

③或者这样
12345678910111213  Binlog found at /log/mysql, up to mysql-bin.000009
Tue Apr  5 22:43:55 2016 - [info] Binlog setting check done.
Tue Apr  5 22:43:55 2016 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Tue Apr  5 22:43:55 2016 - [info]  Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.1.248 --slave_ip=192.168.1.248 --slave_port=4008 --workdir=/usr/local/mha --target_version=5.6.16-log --manager_version=0.57 --relay_log_info=/data/mysql/relay-log.info  --relay_dir=/data/mysql/  --slave_pass=xxx
Tue Apr  5 22:43:55 2016 - [info]  Connecting to [email protected](192.168.1.248:22).. 
mysqlbinlog: unknown variable 'default-character-set=utf8'
mysqlbinlog version command failed with rc 7:0, please verify PATH, LD_LIBRARY_PATH, and client options
 at /usr/local/bin/apply_diff_relay_logs line 493
Tue Apr  5 22:43:55 2016 - [error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln205] Slaves settings check failed!
Tue Apr  5 22:43:55 2016 - [error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln413] Slave configuration failed.
Tue Apr  5 22:43:55 2016 - [error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/local/bin/masterha_check_repl line 48
Tue Apr  5 22:43:55 2016 - [error][/usr/local/lib64/perl5/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Tue Apr  5 22:43:55 2016 - [info] Got exit code 1 (Not master dead).

解决方案
注释掉my.cnf中的
[client]
#default-character-set=utf8

遇到上述错误别慌张,看日志,根据报错来排查问题。

折腾半天,就为了这个OK

[root@HE4 mha4mysql-manager-0.57]# masterha_check_status --conf=/etc/mha/mha.conf
mha (pid:32726) is running(0:PING_OK), master:192.168.1.250

MaxScale2.0
Part1:写在最前
maxscale是mariadb公司开发的一套数据库中间件,可以很方便的实现读写分离方案;并且提供了读写分离的负载均衡和高可用性保障。另外maxscale对于前端应用而言是透明的,我们可以很方便的将应用迁移到maxscale中实现读写分离方案,来分担主库的压力。maxscale也提供了sql语句的解析过滤功能。这里我们主要讲解maxscale的安装、配置以及注意事项。

Part2:整体架构

[root@HE3 ~]# /etc/init.d/maxscale start
Starting MaxScale: maxscale (pid 28851) is running...      [  OK  ]
 
[root@HE3 ~]#  netstat -lntp |grep maxscale
tcp        0      0 0.0.0.0:6603                0.0.0.0:*                  LISTEN      29878/maxscale   
tcp        0      0 0.0.0.0:4006                0.0.0.0:*                  LISTEN      29878/maxscale   

[root@HE3 ~]# maxadmin -pmariadb list services
Services.
--------------------------+----------------------+--------+---------------
Service Name              | Router Module        | #Users | Total Sessions
--------------------------+----------------------+--------+---------------
Read-Only Service        | readconnroute        |      1 |    5
Read-Write Service        | readwritesplit      |      1 |    11
MaxAdmin Service          | cli                  |      2 |    3
--------------------------+----------------------+--------+---------------
 
[root@HE3 ~]# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port  | Connections | Status           
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  |  3306 |          0 | Slave, Running
server2            | 192.168.1.249  |  3306 |          0 | Slave, Running
server3            | 192.168.1.250  |  3306 |          0 | Master, Running
-------------------+-----------------+-------+-------------+--------------------

至此,MHA+Maxscale的环境就完成了。

读写分离与负载均衡校验
Part1:读写分离
在Master构建相应的测试表
1234567 mysql> select * from helei;
+--------+
| a      |
+--------+
| HE3    |
+--------+
3 rows in set (0.00 sec)

在slave1插入数据HE1

mysql> select * from helei;
+------+
| a    |
+------+
| HE3  |
| HE1  |
+------+

在slave2插入数据HE2
1234567 mysql> select * from helei;
+------+
| a    |
+------+
| HE3  |
| HE2  |
+------+

现在链接4006读写分离端口,进行数据写入
[root@HE3 ~]#  mysql -h192.168.1.250 -P 4006 -usys_admin -pMANAGER maxscale -e"insert into helei values('写入');"
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@HE3 ~]#  mysql -h192.168.1.250 -P 4008 -usys_admin -pMANAGER maxscale -e"select * from helei;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------+
| a      |
+--------+
| HE3    |
| HE1    |
| 写入  |
+--------+

可以看到主库插入完毕后从库已经同步完成,这条查询完成在了HE1(slave1)上

Part2:负载均衡
我们配置的read比例为1:1
[root@HE3 ~]# maxadmin -pmariadb show service "Read-Write Service"
Service 0xef5570
Service:                            Read-Write Service
Router:                              readwritesplit (0x7ff5e8fa6ec0)
State:                              Started
Number of router sessions:          15
Current no. of router sessions:      0
Number of queries forwarded:          41
Number of queries forwarded to master:2 (4.88%)
Number of queries forwarded to slave: 39 (95.12%)
Number of queries forwarded to all:  0 (0.00%)
Connection distribution based on myweight server parameter.
Server              Target %    Connections  Operations
                              Global  Router
server1              50.0%    0      0      0
server2              50.0%    0      0      0
server3              100.0%    0      0      0
Started:                            Thu Nov  3 23:46:27 2016
Root user access:                    Enabled
Backend databases:
192.168.1.248:3306  Protocol: MySQLBackend
192.168.1.249:3306  Protocol: MySQLBackend
192.168.1.100:3306  Protocol: MySQLBackend
Routing weight parameter:            myweight
Users data:                          0xf09370
Total connections:                  16
Currently connected:                1

[root@HE3 ~]# for i in `seq 1 10`; do mysql -h 192.168.1.250 -P 4006 -usys_admin -pMANAGER maxscale -e "select @@hostname; select sleep(10)" 2>/dev/null & done
[root@HE3 ~]# +------------+
| @@hostname |
+------------+
| HE1        |
+------------+
+------------+
| @@hostname |
+------------+
| HE1        |
+------------+
+------------+
| @@hostname |
+------------+
| HE1        |
+------------+
+------------+
| @@hostname |
+------------+
| HE1        |
+------------+
+------------+
| @@hostname |
+------------+
| HE1        |
+------------+
+------------+
| @@hostname |
+------------+
| HE2        |
+------------+
+------------+
| @@hostname |
+------------+
| HE2        |
+------------+
+------------+
| @@hostname |
+------------+
| HE2        |
+------------+
+------------+
| @@hostname |
+------------+
| HE2        |
+------------+
+------------+
| @@hostname |
+------------+
| HE2        |
+------------+

OK!That's how it works!~

 
可以看出,在slave1故障后,所有的读操作都进入了HE2(slave2);

恢复HE1 
mysql> start
slave;
Query OK, 0 rows
affected (0.00 sec)
 
 
[root@HE3 ~]#
maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port 
| Connections | Status             
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  | 
3306 |          0 | Slave,
Running
server2            | 192.168.1.249  | 
3306 |          0 | Slave,
Running
server3            | 192.168.1.250  | 
3306 |          0 | Master,
Running
-------------------+-----------------+-------+-------------+--------------------
 
验证
[root@HE3 ~]#  mysql -h192.168.1.250 -P 4006 -usys_admin
-pMANAGER maxscale -e"select * from helei;"
mysql: [Warning]
Using a password on the command line interface can be insecure.
+--------+
| a      |
+--------+
| HE3    |
| HE2    |
| 写入 
|
+--------+
[root@HE3 ~]#  mysql -h192.168.1.250 -P 4006 -usys_admin
-pMANAGER maxscale -e"select * from helei;"
mysql: [Warning]
Using a password on the command line interface can be insecure.
+--------+
| a      |
+--------+
| HE3    |
| HE1    |
| 写入 
|
+--------+

在HE1(slave1)恢复完成后,重新有了负载均衡。
 
Part2:mysql down故障
[root@HE1 ~]#
/etc/init.d/mysqld stop
Shutting down
MySQL..... SUCCESS!
 
停止HE1(slave)可以看到转发到了HE2上
 
[root@HE3 ~]#  maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port 
| Connections | Status             
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  | 
3306 |          0 | Down
server2            | 192.168.1.249  | 
3306 |          1 | Slave,
Running
server3            | 192.168.1.250  | 
3306 |          1 | Master,
Running
-------------------+-----------------+-------+-------------+--------------------
[root@HE3
~]# mysql -h192.168.1.250 -P 4006 -usys_admin -pMANAGER
mysql: [Warning]
Using a password on the command line interface can be insecure.
Welcome to the MySQL
monitor.  Commands end with ; or \g.
Your MySQL
connection id is 28948
Server version:
5.5.5-10.0.0 2.0.1-maxscale MySQL Community Server (GPL)
 
Copyright (c) 2000,
2016, Oracle and/or its affiliates. All rights reserved.
 
Oracle is a
registered trademark of Oracle Corporation and/or its
affiliates. Other
names may be trademarks of their respective
owners.
 
Type 'help;' or '\h'
for help. Type '\c' to clear the current input statement.
 
mysql> select
@@hostname;
+------------+
| @@hostname |
+------------+
| HE2        |
+------------+
1 row in set (0.00
sec)
 
 
停掉2台slave,观察maxscale的状态
[root@HE3 ~]#
maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port 
| Connections | Status             
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  | 
3306 |          0 | Running
server2            | 192.168.1.249  | 
3306 |          0 | Running
server3            | 192.168.1.250  | 
3306 |          0 | Master, Stale
Status, Running
-------------------+-----------------+-------+-------------+--------------------

Warning:警告这里我并没有在
[MySQL Monitor]中配置detect_stale_master=true
可以看出,在maxscale2.0中,已经默认从库都停掉,也不影响

Part3:master故障
[root@HE3 ~]# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port  | Connections | Status             
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  |  3306 |          0 | Slave, Running
server2            | 192.168.1.249  |  3306 |          0 | Slave, Running
server3            | 192.168.1.100  |  3306 |          0 | Master, Running
-------------------+-----------------+-------+-------------+--------------------
[root@HE3 ~]# ps -ef|grep mysql
root    27709    1  0 Nov03 ?        00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/data/mysql --pid-file=/data/mysql/HE3.pid
mysql    28415 27709  0 Nov03 ?        00:00:19 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysql --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/data/mysql/error.log --open-files-limit=8192 --pid-file=/data/mysql/HE3.pid --socket=/tmp/mysql.sock --port=3306
root    30794 28966  0 02:34 pts/1    00:00:00 grep mysql
[root@HE3 ~]# kill -9 28415 27709
[root@HE3 ~]# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port  | Connections | Status             
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  |  3306 |          0 | Slave, Running
server2            | 192.168.1.249  |  3306 |          0 | Slave, Running
server3            | 192.168.1.100  |  3306 |          0 | Master, Running
-------------------+-----------------+-------+-------------+--------------------
[root@HE3 ~]# maxadmin -pmariadb list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server            | Address        | Port  | Connections | Status             
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.1.248  |  3306 |          0 | Master, Running
server2            | 192.168.1.249  |  3306 |          0 | Slave, Running
server3            | 192.168.1.100  |  3306 |          0 | Running
-------------------+-----------------+-------+-------------+--------------------

可以看出,kill master的mysql进程后,MHA将slave1拉成新的master,Maxsale也识别了这一状态。

另:maxscale1.4.3有时还会出现挂起现象,4006端口直接关闭。后我们用了watch命令后台跑监控。
watch -d /bin/bash /root/sh/restart_maxsacle.sh

#!/bin/bash
 
netstat -ntlp | grep maxscale | grep 4006 > /dev/null 2>&1
 
if [ $? -eq 1 ];then
/etc/init.d/maxscale start
fi

解决:5.6以下可以直接在url上配置:autoReconnect=true
对于5.6以上的只能在jdbc的链接池里设置:

第一个设置成ture,超时自动链接,对于mysql要把第二个设置成false,不启用缓存。
以上是开发需要注意的都是一些基本的配置。
 
其目的为:每次归还连接时执行select 'x'检测连接是否有效。

——总结——
maxscale的核心就在于内个配置文件,根据不同的业务和架构,做好相应的配置,本文意在抛砖引玉,如果您有更全的配置文件,欢迎您分享给笔者。由于笔者的水平有限,编写时间也很仓促,文中难免会出现一些错误或者不准确的地方,不妥之处恳请读者批评指正。

猜你喜欢

转载自www.linuxidc.com/Linux/2016-11/136969.htm