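Dual-machine hot standby means that two machines are both running, but only one of them serves requests at any given time. When the serving machine fails, the other takes over automatically, and the switchover is very fast. MySQL dual-master replication, in which the two servers are Master and Slave to each other (with only one Master accepting writes), gives you a hot standby of the database server, but it cannot switch over dynamically when a Master goes down. Keepalived adds a virtual IP (VIP) that gives the two masters a single external endpoint, together with automatic health checking and failover, which turns the pair into a high-availability MySQL setup.
Reference for this article: https://blog.51cto.com/13858192/2175265
0. Setup requirements
(1) First set up master-master replication between the two servers. Master-master replicates data in both directions; master-slave replicates in one direction only. Normally, when the master goes down, connections have to be switched to the slave by hand (with Keepalived the switch is automatic).
(2) Then add Keepalived and use a VIP as the single external endpoint for the two MySQL masters. Clients connect to the database through the VIP; when one server goes down, the VIP floats to the other. Apart from having to reconnect, clients barely notice the switch, which is what provides high availability.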
Role | IP | OS and required services |
---|---|---|
Master1 | 172.20.60.8 | centos7 mysql keepalived |
Master2 | 172.20.60.11 | centos7 mysql keepalived |
VIP | 172.20.60.199 | |
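Note: make sure the firewall and SELinux are disabled.
1. Install MySQL on both Master1 and Master2
See [Linux Installation Notes 11] for the steps.
2. Deploy MySQL master-master replication
(1) On Master1
Add the following to the [mysqld] section of my.cnf: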
server-id = 1
log-bin = mysql-bin
sync_binlog = 1
binlog_checksum = none
binlog_format = mixed
auto-increment-increment = 2
auto-increment-offset = 1
slave-skip-errors = all
Restart the MySQL service:
service mysqld restart
Grant replication privileges, so that the I/O thread can connect to the master as this user and read its binary log.
mysql> grant replication slave,replication client on *.* to cylian@'%' identified by "cylian";
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> flush privileges;
//reload the privilege tables
Query OK, 0 rows affected (0.00 sec)
mysql> flush tables with read lock;
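//It is best to lock the tables read-only to keep the data consistent; unlock them once master-master replication has been set up.
//While the lock is held, tables cannot be written to. The lock is released automatically when the session ends or the MySQL service restarts.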
Query OK, 0 rows affected (0.00 sec)
mysql> show master status;
//binary log file name and position
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 | 612 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
(2) On Master2
Add the following to the [mysqld] section of my.cnf:
server-id = 2
log-bin = mysql-bin
sync_binlog = 1
binlog_checksum = none
binlog_format = mixed
auto-increment-increment = 2
auto-increment-offset = 2
slave-skip-errors = all
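The auto-increment settings in the two my.cnf fragments are what keep generated keys from colliding when both masters accept inserts: each server starts at its own offset and steps by the same increment. The sketch below only illustrates that arithmetic; `gen_ids` is a hypothetical helper, not part of MySQL.

```shell
# gen_ids OFFSET INCREMENT COUNT
# Print the first COUNT auto-increment values a server would hand out,
# given its auto-increment-offset and auto-increment-increment.
gen_ids() {
    local offset="$1" increment="$2" count="$3"
    local i=0 out=""
    while [ "$i" -lt "$count" ]; do
        out="$out$((offset + i * increment)) "
        i=$((i + 1))
    done
    echo "${out% }"   # drop the trailing space
}

echo "Master1: $(gen_ids 1 2 5)"   # offset 1, increment 2 -> odd values
echo "Master2: $(gen_ids 2 2 5)"   # offset 2, increment 2 -> even values
```

With these values Master1 produces only odd IDs and Master2 only even IDs, so the two sequences never overlap.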
Restart the MySQL service:
service mysqld restart
Grant replication privileges, so that the I/O thread can connect to the master as this user and read its binary log.
mysql> grant replication slave,replication client on *.* to cylian@'%' identified by "cylian";
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000004 | 150 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
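The File and Position values reported by each master are exactly what the other master needs in its change master statement. As a rough sketch, assuming the boxed table format shown above, the values can be pulled out and the statement assembled as follows. Both `parse_master_status` and `build_change_master` are hypothetical helpers introduced here for illustration.

```shell
# parse_master_status: read the boxed `show master status` table on stdin
# and print "<file> <position>" for the data row.
parse_master_status() {
    awk -F'|' '$3 ~ /^ *[0-9]+ *$/ {
        gsub(/ /, "", $2); gsub(/ /, "", $3)
        print $2, $3
    }'
}

# build_change_master HOST USER PASSWORD FILE POS
# Print the statement to run on the *other* master.
build_change_master() {
    printf "change master to master_host='%s',master_user='%s',master_password='%s',master_log_file='%s',master_log_pos=%s;\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Sample input mirroring Master2's output above. In practice the table
# could come from: mysql -t -e 'show master status'
status='+------------------+----------+
| File             | Position |
+------------------+----------+
| mysql-bin.000004 |      150 |
+------------------+----------+'

set -- $(printf '%s\n' "$status" | parse_master_status)   # $1=file, $2=position
build_change_master 172.20.60.11 cylian cylian "$1" "$2"
```

Reading the values from the output instead of retyping them avoids the most common mistake in this step, which is pointing a server at its own binlog coordinates instead of its peer's.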
(3) Set up replication on Master1
mysql> unlock tables; //unlock first, then replicate the other master's data into this database
mysql> stop slave;
mysql> change master to master_host='172.20.60.11',master_user='cylian',master_password='cylian',master_log_file='mysql-bin.000004',master_log_pos=150;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
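Check the replication status. Two "Yes" values (Slave_IO_Running and Slave_SQL_Running), as below, mean replication is working. Note that the sample output below shows a different Master_Host and Master_User from the ones configured above, and that the trailing ";" after "\G" is what produces the harmless "ERROR: No query specified" at the end.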
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.24.130
Master_User: doudou
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 150
Relay_Log_File: linfan-relay-bin.000002
Relay_Log_Pos: 312
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 150
Relay_Log_Space: 512
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: dc702f48-b7b9-11e8-9caa-000c298fc02c
Master_Info_File: /opt/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
(4) Set up replication on Master2:
mysql> unlock tables; //unlock first, then replicate the other master's data into this database
mysql> stop slave;
mysql> change master to master_host='172.20.60.8',master_user='cylian',master_password='cylian',master_log_file='mysql-bin.000001',master_log_pos=612;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
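Check the replication status. Two "Yes" values (Slave_IO_Running and Slave_SQL_Running), as below, mean replication is working. The sample output below is illustrative only: its Master_Host, Master_User, log position, and Master_Server_Id do not match the values configured above.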
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.24.130
Master_User: doudou
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 150
Relay_Log_File: linfan-relay-bin.000002
Relay_Log_Pos: 312
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 150
Relay_Log_Space: 512
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: dc702f48-b7b9-11e8-9caa-000c298fc02c
Master_Info_File: /opt/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
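The "two Yes values" check can also be scripted, which is handy for monitoring after the initial setup. The following is a minimal sketch that parses the `show slave status\G` text; `replication_ok` is a hypothetical helper name, and the live invocation in the comment assumes a local root login.

```shell
# replication_ok: read `show slave status\G` output on stdin and succeed
# (exit 0) only if both replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "", $2) }                 # trim trailing blanks
        $1 ~ /^ *Slave_IO_Running$/  { io  = $2 }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Against a live server this might be run as (assumption):
#   mysql -uroot -p -e 'show slave status\G' | replication_ok
sample='Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates'

if printf '%s\n' "$sample" | replication_ok; then
    echo "replication healthy"
else
    echo "replication broken"
fi
```

The anchored field names matter: Slave_SQL_Running_State also starts with Slave_SQL_Running, so a plain substring match would pick up the wrong line.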
4. Verify master-master replication
(1) Write data to the database on Master1:
mysql> create database tom;
Query OK, 1 row affected (0.01 sec)
mysql> use tom;
Database changed
mysql> create table mary(id int,name varchar(100) not null,age tinyint);
Query OK, 0 rows affected (0.06 sec)
mysql> insert mary values(1,"lisi",10),(2,"zhangshan",28),(3,"wangwu",18);
Query OK, 3 rows affected (0.11 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select * from mary;
+------+-----------+------+
| id | name | age |
+------+-----------+------+
| 1 | lisi | 10 |
| 2 | zhangshan | 28 |
| 3 | wangwu | 18 |
+------+-----------+------+
3 rows in set (0.00 sec)
Then check the database on Master2: the data has been replicated.
(2) Write new data to the database on Master2
mysql> insert mary values(4,"zhaosi",66),(5,"lida",88);
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> select * from mary;
+------+-----------+------+
| id | name | age |
+------+-----------+------+
| 1 | lisi | 10 |
| 2 | zhangshan | 28 |
| 3 | wangwu | 18 |
| 4 | zhaosi | 66 |
| 5 | lida | 88 |
+------+-----------+------+
5 rows in set (0.00 sec)
Then check the database on Master1: the data has been replicated there as well.
MySQL master-master replication is now in place.
5. Configure MySQL + Keepalived failover for high availability
(1) Install Keepalived on both machines
[root@localhost ~]# yum -y install keepalived
(2) keepalived.conf on Master1
Back up the configuration file:
[root@localhost ~]# cd /etc/keepalived
[root@localhost keepalived]# cp keepalived.conf keepalived.conf.backup
Edit keepalived.conf (the configuration below does not use the LVS load-balancing feature, so no virtual_server block is needed):
! Configuration File for keepalived
global_defs {
notification_email {
ops@wangshibo.cn
tech@wangshibo.cn
}
notification_email_from ops@wangshibo.cn
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id MASTER-HA
}
vrrp_script chk_mysql_port {
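script "/opt/chk_mysql.sh" # health check is done by this script
interval 2 # run the script every 2 seconds
weight -5 # priority change driven by the script result: on failure (non-zero exit) priority drops by 5
fall 2 # 2 consecutive failures are required before the check counts as failed and weight is applied (priority stays within 1-255)
rise 1 # 1 success is enough to count as healthy; no priority penalty is applied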
}
vrrp_instance VI_1 {
state MASTER
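interface enp2s0f0 # network interface that carries the VIP
mcast_src_ip 172.20.60.8 # this machine's IP
virtual_router_id 51 # virtual router ID; must be identical on MASTER and BACKUP
priority 101 # priority; the higher number wins. Within one vrrp_instance, the MASTER's priority must be higher than the BACKUP's.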
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.20.60.199 # VIP
}
track_script {
chk_mysql_port
}
}
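Write the failover script. Keepalived handles the heartbeat: if MySQL on the active master dies (nothing is listening on port 3306), the script stops Keepalived on that machine. Keepalived on the other machine notices the missing heartbeat and takes over requests to the VIP. Create the script as below and make sure it is executable (chmod +x /opt/chk_mysql.sh).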
vim /opt/chk_mysql.sh
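#!/bin/bash
# Count sockets in LISTEN state whose netstat line mentions 3306
counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
# Nothing listening: stop keepalived so the VIP moves to the other master
if [ "${counter}" -eq 0 ]; then
systemctl stop keepalived
fi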
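One caveat with the check script: grep "3306" matches any line that contains that string, including a listener on port 33060 (the default MySQL X Protocol port when that plugin is enabled). A stricter variant could compare the port field exactly. The sketch below assumes the usual Linux `netstat -na` column layout (local address in column 4, state in column 6); `count_listeners` is a hypothetical helper.

```shell
# count_listeners PORT: read `netstat -na` style lines on stdin and count
# TCP sockets in LISTEN state whose local address ends in exactly :PORT.
count_listeners() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")      # works for 0.0.0.0:3306 and :::3306
            if (parts[n] == port) c++
        }
        END { print c + 0 }
    '
}

# Sample input: only the second line is a real 3306 listener.
sample='tcp        0      0 0.0.0.0:33060           0.0.0.0:*               LISTEN
tcp6       0      0 :::3306                 :::*                    LISTEN'

printf '%s\n' "$sample" | count_listeners 3306   # prints 1
```

In the script itself the pipeline would become `netstat -na | count_listeners 3306`, with the rest of the logic unchanged.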
(3) keepalived.conf on Master2
Master2's keepalived.conf differs from Master1's only in priority (99 here) and mcast_src_ip (set to the local IP); nopreempt is not set.
! Configuration File for keepalived
global_defs {
notification_email {
ops@wangshibo.cn
tech@wangshibo.cn
}
notification_email_from ops@wangshibo.cn
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id MASTER-HA
}
vrrp_script chk_mysql_port {
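script "/opt/chk_mysql.sh" # health check is done by this script
interval 2 # run the script every 2 seconds
weight -5 # priority change driven by the script result: on failure (non-zero exit) priority drops by 5
fall 2 # 2 consecutive failures are required before the check counts as failed and weight is applied (priority stays within 1-255)
rise 1 # 1 success is enough to count as healthy; no priority penalty is applied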
}
vrrp_instance VI_1 {
state MASTER
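interface enp2s0f0 # network interface that carries the VIP
mcast_src_ip 172.20.60.11 # this machine's IP
virtual_router_id 51 # virtual router ID; must be identical on MASTER and BACKUP
priority 99 # priority; the higher number wins. Within one vrrp_instance, the MASTER's priority must be higher than the BACKUP's.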
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.20.60.199 # VIP
}
track_script {
chk_mysql_port
}
}
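The priority and weight values in the two configurations are chosen so that a failed check on Master1 would drop it below Master2. In this setup the check script stops Keepalived outright rather than exiting non-zero, so weight does not normally come into play; the sketch below only illustrates the arithmetic that would apply if the script reported failure through its exit code. `effective_priority` is a hypothetical helper that models negative weights only.

```shell
# effective_priority BASE WEIGHT FAILED
# BASE   : configured priority
# WEIGHT : vrrp_script weight (negative values only in this model)
# FAILED : 1 if the tracked script is in the failed state, else 0
effective_priority() {
    local base="$1" weight="$2" failed="$3"
    if [ "$failed" -eq 1 ] && [ "$weight" -lt 0 ]; then
        echo $((base + weight))    # failure subtracts |weight|
    else
        echo "$base"
    fi
}

m1=$(effective_priority 101 -5 1)   # Master1 with a failed check
m2=$(effective_priority 99 -5 0)    # Master2 healthy
echo "Master1=$m1 Master2=$m2"
if [ "$m2" -gt "$m1" ]; then
    echo "Master2 would take over the VIP"
fi
```

The point of the numbers is that the gap between the two priorities (2) is smaller than the penalty (5); if the gap were larger than the penalty, a failed check alone would never move the VIP.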
6. Start Keepalived on both machines
[root@localhost ~]# systemctl start keepalived
[root@localhost ~]# systemctl status keepalived
# check the system log
[root@localhost ~]# tail -f /var/log/messages
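7. High-availability test
(1) Connect with a MySQL client through the VIP and check that the connection succeeds.
(2) By default the VIP sits on Master1. Use "ip addr" to see where the VIP currently is.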
[root@localhost opt]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether c4:b8:b4:5d:7b:a0 brd ff:ff:ff:ff:ff:ff
inet 172.20.60.8/24 brd 172.20.60.255 scope global noprefixroute enp2s0f0
valid_lft forever preferred_lft forever
inet 172.20.60.199/32 scope global enp2s0f0 //this VIP address with a /32 mask shows that the resource is currently still on Master1
valid_lft forever preferred_lft forever
inet6 fe80::1d46:a502:f064:c2e2/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp2s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether c4:b8:b4:5d:7b:a1 brd ff:ff:ff:ff:ff:ff
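Stop the MySQL service on Master1. Per the check script, Keepalived stops when MySQL stops, so the VIP moves to Master2. (While MySQL is down, Keepalived cannot be brought back up successfully either, because the check script stops it again.)
At this point "ip addr" on Master1 no longer shows the VIP address with the /32 mask, while "ip addr" on Master2 does, which confirms that the failover has happened.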
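Checking by eye for the /32 address works, but the same test is easy to script when repeating the failover several times. The following is a small sketch that scans `ip addr` output for the VIP; `has_vip` is a hypothetical helper, and the interface name in the comment is the one used in this setup.

```shell
# has_vip VIP: read `ip addr` output on stdin; succeed (exit 0) if some
# "inet" line carries exactly that address, whatever its prefix length.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" {
            split($2, a, "/")                       # a[1] = address, a[2] = prefix
            if ((a[1] "") == (vip "")) found = 1    # force string comparison
        }
        END { exit !found }
    '
}

# On a live machine (assumption): ip addr show enp2s0f0 | has_vip 172.20.60.199
sample='    inet 172.20.60.8/24 brd 172.20.60.255 scope global noprefixroute enp2s0f0
    inet 172.20.60.199/32 scope global enp2s0f0'

if printf '%s\n' "$sample" | has_vip 172.20.60.199; then
    echo "VIP is on this machine"
else
    echo "VIP is elsewhere"
fi
```

Running the check on both masters before and after stopping MySQL on Master1 shows the VIP leaving one machine and appearing on the other.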