MySQL半同步 + MHA架构配置文档

拓扑介绍:

192.168.43.141  Master

192.168.43.142  Slave01

192.168.43.143  Slave02

192.168.43.144  Mha-manager

                  vip   (192.168.43.145)   vip

            Master(主) --> Slave01(从、备主)

              |                                                                <- Mha-manager(monitor-MHA)   

              |- - -> Slave02(从)


OS    : CentOS release 6.10 (Final) Mini

Perl  : v5.10.1 x86_64-linux-thread-multi(系统版本自带)

Gateway : 192.168.43.1

MySQL   : mysql-5.7.31-1.el6.x86_64.rpm-bundle.tar

MHA       : mha4mysql-manager-0.56-0.el6.noarch.rpm

                   mha4mysql-node-0.56-0.el6.noarch.rpm

MHA官方github         : https://github.com/yoshinorim/mha4mysql-manager/wiki

MySQL主从复制模式: 我们选用半同步机制


一、系统调整(所有节点)

1.修改/etc/hosts

[root@mha-manager ~]# cat /etc/hosts

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.43.141  Master

192.168.43.142  Slave01

192.168.43.143  Slave02

192.168.43.144  Mha-manager

2.关闭防火墙及SELINUX

[root@mha-manager ~]# /etc/init.d/iptables stop

[root@mha-manager ~]# getenforce

Disabled

3.安装基本工具

[root@mha-manager ~]# yum install ntp vim wget net-tools mlocate openssh-* cc gcc make -y


二、安装MHA所需依赖(所有节点)

1.配置SSH免密登录

[root@ ~]# ssh-keygen -t rsa

[root@ ~]# for i in Master Slave01 Slave02 Mha-manager;do ssh-copy-id $i;done

2.测试免密登录

[root@ ~]# for i in Master Slave01 Slave02 Mha-manager;do ssh $i hostname;done

master

slave01

slave02

mha-manager

3.配置NTP服务器

[root@slave01 opt]# yum install ntp -y

[root@slave01 opt]# cat /etc/ntp.conf | grep 'server'

server ntp.aliyun.com

server ntp2.aliyun.com

[root@slave01 opt]# ntpdate ntp.aliyun.com

20 Aug 13:34:06 ntpdate[31767]: step time server 203.107.6.88 offset -1920.602438 sec

[root@slave01 opt]# date

Thu Aug 20 13:34:09 CST 2020

[root@slave01 opt]# /etc/init.d/ntpd start

Starting ntpd:                                            [  OK  ]

[root@slave01 opt]# chkconfig ntpd on

[root@slave01 opt]#


三、基于半同步的Mysql主从复制的安装

(一)半同步插件的安装与开启

0.安装mysql(所有数据库节点)

[root@slave01 opt]# tar -xf mysql-5.7.31-1.el6.x86_64.rpm-bundle.tar

[root@slave01 opt]# yum install -y mysql-community-*

1.启动MySQL,改修数据库root密码(所有数据库节点)

[root@master ~]# /etc/init.d/mysqld start

Initializing MySQL database:                              [  OK  ]

Starting mysqld:                                          [  OK  ]

mysql> alter user 'root'@'localhost' identified by 'Test.123';

Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;

Query OK, 0 rows affected (0.00 sec)

2.安装半同步插件(所有数据库节点)

(1)查找半同步所需插件

mysql> show variables like '%plugin_dir%';  #查找插件所在目录(每台DB服务器可能不一样)

+---------------+--------------------------+

| Variable_name | Value                    |

+---------------+--------------------------+

| plugin_dir    | /usr/lib64/mysql/plugin/ |

+---------------+--------------------------+

1 row in set (0.00 sec)

mysql> show variables like '%have_dynamic%';  #检查是否支持动态检测

+----------------------+-------+

| Variable_name        | Value |

+----------------------+-------+

| have_dynamic_loading | YES  |

+----------------------+-------+

1 row in set (0.01 sec)

mysql> system ls  /usr/lib64/mysql/plugin/ | egrep 'master|slave'

semisync_master.so

semisync_slave.so

(2)安装插件

mysql> install plugin rpl_semi_sync_master SONAME 'semisync_master.so';

Query OK, 0 rows affected (0.00 sec)

mysql> install plugin rpl_semi_sync_slave SONAME 'semisync_slave.so';

Query OK, 0 rows affected (0.00 sec)

(3)检查Plugin是否已正确安装

mysql> show plugins;

+----------------------------+----------+--------------------+----------------------+---------+

| Name                      | Status  | Type              | Library              | License |

+----------------------------+----------+--------------------+----------------------+---------+

| FEDERATED                  | DISABLED | STORAGE ENGINE    | NULL                | GPL    |

| ngram                      | ACTIVE  | FTPARSER          | NULL                | GPL    |

| validate_password          | ACTIVE  | VALIDATE PASSWORD  | validate_password.so | GPL    |

| rpl_semi_sync_master      | ACTIVE  | REPLICATION        | semisync_master.so  | GPL    |

| rpl_semi_sync_slave        | ACTIVE  | REPLICATION        | semisync_slave.so    | GPL    |

+----------------------------+----------+--------------------+----------------------+---------+

47 rows in set (0.00 sec)

mysql> show variables like '%rpl_semi_sync%';

+-------------------------------------------+------------+

| Variable_name                            | Value      |

+-------------------------------------------+------------+

| rpl_semi_sync_master_enabled              | OFF        |

| rpl_semi_sync_master_timeout              | 10000      |

| rpl_semi_sync_master_trace_level          | 32        |

| rpl_semi_sync_master_wait_for_slave_count | 1          |

| rpl_semi_sync_master_wait_no_slave        | ON        |

| rpl_semi_sync_master_wait_point          | AFTER_SYNC |

| rpl_semi_sync_slave_enabled              | OFF        |

| rpl_semi_sync_slave_trace_level          | 32        |

+-------------------------------------------+------------+

8 rows in set (0.01 sec)

mysql>

3.MySQL配置文件的修改

Master配置:

[root@master ~]# cat /etc/my.cnf

[mysqld]

datadir=/var/lib/mysql

socket=/var/lib/mysql/mysql.sock

symbolic-links=0

log-error=/var/log/mysqld.log

pid-file=/var/run/mysqld/mysqld.pid

server_id=1

log-bin=mysql-bin

binlog_format=mixed

rpl_semi_sync_master_enabled=1

rpl_semi_sync_master_timeout=1000

rpl_semi_sync_slave_enabled=1

relay_log_purge=0

relay-log = relay-bin

relay-log-index = slave-relay-bin.index

default-storage-engine = INNODB

character-set-server = utf8

collation-server = utf8_general_ci

relay_log_purge = 0

[client]

default-character-set = utf8

Slave01配置:

[root@slave01 ~]# cat /etc/my.cnf

[mysqld]

datadir=/var/lib/mysql

socket=/var/lib/mysql/mysql.sock

symbolic-links=0

log-error=/var/log/mysqld.log

pid-file=/var/run/mysqld/mysqld.pid

server_id=2

log-bin=mysql-bin

binlog_format=mixed

rpl_semi_sync_master_enabled=1

rpl_semi_sync_master_timeout=1000

rpl_semi_sync_slave_enabled=1

relay_log_purge=0

relay-log = relay-bin

relay-log-index = slave-relay-bin.index

default-storage-engine = INNODB

character-set-server = utf8

collation-server = utf8_general_ci

relay_log_purge = 0

Slave02配置:

#由于slave02只是用来做一个slave主机,所以无需开启master的半同步

[root@slave02 ~]# cat /etc/my.cnf

[mysqld]

datadir=/var/lib/mysql

socket=/var/lib/mysql/mysql.sock

symbolic-links=0

log-error=/var/log/mysqld.log

pid-file=/var/run/mysqld/mysqld.pid

server_id=3

log-bin = mysql-bin

relay-log = relay-bin

relay-log-index = slave-relay-bin.index

read_only = 1

rpl_semi_sync_slave_enabled=1

default-storage-engine = INNODB

character-set-server = utf8

collation-server = utf8_general_ci

relay_log_purge = 0

4.查看半同步状态是否开启

[root@ ~]# /etc/init.d/mysqld restart

Initializing MySQL database:                              [  OK  ]

Starting mysqld:                                          [  OK  ]

mysql> show variables like '%rpl_semi_sync%';

+-------------------------------------------+------------+

| Variable_name                            | Value      |

+-------------------------------------------+------------+

| rpl_semi_sync_master_enabled              | ON        |

| rpl_semi_sync_master_timeout              | 1000      |

| rpl_semi_sync_master_trace_level          | 32        |

| rpl_semi_sync_master_wait_for_slave_count | 1          |

| rpl_semi_sync_master_wait_no_slave        | ON        |

| rpl_semi_sync_master_wait_point          | AFTER_SYNC |

| rpl_semi_sync_slave_enabled              | ON        |

| rpl_semi_sync_slave_trace_level          | 32        |

+-------------------------------------------+------------+

8 rows in set (0.01 sec)

(二)配置主从复制

1.创建用户(Master上配置)

(1)插件主从复制同步用户

mysql> grant replication slave on *.* to mharep@'192.168.43.%' identified by 'Test.123';

Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

  Level: Warning

  Code: 1287

Message: Using GRANT statement to modify existing user's properties other than privileges is deprecated and will be removed in future release. Use ALTER USER statement for this operation.

1 row in set (0.00 sec)

(2)创建用户mha的manager监控的用户

mysql> grant all on *.* to manager@'192.168.43.%' identified by 'Test.123';

Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings\G;

*************************** 1. row ***************************

  Level: Warning

  Code: 1287

Message: Using GRANT for creating new user is deprecated and will be removed in future release. Create new user with CREATE USER statement.

1 row in set (0.00 sec)

(3)查看master二进制相关的信息

mysql> show master status\G;

*************************** 1. row ***************************

            File: mysql-bin.000001

        Position: 1040

    Binlog_Do_DB:

Binlog_Ignore_DB:

Executed_Gtid_Set:

1 row in set (0.00 sec)

2.创建用户(Slave01上配置)

(1)创建用于同步的用户

mysql> grant replication slave on *.* to mharep@'192.168.43.%' identified by 'Test.123';

Query OK, 0 rows affected, 1 warning (1.00 sec)

(2)创建用户mha的manager监控的用户

mysql> grant all on *.* to manager@'192.168.43.%' identified by 'Test.123';

Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql>

3.创建用户(Slave02上配置)

由于slave02无需做备主,所以不用创建用于同步数据的账户

#创建manager监控账号

mysql> grant all on *.* to manager@'192.168.43.%' identified by 'Test.123';

Query OK, 0 rows affected, 1 warning (0.01 sec)

4.分别在Slave01与Slave02上执行:

Master上查日志文件和位置:

mysql> show master status\G;

*************************** 1. row ***************************

            File: mysql-bin.000001

        Position: 1040

    Binlog_Do_DB:

Binlog_Ignore_DB:

Executed_Gtid_Set:

1 row in set (0.00 sec)

Slave01上执行:

mysql> change master to master_host='192.168.43.141', master_port=3306, master_user='mharep',master_password='Test.123', master_log_file = 'mysql-bin.000001', master_log_pos=1040;

Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start slave;

Query OK, 0 rows affected (0.02 sec)

mysql> show slave status\G;

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                  Master_Host: 192.168.43.141

                  Master_User: mharep

                  Master_Port: 3306

                Connect_Retry: 60

              Master_Log_File: mysql-bin.000001

          Read_Master_Log_Pos: 1040

              Relay_Log_File: relay-bin.000002

                Relay_Log_Pos: 320

        Relay_Master_Log_File: mysql-bin.000001

            Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

      Replicate_Wild_Do_Table:

  Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

          Exec_Master_Log_Pos: 1040

              Relay_Log_Space: 521

              Until_Condition: None

              Until_Log_File:

                Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

              Master_SSL_Cert:

            Master_SSL_Cipher:

              Master_SSL_Key:

        Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

                Last_IO_Errno: 0

                Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

  Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                  Master_UUID: 6882c9b1-e23a-11ea-b7a9-000c29b26880

            Master_Info_File: /var/lib/mysql/master.info

                    SQL_Delay: 0

          SQL_Remaining_Delay: NULL

      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

          Master_Retry_Count: 86400

                  Master_Bind:

      Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

            Executed_Gtid_Set:

                Auto_Position: 0

        Replicate_Rewrite_DB:

                Channel_Name:

          Master_TLS_Version:

1 row in set (0.00 sec)

Slave02上执行:

mysql> change master to master_host='192.168.43.141', master_port=3306, master_user='mharep',master_password='Test.123', master_log_file = 'mysql-bin.000001', master_log_pos=1040;

Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start slave;

Query OK, 0 rows affected (0.06 sec)

mysql> show slave status\G;

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                  Master_Host: 192.168.43.141

                  Master_User: mharep

                  Master_Port: 3306

                Connect_Retry: 60

              Master_Log_File: mysql-bin.000001

          Read_Master_Log_Pos: 1040

              Relay_Log_File: relay-bin.000002

                Relay_Log_Pos: 320

        Relay_Master_Log_File: mysql-bin.000001

            Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

      Replicate_Wild_Do_Table:

  Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

          Exec_Master_Log_Pos: 1040

              Relay_Log_Space: 521

              Until_Condition: None

              Until_Log_File:

                Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

              Master_SSL_Cert:

            Master_SSL_Cipher:

              Master_SSL_Key:

        Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

                Last_IO_Errno: 0

                Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

  Replicate_Ignore_Server_Ids:

            Master_Server_Id: 1

                  Master_UUID: 6882c9b1-e23a-11ea-b7a9-000c29b26880

            Master_Info_File: /var/lib/mysql/master.info

                    SQL_Delay: 0

          SQL_Remaining_Delay: NULL

      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

          Master_Retry_Count: 86400

                  Master_Bind:

      Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

            Executed_Gtid_Set:

                Auto_Position: 0

        Replicate_Rewrite_DB:

                Channel_Name:

          Master_TLS_Version:

1 row in set (0.00 sec)

MASTER上查看:

mysql> show status like '%rpl_semi_sync%';

+--------------------------------------------+-------+

| Variable_name                              | Value |

+--------------------------------------------+-------+

| Rpl_semi_sync_master_clients              | 2    |

| Rpl_semi_sync_master_net_avg_wait_time    | 0    |

| Rpl_semi_sync_master_net_wait_time        | 0    |

| Rpl_semi_sync_master_net_waits            | 0    |

| Rpl_semi_sync_master_no_times              | 1    |

| Rpl_semi_sync_master_no_tx                | 3    |

| Rpl_semi_sync_master_status                | ON    |

| Rpl_semi_sync_master_timefunc_failures    | 0    |

| Rpl_semi_sync_master_tx_avg_wait_time      | 0    |

| Rpl_semi_sync_master_tx_wait_time          | 0    |

| Rpl_semi_sync_master_tx_waits              | 0    |

| Rpl_semi_sync_master_wait_pos_backtraverse | 0    |

| Rpl_semi_sync_master_wait_sessions        | 0    |

| Rpl_semi_sync_master_yes_tx                | 0    |

| Rpl_semi_sync_slave_status                | OFF  |

+--------------------------------------------+-------+

15 rows in set (0.02 sec)


四、安装MHA

1.安装mha4mysql-node(所有节点)

以下以slave01为示例:

[root@slave01 mha4mysql]# yum install -y perl-DBD-MySQL

[root@slave01 mha4mysql]# pwd

/opt/mha4mysql

[root@slave01 mha4mysql]# ls

mha4mysql-manager-0.56-0.el6.noarch.rpm  mha4mysql-node-0.56-0.el6.noarch.rpm

[root@slave01 mha4mysql]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

[root@slave01 mha4mysql]# rpm -qa | grep mha

mha4mysql-node-0.56-0.el6.noarch

2.安装mha4mysql-manager(manager节点)

[root@mha-manager opt]# yum install -y perl-DBD-MySQL

[root@mha-manager opt]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

Preparing...                ########################################### [100%]

  1:mha4mysql-node        ########################################### [100%]

[root@mha-manager opt]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

error: Failed dependencies:

        perl(Log::Dispatch) is needed by mha4mysql-manager-0.56-0.el6.noarch

        perl(Log::Dispatch::File) is needed by mha4mysql-manager-0.56-0.el6.noarch

        perl(Log::Dispatch::Screen) is needed by mha4mysql-manager-0.56-0.el6.noarch

        perl(Parallel::ForkManager) is needed by mha4mysql-manager-0.56-0.el6.noarch

[root@mha-manager opt]# yum install epel-release.noarch -y

[root@mha-manager opt]# yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager install perl-Time-HiRes

[root@mha-manager opt]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

Preparing...                ########################################### [100%]

  1:mha4mysql-manager      ########################################### [100%]

[root@mha-manager opt]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

[root@mha-manager mha4mysql]# rpm -qa | grep mha

mha4mysql-node-0.56-0.el6.noarch

mha4mysql-manager-0.56-0.el6.noarch

3.处理mha4mysql-manager配置文件(manager节点)

[root@mha-manager opt]# mkdir /etc/masterha

[root@mha-manager opt]# mkdir -p /masterha/app1

[root@mha-manager opt]# mkdir /scripts

[root@mha-manager opt]# ls

mha4mysql-manager-0.56-0.el6.noarch.rpm  mha4mysql-manager-0.56.tar.gz  mha4mysql-node-0.56-0.el6.noarch.rpm

[root@mha-manager opt]# tar -xf mha4mysql-manager-0.56.tar.gz

[root@mha-manager opt]# ls

mha4mysql-manager-0.56                  mha4mysql-manager-0.56.tar.gz

mha4mysql-manager-0.56-0.el6.noarch.rpm  mha4mysql-node-0.56-0.el6.noarch.rpm

[root@mha-manager opt]# cd mha4mysql-manager-0.56

[root@mha-manager mha4mysql-manager-0.56]# pwd

/opt/mha4mysql-manager-0.56

[root@mha-manager mha4mysql-manager-0.56]# ls

AUTHORS  bin  COPYING  debian  inc  lib  Makefile.PL  MANIFEST  META.yml  README  rpm  samples  t  tests

[root@mha-manager mha4mysql-manager-0.56]# cp samples/conf/* /etc/masterha/

[root@mha-manager mha4mysql-manager-0.56]# cp samples/scripts/* /scripts/

[root@mha-manager mha4mysql-manager-0.56]# > /etc/masterha/masterha_default.cnf

修改mha-manager配置文件:

[root@mha-manager mha4mysql-manager-0.56]# cat /etc/masterha/app1.cnf

[server default]

manager_workdir=/masterha/app1            #指定工作目录

manager_log=/masterha/app1/manager.log    #指定日志文件

user=manager                              #指定manager管理数据库节点所使用的用户名

password=Test.123                          #对应的是上面用户的密码

ssh_user=root                              #指定配置了ssh免密登录的系统用户

repl_user=mharep                          #指定用于同步数据的用户名

repl_password=Test.123                                  #对应的是上面同步用户的 密码

ping_interval=1                                          #设置监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回应时自动进行切换

master_ip_failover_script=/scripts/master_ip_failover    # 该脚本文件请见附录(文末)

secondary_check_script= masterha_secondary_check -s 192.168.43.141 -s 192.168.43.142 -s 192.168.43.143

[server1]

hostname=192.168.43.141

port=3306

master_binlog_dir=/var/lib/mysql    #指定master保存二进制日志的路径,以便MHA可以找到master的日志

candidate_master=1                  #设置为候选master,设置该参数后,发生主从切换以后将会将此库提升为主库

[server2]

hostname=192.168.43.142

port=3306

master_binlog_dir=/var/lib/mysql

candidate_master=1                  #设置为候选master

[server3]

hostname=192.168.43.143

port=3306

master_binlog_dir=/var/lib/mysql

no_master=1                          #设置的不为备选主库

4.启动前测试

(1)验证SSH有效性

[root@mha-manager mha4mysql-manager-0.56]# masterha_check_ssh --conf=/etc/masterha/app1.cnf

Can't locate MHA/SSHCheck.pm in @INC (you may need to install the MHA::SSHCheck module) (@INC contains: /usr/local/perl/lib/site_perl/5.20.3/x86_64-linux /usr/local/perl/lib/site_perl/5.20.3 /usr/local/perl/lib/5.20.3/x86_64-linux /usr/local/perl/lib/5.20.3 .) at /usr/bin/masterha_check_ssh line 25.

BEGIN failed--compilation aborted at /usr/bin/masterha_check_ssh line 25.

[root@mha-manager mha4mysql-manager-0.56]# masterha_check_ssh --conf=/etc/masterha/app1.cnf

Thu Aug 20 05:18:17 2020 - [info] Reading default configuration from /etc/masterha/masterha_default.cnf..

Thu Aug 20 05:18:17 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..

Thu Aug 20 05:18:17 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..

Thu Aug 20 05:18:17 2020 - [info] Starting SSH connection tests..

Thu Aug 20 05:18:18 2020 - [debug]

Thu Aug 20 05:18:17 2020 - [debug]  Connecting via SSH from root@192.168.43.141(192.168.43.141:22) to root@192.168.43.142(192.168.43.142:22)..

Thu Aug 20 05:18:17 2020 - [debug]  ok.

Thu Aug 20 05:18:17 2020 - [debug]  Connecting via SSH from root@192.168.43.141(192.168.43.141:22) to root@192.168.43.143(192.168.43.143:22)..

Thu Aug 20 05:18:18 2020 - [debug]  ok.

Thu Aug 20 05:18:18 2020 - [debug]

Thu Aug 20 05:18:18 2020 - [debug]  Connecting via SSH from root@192.168.43.142(192.168.43.142:22) to root@192.168.43.141(192.168.43.141:22)..

Thu Aug 20 05:18:18 2020 - [debug]  ok.

Thu Aug 20 05:18:18 2020 - [debug]  Connecting via SSH from root@192.168.43.142(192.168.43.142:22) to root@192.168.43.143(192.168.43.143:22)..

Thu Aug 20 05:18:18 2020 - [debug]  ok.

Thu Aug 20 05:18:19 2020 - [debug]

Thu Aug 20 05:18:18 2020 - [debug]  Connecting via SSH from root@192.168.43.143(192.168.43.143:22) to root@192.168.43.141(192.168.43.141:22)..

Thu Aug 20 05:18:18 2020 - [debug]  ok.

Thu Aug 20 05:18:18 2020 - [debug]  Connecting via SSH from root@192.168.43.143(192.168.43.143:22) to root@192.168.43.142(192.168.43.142:22)..

Thu Aug 20 05:18:19 2020 - [debug]  ok.

Thu Aug 20 05:18:19 2020 - [info] All SSH connection tests passed successfully.

[root@mha-manager mha4mysql-manager-0.56]#

(2)验证集群复制的有效性(MySQL必须都启动)

[root@mha-manager mha4mysql-manager-0.56]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Thu Aug 20 15:50:32 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Aug 20 15:50:32 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..

Thu Aug 20 15:50:32 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..

Thu Aug 20 15:50:32 2020 - [info] MHA::MasterMonitor version 0.56.

Thu Aug 20 15:50:33 2020 - [info] GTID failover mode = 0

Thu Aug 20 15:50:33 2020 - [info] Dead Servers:

Thu Aug 20 15:50:33 2020 - [info] Alive Servers:

Thu Aug 20 15:50:33 2020 - [info]  192.168.43.141(192.168.43.141:3306)

Thu Aug 20 15:50:33 2020 - [info]  192.168.43.142(192.168.43.142:3306)

Thu Aug 20 15:50:33 2020 - [info]  192.168.43.143(192.168.43.143:3306)

Thu Aug 20 15:50:33 2020 - [info] Alive Slaves:

Thu Aug 20 15:50:33 2020 - [info]  192.168.43.142(192.168.43.142:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 15:50:33 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 15:50:33 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 15:50:33 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 15:50:33 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 15:50:33 2020 - [info]    Not candidate for the new Master (no_master is set)

Thu Aug 20 15:50:33 2020 - [info] Current Alive Master: 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 15:50:33 2020 - [info] Checking slave configurations..

Thu Aug 20 15:50:33 2020 - [info]  read_only=1 is not set on slave 192.168.43.142(192.168.43.142:3306).

Thu Aug 20 15:50:33 2020 - [warning]  relay_log_purge=0 is not set on slave 192.168.43.143(192.168.43.143:3306).

Thu Aug 20 15:50:33 2020 - [info] Checking replication filtering settings..

Thu Aug 20 15:50:33 2020 - [info]  binlog_do_db= , binlog_ignore_db=

Thu Aug 20 15:50:33 2020 - [info]  Replication filtering check ok.

Thu Aug 20 15:50:33 2020 - [info] GTID (with auto-pos) is not supported

Thu Aug 20 15:50:33 2020 - [info] Starting SSH connection tests..

Thu Aug 20 15:50:34 2020 - [info] All SSH connection tests passed successfully.

Thu Aug 20 15:50:34 2020 - [info] Checking MHA Node version..

Thu Aug 20 15:50:35 2020 - [info]  Version check ok.

Thu Aug 20 15:50:35 2020 - [info] Checking SSH publickey authentication settings on the current master..

Thu Aug 20 15:50:35 2020 - [info] HealthCheck: SSH to 192.168.43.141 is reachable.

Thu Aug 20 15:50:35 2020 - [info] Master MHA Node version is 0.56.

Thu Aug 20 15:50:35 2020 - [info] Checking recovery script configurations on 192.168.43.141(192.168.43.141:3306)..

Thu Aug 20 15:50:35 2020 - [info]  Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000001

Thu Aug 20 15:50:35 2020 - [info]  Connecting to root@192.168.43.141(192.168.43.141:22)..

  Creating /var/tmp if not exists..    ok.

  Checking output directory is accessible or not..

  ok.

  Binlog found at /var/lib/mysql, up to mysql-bin.000001

Thu Aug 20 15:50:35 2020 - [info] Binlog setting check done.

Thu Aug 20 15:50:35 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Aug 20 15:50:35 2020 - [info]  Executing command : apply_diff_relay_logs --command=test --slave_user='manager' --slave_host=192.168.43.142 --slave_ip=192.168.43.142 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.31-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Thu Aug 20 15:50:35 2020 - [info]  Connecting to root@192.168.43.142(192.168.43.142:22)..

  Checking slave recovery environment settings..

    Opening /var/lib/mysql/relay-log.info ... ok.

    Relay log found at /var/lib/mysql, up to relay-bin.000004

    Temporary relay log file is /var/lib/mysql/relay-bin.000004

    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

    Testing mysqlbinlog output.. done.

    Cleaning up test file(s).. done.

Thu Aug 20 15:50:35 2020 - [info]  Executing command : apply_diff_relay_logs --command=test --slave_user='manager' --slave_host=192.168.43.143 --slave_ip=192.168.43.143 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.31-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Thu Aug 20 15:50:35 2020 - [info]  Connecting to root@192.168.43.143(192.168.43.143:22)..

  Checking slave recovery environment settings..

    Opening /var/lib/mysql/relay-log.info ... ok.

    Relay log found at /var/lib/mysql, up to relay-bin.000004

    Temporary relay log file is /var/lib/mysql/relay-bin.000004

    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

    Testing mysqlbinlog output.. done.

    Cleaning up test file(s).. done.

Thu Aug 20 15:50:36 2020 - [info] Slaves settings check done.

Thu Aug 20 15:50:36 2020 - [info]

192.168.43.141(192.168.43.141:3306) (current master)

+--192.168.43.142(192.168.43.142:3306)

+--192.168.43.143(192.168.43.143:3306)

Thu Aug 20 15:50:36 2020 - [info] Checking replication health on 192.168.43.142..

Thu Aug 20 15:50:36 2020 - [info]  ok.

Thu Aug 20 15:50:36 2020 - [info] Checking replication health on 192.168.43.143..

Thu Aug 20 15:50:36 2020 - [info]  ok.

Thu Aug 20 15:50:36 2020 - [warning] master_ip_failover_script is not defined.

Thu Aug 20 15:50:36 2020 - [warning] shutdown_script is not defined.

Thu Aug 20 15:50:36 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

5.启动manager

[root@mha-manager masterha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /var/log/mha_manager.log


五、管理MHA

1.FailOver的配置

MHA管理VIP有两种方案,一种是使用Keepalived,另一种是自己写命令实现增删VIP,由于Keepalived容易受到网络波动造成VIP切换,而且无法在多实例机器上使用,所以建议写脚本管理VIP。

使用脚本管理 VIP 不会自动设置 VIP,所以先手动在 Master 设置 VIP

这里我们先在Master上设置我们的vip:192.168.43.145

ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255

[root@master opt]# ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255

[root@master opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0

    inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1

    inet6 fe80::20c:29ff:feb2:6880/64 scope link

      valid_lft forever preferred_lft forever

[root@master opt]#

2.编写master_ip_failover(见附录)

3.分析现在的拓扑结构以便后续测试的理解:

[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G'  | grep Master_Host

mysql: [Warning] Using a password on the command line interface can be insecure.

                  Master_Host: 192.168.43.141

[root@slave02 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G'  | grep Master_Host

mysql: [Warning] Using a password on the command line interface can be insecure.

                  Master_Host: 192.168.43.141

[root@slave02 opt]#

可以看出当前的拓扑结构:

Master      -> (192.168.43.141)

Slave      -> (192.168.43.142,192.168.43.143)

vip        -> 192.168.43.145(现在master上,failover后应该漂移到slave01上)

Mha-manager -> 192.168.43.144

4.开始模拟Master切换过程

(1)确保Mha-manager上的监控进程正常运行

[root@mha-manager masterha]# ps -ef | grep perl

root      2514  1604  1 18:32 pts/0    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha/app1.cnf

root      2573  1604  0 18:33 pts/0    00:00:00 grep perl

(2)查看Master上的IP

[root@master opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0

    inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1

    inet6 fe80::20c:29ff:feb2:6880/64 scope link

      valid_lft forever preferred_lft forever

(3)关闭Master上的mysql服务

[root@master opt]# /etc/init.d/mysqld stop

Stopping mysqld:                                          [  OK  ]

[root@master opt]#

(4)观察IP漂移情况

[root@master opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0

    inet6 fe80::20c:29ff:feb2:6880/64 scope link

      valid_lft forever preferred_lft forever

[root@slave01 opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0

    inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1

    inet6 fe80::20c:29ff:fefb:7918/64 scope link

      valid_lft forever preferred_lft forever

可以观察:vip已经漂移到slave01上

(5)观察Master角色的转移

[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G'  | grep Master_Host

mysql: [Warning] Using a password on the command line interface can be insecure.

[root@slave01 opt]#

[root@slave02 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G'  | grep Master_Host

mysql: [Warning] Using a password on the command line interface can be insecure.

                  Master_Host: 192.168.43.142

可以观察:Master角色迁移到slave01上

(6)分析日志输出(下面为切换的Mha-manager上的整个日志输出)

Thu Aug 20 18:37:14 2020 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)

Thu Aug 20 18:37:14 2020 - [info] Executing secondary network check script: masterha_secondary_check -s 192.168.43.141 -s 192.168.43.142 -s 192.168.43.143  --user=root  --master_host=192.168.43.141  --master_ip=192.168.43.141  --master_port=3306 --master_user=manager --master_password=Test.123 --ping_type=SELECT

Thu Aug 20 18:37:14 2020 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mysql-bin

Thu Aug 20 18:37:14 2020 - [info] HealthCheck: SSH to 192.168.43.141 is reachable.

Monitoring server 192.168.43.141 is reachable, Master is not reachable from 192.168.43.141. OK.

Monitoring server 192.168.43.142 is reachable, Master is not reachable from 192.168.43.142. OK.

Monitoring server 192.168.43.143 is reachable, Master is not reachable from 192.168.43.143. OK.

Thu Aug 20 18:37:14 2020 - [info] Master is not reachable from all other monitoring servers. Failover should start.

Thu Aug 20 18:37:15 2020 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)

Thu Aug 20 18:37:15 2020 - [warning] Connection failed 2 time(s)..

Thu Aug 20 18:37:16 2020 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)

Thu Aug 20 18:37:16 2020 - [warning] Connection failed 3 time(s)..

Thu Aug 20 18:37:17 2020 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)

Thu Aug 20 18:37:17 2020 - [warning] Connection failed 4 time(s)..

Thu Aug 20 18:37:17 2020 - [warning] Master is not reachable from health checker!

Thu Aug 20 18:37:17 2020 - [warning] Master 192.168.43.141(192.168.43.141:3306) is not reachable!

Thu Aug 20 18:37:17 2020 - [warning] SSH is reachable.

Thu Aug 20 18:37:17 2020 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..

Thu Aug 20 18:37:17 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Aug 20 18:37:17 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..

Thu Aug 20 18:37:17 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..

Thu Aug 20 18:37:18 2020 - [info] GTID failover mode = 0

Thu Aug 20 18:37:18 2020 - [info] Dead Servers:

Thu Aug 20 18:37:18 2020 - [info]  192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:18 2020 - [info] Alive Servers:

Thu Aug 20 18:37:18 2020 - [info]  192.168.43.142(192.168.43.142:3306)

Thu Aug 20 18:37:18 2020 - [info]  192.168.43.143(192.168.43.143:3306)

Thu Aug 20 18:37:18 2020 - [info] Alive Slaves:

Thu Aug 20 18:37:18 2020 - [info]  192.168.43.142(192.168.43.142:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:18 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:18 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 18:37:18 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:18 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:18 2020 - [info]    Not candidate for the new Master (no_master is set)

Thu Aug 20 18:37:18 2020 - [info] Checking slave configurations..

Thu Aug 20 18:37:18 2020 - [info]  read_only=1 is not set on slave 192.168.43.142(192.168.43.142:3306).

Thu Aug 20 18:37:18 2020 - [warning]  relay_log_purge=0 is not set on slave 192.168.43.143(192.168.43.143:3306).

Thu Aug 20 18:37:18 2020 - [info] Checking replication filtering settings..

Thu Aug 20 18:37:18 2020 - [info]  Replication filtering check ok.

Thu Aug 20 18:37:18 2020 - [info] Master is down!

Thu Aug 20 18:37:18 2020 - [info] Terminating monitoring script.

Thu Aug 20 18:37:18 2020 - [info] Got exit code 20 (Master dead).

Thu Aug 20 18:37:18 2020 - [info] MHA::MasterFailover version 0.56.

Thu Aug 20 18:37:18 2020 - [info] Starting master failover.

Thu Aug 20 18:37:18 2020 - [info]

Thu Aug 20 18:37:18 2020 - [info] * Phase 1: Configuration Check Phase..

Thu Aug 20 18:37:18 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] GTID failover mode = 0

Thu Aug 20 18:37:19 2020 - [info] Dead Servers:

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info] Checking master reachability via MySQL(double check)...

Thu Aug 20 18:37:19 2020 - [info]  ok.

Thu Aug 20 18:37:19 2020 - [info] Alive Servers:

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.142(192.168.43.142:3306)

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.143(192.168.43.143:3306)

Thu Aug 20 18:37:19 2020 - [info] Alive Slaves:

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.142(192.168.43.142:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:19 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:19 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info]    Not candidate for the new Master (no_master is set)

Thu Aug 20 18:37:19 2020 - [info] Starting Non-GTID based failover.

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] ** Phase 1: Configuration Check Phase completed.

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] * Phase 2: Dead Master Shutdown Phase..

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] Forcing shutdown so that applications never connect to the current master..

Thu Aug 20 18:37:19 2020 - [info] Executing master IP deactivation script:

Thu Aug 20 18:37:19 2020 - [info]  /scripts/master_ip_failover --orig_master_host=192.168.43.141 --orig_master_ip=192.168.43.141 --orig_master_port=3306 --command=stopssh --ssh_user=root

VIP Command: start=sudo /sbin/ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Disabling the VIP on old master: 192.168.43.141

Thu Aug 20 18:37:19 2020 - [info]  done.

Thu Aug 20 18:37:19 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.

Thu Aug 20 18:37:19 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] * Phase 3: Master Recovery Phase..

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] The latest binary log file/position on all slaves is mysql-bin.000005:154

Thu Aug 20 18:37:19 2020 - [info] Latest slaves (Slaves that received relay log files to the latest):

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.142(192.168.43.142:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:19 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:19 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info]    Not candidate for the new Master (no_master is set)

Thu Aug 20 18:37:19 2020 - [info] The oldest binary log file/position on all slaves is mysql-bin.000005:154

Thu Aug 20 18:37:19 2020 - [info] Oldest slaves:

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.142(192.168.43.142:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:19 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 18:37:19 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:19 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:19 2020 - [info]    Not candidate for the new Master (no_master is set)

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:19 2020 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..

Thu Aug 20 18:37:19 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] Fetching dead master's binary logs..

Thu Aug 20 18:37:20 2020 - [info] Executing command on the dead master 192.168.43.141(192.168.43.141:3306): save_binary_logs --command=save --start_file=mysql-bin.000005  --start_pos=154 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.43.141_3306_20200820183718.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56

  Creating /var/tmp if not exists..    ok.

Concat binary/relay logs from mysql-bin.000005 pos 154 to mysql-bin.000005 EOF into /var/tmp/saved_master_binlog_from_192.168.43.141_3306_20200820183718.binlog ..

Binlog Checksum enabled

  Dumping binlog format description event, from position 0 to 154.. ok.

  No need to dump effective binlog data from /var/lib/mysql/mysql-bin.000005 (pos starts 154, filesize 154). Skipping.

Binlog Checksum enabled

/var/tmp/saved_master_binlog_from_192.168.43.141_3306_20200820183718.binlog has no effective data events.

Event not exists.

Thu Aug 20 18:37:20 2020 - [info] Additional events were not found from the orig master. No need to save.

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] * Phase 3.3: Determining New Master Phase..

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] Finding the latest slave that has all relay logs for recovering other slaves..

Thu Aug 20 18:37:20 2020 - [info] All slaves received relay logs to the same position. No need to resync each other.

Thu Aug 20 18:37:20 2020 - [info] Searching new master from slaves..

Thu Aug 20 18:37:20 2020 - [info]  Candidate masters from the configuration file:

Thu Aug 20 18:37:20 2020 - [info]  192.168.43.142(192.168.43.142:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:20 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:20 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 18:37:20 2020 - [info]  Non-candidate masters:

Thu Aug 20 18:37:20 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:37:20 2020 - [info]    Replicating from 192.168.43.141(192.168.43.141:3306)

Thu Aug 20 18:37:20 2020 - [info]    Not candidate for the new Master (no_master is set)

Thu Aug 20 18:37:20 2020 - [info]  Searching from candidate_master slaves which have received the latest relay log events..

Thu Aug 20 18:37:20 2020 - [info] New master is 192.168.43.142(192.168.43.142:3306)

Thu Aug 20 18:37:20 2020 - [info] Starting master failover..

Thu Aug 20 18:37:20 2020 - [info]

From:

192.168.43.141(192.168.43.141:3306) (current master)

+--192.168.43.142(192.168.43.142:3306)

+--192.168.43.143(192.168.43.143:3306)

To:

192.168.43.142(192.168.43.142:3306) (new master)

+--192.168.43.143(192.168.43.143:3306)

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] * Phase 3.3: New Master Diff Log Generation Phase..

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] * Phase 3.4: Master Log Apply Phase..

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.

Thu Aug 20 18:37:20 2020 - [info] Starting recovery on 192.168.43.142(192.168.43.142:3306)..

Thu Aug 20 18:37:20 2020 - [info]  This server has all relay logs. Waiting all logs to be applied..

Thu Aug 20 18:37:20 2020 - [info]  done.

Thu Aug 20 18:37:20 2020 - [info]  All relay logs were successfully applied.

Thu Aug 20 18:37:20 2020 - [info] Getting new master's binlog name and position..

Thu Aug 20 18:37:20 2020 - [info]  mysql-bin.000002:154

Thu Aug 20 18:37:20 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.43.142', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=154, MASTER_USER='mharep', MASTER_PASSWORD='xxx';

Thu Aug 20 18:37:20 2020 - [info] Executing master IP activate script:

Thu Aug 20 18:37:20 2020 - [info]  /scripts/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.43.141 --orig_master_ip=192.168.43.141 --orig_master_port=3306 --new_master_host=192.168.43.142 --new_master_ip=192.168.43.142 --new_master_port=3306 --new_master_user='manager' --new_master_password='Test.123'

VIP Command: start=sudo /sbin/ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Set read_only=0 on the new master.

Enabling the VIP - 192.168.43.145 on the new master - 192.168.43.142

Thu Aug 20 18:37:20 2020 - [info]  OK.

Thu Aug 20 18:37:20 2020 - [info] ** Finished master recovery successfully.

Thu Aug 20 18:37:20 2020 - [info] * Phase 3: Master Recovery Phase completed.

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] * Phase 4: Slaves Recovery Phase..

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..

Thu Aug 20 18:37:20 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info] -- Slave diff file generation on host 192.168.43.143(192.168.43.143:3306) started, pid: 2783. Check tmp log /masterha/app1/192.168.43.143_3306_20200820183718.log if it takes time..

Thu Aug 20 18:37:21 2020 - [info]

Thu Aug 20 18:37:21 2020 - [info] Log messages from 192.168.43.143 ...

Thu Aug 20 18:37:21 2020 - [info]

Thu Aug 20 18:37:20 2020 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.

Thu Aug 20 18:37:21 2020 - [info] End of log messages from 192.168.43.143.

Thu Aug 20 18:37:21 2020 - [info] -- 192.168.43.143(192.168.43.143:3306) has the latest relay log events.

Thu Aug 20 18:37:21 2020 - [info] Generating relay diff files from the latest slave succeeded.

Thu Aug 20 18:37:21 2020 - [info]

Thu Aug 20 18:37:21 2020 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..

Thu Aug 20 18:37:21 2020 - [info]

Thu Aug 20 18:37:21 2020 - [info] -- Slave recovery on host 192.168.43.143(192.168.43.143:3306) started, pid: 2785. Check tmp log /masterha/app1/192.168.43.143_3306_20200820183718.log if it takes time..

Thu Aug 20 18:37:22 2020 - [info]

Thu Aug 20 18:37:22 2020 - [info] Log messages from 192.168.43.143 ...

Thu Aug 20 18:37:22 2020 - [info]

Thu Aug 20 18:37:21 2020 - [info] Starting recovery on 192.168.43.143(192.168.43.143:3306)..

Thu Aug 20 18:37:21 2020 - [info]  This server has all relay logs. Waiting all logs to be applied..

Thu Aug 20 18:37:21 2020 - [info]  done.

Thu Aug 20 18:37:21 2020 - [info]  All relay logs were successfully applied.

Thu Aug 20 18:37:21 2020 - [info]  Resetting slave 192.168.43.143(192.168.43.143:3306) and starting replication from the new master 192.168.43.142(192.168.43.142:3306)..

Thu Aug 20 18:37:21 2020 - [info]  Executed CHANGE MASTER.

Thu Aug 20 18:37:21 2020 - [info]  Slave started.

Thu Aug 20 18:37:22 2020 - [info] End of log messages from 192.168.43.143.

Thu Aug 20 18:37:22 2020 - [info] -- Slave recovery on host 192.168.43.143(192.168.43.143:3306) succeeded.

Thu Aug 20 18:37:22 2020 - [info] All new slave servers recovered successfully.

Thu Aug 20 18:37:22 2020 - [info]

Thu Aug 20 18:37:22 2020 - [info] * Phase 5: New master cleanup phase..

Thu Aug 20 18:37:22 2020 - [info]

Thu Aug 20 18:37:22 2020 - [info] Resetting slave info on the new master..

Thu Aug 20 18:37:22 2020 - [info]  192.168.43.142: Resetting slave info succeeded.

Thu Aug 20 18:37:22 2020 - [info] Master failover to 192.168.43.142(192.168.43.142:3306) completed successfully.

Thu Aug 20 18:37:22 2020 - [info]

----- Failover Report -----

app1: MySQL Master failover 192.168.43.141(192.168.43.141:3306) to 192.168.43.142(192.168.43.142:3306) succeeded

Master 192.168.43.141(192.168.43.141:3306) is down!

Check MHA Manager logs at mha-manager:/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.

Invalidated master IP address on 192.168.43.141(192.168.43.141:3306)

The latest slave 192.168.43.142(192.168.43.142:3306) has all relay logs for recovery.

Selected 192.168.43.142(192.168.43.142:3306) as a new master.

192.168.43.142(192.168.43.142:3306): OK: Applying all logs succeeded.

192.168.43.142(192.168.43.142:3306): OK: Activated master IP address.

192.168.43.143(192.168.43.143:3306): This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

192.168.43.143(192.168.43.143:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.43.142(192.168.43.142:3306)

192.168.43.142(192.168.43.142:3306): Resetting slave info succeeded.

Master failover to 192.168.43.142(192.168.43.142:3306) completed successfully.

(7)观察manager上的perl开启的监控进程是否存在?

[root@mha-manager ~]# ps -ef | grep perl

root      2813  2796  0 18:41 pts/0    00:00:00 grep perl

观察可知:MHA在切换完成后会结束 Manager 进程

5.还原操作

(1)删除锁文件

MHA每次故障切换后都会生成一个app1.failover.complete这样的文件,如果不加这个参数,需要删除这个文件才能再次启动

[root@mha-manager masterha]# rm -rf  /masterha/app1/app1.failover.complete

或者启动时添加(--ignore_last_failover 忽略上次切换):

nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /var/log/mha_manager.log --ignore_last_failover &

(2)原MASTER上从新指定slave01为新的master角色服务器

[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show master status\G;'

mysql: [Warning] Using a password on the command line interface can be insecure.

*************************** 1. row ***************************

            File: mysql-bin.000002

        Position: 154

    Binlog_Do_DB:

Binlog_Ignore_DB:

Executed_Gtid_Set:

[root@slave01 opt]#

[root@master opt]# /etc/init.d/mysqld start

Starting mysqld:                                          [  OK  ]

[root@master opt]# mysql -uroot -p'Test.123'

mysql: [Warning] Using a password on the command line interface can be insecure.

Welcome to the MySQL monitor.  Commands end with ; or \g.

Your MySQL connection id is 3

Server version: 5.7.31-log MySQL Community Server (GPL)

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

由于演示,这里就不锁库了:

mysql> change master to master_host='192.168.43.142', master_port=3306, master_user='mharep',master_password='Test.123', master_log_file = 'mysql-bin.000002', master_log_pos=154;

Query OK, 0 rows affected, 2 warnings (0.04 sec)

mysql> start slave;

Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;

*************************** 1. row ***************************

              Slave_IO_State: Waiting for master to send event

                  Master_Host: 192.168.43.142

                  Master_User: mharep

                  Master_Port: 3306

                Connect_Retry: 60

              Master_Log_File: mysql-bin.000002

          Read_Master_Log_Pos: 154

              Relay_Log_File: relay-bin.000002

                Relay_Log_Pos: 320

        Relay_Master_Log_File: mysql-bin.000002

            Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

          Replicate_Do_Table:

      Replicate_Ignore_Table:

      Replicate_Wild_Do_Table:

  Replicate_Wild_Ignore_Table:

                  Last_Errno: 0

                  Last_Error:

                Skip_Counter: 0

          Exec_Master_Log_Pos: 154

              Relay_Log_Space: 521

              Until_Condition: None

              Until_Log_File:

                Until_Log_Pos: 0

          Master_SSL_Allowed: No

          Master_SSL_CA_File:

          Master_SSL_CA_Path:

              Master_SSL_Cert:

            Master_SSL_Cipher:

              Master_SSL_Key:

        Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

                Last_IO_Errno: 0

                Last_IO_Error:

              Last_SQL_Errno: 0

              Last_SQL_Error:

  Replicate_Ignore_Server_Ids:

            Master_Server_Id: 2

                  Master_UUID: 0b529b47-e2b6-11ea-9d4e-000c29fb7918

            Master_Info_File: /var/lib/mysql/master.info

                    SQL_Delay: 0

          SQL_Remaining_Delay: NULL

      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

          Master_Retry_Count: 86400

                  Master_Bind:

      Last_IO_Error_Timestamp:

    Last_SQL_Error_Timestamp:

              Master_SSL_Crl:

          Master_SSL_Crlpath:

          Retrieved_Gtid_Set:

            Executed_Gtid_Set:

                Auto_Position: 0

        Replicate_Rewrite_DB:

                Channel_Name:

          Master_TLS_Version:

1 row in set (0.00 sec)

(3)切换MASTER角色(manager上操作)这时不要开启manager进程,即不开启:nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /var/log/mha_manager.log &

[root@mha-manager ~]# ps -ef | grep perl

root      2936  2796  0 18:51 pts/0    00:00:00 grep perl

[root@mha-manager ~]#

[root@mha-manager ~]#

[root@mha-manager ~]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.43.141 --new_master_port=3306 --orig_master_is_new_slave

# 填写: yes

Thu Aug 20 18:51:17 2020 - [info] MHA::MasterRotate version 0.56.

Thu Aug 20 18:51:17 2020 - [info] Starting online master switch..

Thu Aug 20 18:51:17 2020 - [info]

Thu Aug 20 18:51:17 2020 - [info] * Phase 1: Configuration Check Phase..

Thu Aug 20 18:51:17 2020 - [info]

Thu Aug 20 18:51:17 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Aug 20 18:51:17 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..

Thu Aug 20 18:51:17 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..

Thu Aug 20 18:51:18 2020 - [info] GTID failover mode = 0

Thu Aug 20 18:51:18 2020 - [info] Current Alive Master: 192.168.43.142(192.168.43.142:3306)

Thu Aug 20 18:51:18 2020 - [info] Alive Slaves:

Thu Aug 20 18:51:18 2020 - [info]  192.168.43.141(192.168.43.141:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:51:18 2020 - [info]    Replicating from 192.168.43.142(192.168.43.142:3306)

Thu Aug 20 18:51:18 2020 - [info]    Primary candidate for the new Master (candidate_master is set)

Thu Aug 20 18:51:18 2020 - [info]  192.168.43.143(192.168.43.143:3306)  Version=5.7.31-log (oldest major version between slaves) log-bin:enabled

Thu Aug 20 18:51:18 2020 - [info]    Replicating from 192.168.43.142(192.168.43.142:3306)

Thu Aug 20 18:51:18 2020 - [info]    Not candidate for the new Master (no_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.43.142(192.168.43.142:3306)? (YES/no): yes

Thu Aug 20 18:51:21 2020 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..

Thu Aug 20 18:51:21 2020 - [info]  ok.

Thu Aug 20 18:51:21 2020 - [info] Checking MHA is not monitoring or doing failover..

Thu Aug 20 18:51:21 2020 - [info] Checking replication health on 192.168.43.141..

Thu Aug 20 18:51:21 2020 - [info]  ok.

Thu Aug 20 18:51:21 2020 - [info] Checking replication health on 192.168.43.143..

Thu Aug 20 18:51:21 2020 - [info]  ok.

Thu Aug 20 18:51:21 2020 - [info] 192.168.43.141 can be new master.

Thu Aug 20 18:51:21 2020 - [info]

From:

192.168.43.142(192.168.43.142:3306) (current master)

+--192.168.43.141(192.168.43.141:3306)

+--192.168.43.143(192.168.43.143:3306)

To:

192.168.43.141(192.168.43.141:3306) (new master)

+--192.168.43.143(192.168.43.143:3306)

+--192.168.43.142(192.168.43.142:3306)

Starting master switch from 192.168.43.142(192.168.43.142:3306) to 192.168.43.141(192.168.43.141:3306)? (yes/NO): yes

Thu Aug 20 18:51:24 2020 - [info] Checking whether 192.168.43.141(192.168.43.141:3306) is ok for the new master..

Thu Aug 20 18:51:24 2020 - [info]  ok.

Thu Aug 20 18:51:24 2020 - [info] 192.168.43.142(192.168.43.142:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.

Thu Aug 20 18:51:24 2020 - [info] 192.168.43.142(192.168.43.142:3306): Resetting slave pointing to the dummy host.

Thu Aug 20 18:51:24 2020 - [info] ** Phase 1: Configuration Check Phase completed.

Thu Aug 20 18:51:24 2020 - [info]

Thu Aug 20 18:51:24 2020 - [info] * Phase 2: Rejecting updates Phase..

Thu Aug 20 18:51:24 2020 - [info]

master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes

Thu Aug 20 18:51:27 2020 - [info] Locking all tables on the orig master to reject updates from everybody (including root):

Thu Aug 20 18:51:27 2020 - [info] Executing FLUSH TABLES WITH READ LOCK..

Thu Aug 20 18:51:27 2020 - [info]  ok.

Thu Aug 20 18:51:27 2020 - [info] Orig master binlog:pos is mysql-bin.000002:154.

Thu Aug 20 18:51:27 2020 - [info]  Waiting to execute all relay logs on 192.168.43.141(192.168.43.141:3306)..

Thu Aug 20 18:51:27 2020 - [info]  master_pos_wait(mysql-bin.000002:154) completed on 192.168.43.141(192.168.43.141:3306). Executed 0 events.

Thu Aug 20 18:51:27 2020 - [info]  done.

Thu Aug 20 18:51:27 2020 - [info] Getting new master's binlog name and position..

Thu Aug 20 18:51:27 2020 - [info]  mysql-bin.000006:154

Thu Aug 20 18:51:27 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.43.141', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000006', MASTER_LOG_POS=154, MASTER_USER='mharep', MASTER_PASSWORD='xxx';

Thu Aug 20 18:51:27 2020 - [info]

Thu Aug 20 18:51:27 2020 - [info] * Switching slaves in parallel..

Thu Aug 20 18:51:27 2020 - [info]

Thu Aug 20 18:51:27 2020 - [info] -- Slave switch on host 192.168.43.143(192.168.43.143:3306) started, pid: 2941

Thu Aug 20 18:51:27 2020 - [info]

Thu Aug 20 18:51:28 2020 - [info] Log messages from 192.168.43.143 ...

Thu Aug 20 18:51:28 2020 - [info]

Thu Aug 20 18:51:27 2020 - [info]  Waiting to execute all relay logs on 192.168.43.143(192.168.43.143:3306)..

Thu Aug 20 18:51:27 2020 - [info]  master_pos_wait(mysql-bin.000002:154) completed on 192.168.43.143(192.168.43.143:3306). Executed 0 events.

Thu Aug 20 18:51:27 2020 - [info]  done.

Thu Aug 20 18:51:27 2020 - [info]  Resetting slave 192.168.43.143(192.168.43.143:3306) and starting replication from the new master 192.168.43.141(192.168.43.141:3306)..

Thu Aug 20 18:51:27 2020 - [info]  Executed CHANGE MASTER.

Thu Aug 20 18:51:27 2020 - [info]  Slave started.

Thu Aug 20 18:51:28 2020 - [info] End of log messages from 192.168.43.143 ...

Thu Aug 20 18:51:28 2020 - [info]

Thu Aug 20 18:51:28 2020 - [info] -- Slave switch on host 192.168.43.143(192.168.43.143:3306) succeeded.

Thu Aug 20 18:51:28 2020 - [info] Unlocking all tables on the orig master:

Thu Aug 20 18:51:28 2020 - [info] Executing UNLOCK TABLES..

Thu Aug 20 18:51:28 2020 - [info]  ok.

Thu Aug 20 18:51:28 2020 - [info] Starting orig master as a new slave..

Thu Aug 20 18:51:28 2020 - [info]  Resetting slave 192.168.43.142(192.168.43.142:3306) and starting replication from the new master 192.168.43.141(192.168.43.141:3306)..

Thu Aug 20 18:51:28 2020 - [info]  Executed CHANGE MASTER.

Thu Aug 20 18:51:28 2020 - [info]  Slave started.

Thu Aug 20 18:51:28 2020 - [info] All new slave servers switched successfully.

Thu Aug 20 18:51:28 2020 - [info]

Thu Aug 20 18:51:28 2020 - [info] * Phase 5: New master cleanup phase..

Thu Aug 20 18:51:28 2020 - [info]

Thu Aug 20 18:51:28 2020 - [info]  192.168.43.141: Resetting slave info succeeded.

Thu Aug 20 18:51:28 2020 - [info] Switching master to 192.168.43.141(192.168.43.141:3306) completed successfully.

[root@mha-manager ~]#

(4)分析转移情况

[root@slave02 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G'  | grep Master_Host

mysql: [Warning] Using a password on the command line interface can be insecure.

                  Master_Host: 192.168.43.141


[root@slave01 opt]# mysql -uroot -p'Test.123' -e 'show slave status\G'  | grep Master_Host

mysql: [Warning] Using a password on the command line interface can be insecure.

                  Master_Host: 192.168.43.141

[root@slave01 opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0

    inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1

    inet6 fe80::20c:29ff:fefb:7918/64 scope link

      valid_lft forever preferred_lft forever

结论是:数据库主从转移成功,但vip没有迁移

(5)手动修复vip问题

[root@slave01 opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0

    inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1

    inet6 fe80::20c:29ff:fefb:7918/64 scope link

      valid_lft forever preferred_lft forever

[root@slave01 opt]# ip addr delete 192.168.43.145/32 dev eth0:1

[root@slave01 opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:fb:79:18 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.142/24 brd 192.168.43.255 scope global eth0

    inet6 fe80::20c:29ff:fefb:7918/64 scope link

      valid_lft forever preferred_lft forever

[root@master opt]# ifconfig eth0:1 192.168.43.145 netmask 255.255.255.255

[root@master opt]# ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

      valid_lft forever preferred_lft forever

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

    link/ether 00:0c:29:b2:68:80 brd ff:ff:ff:ff:ff:ff

    inet 192.168.43.141/24 brd 192.168.43.255 scope global eth0

    inet 192.168.43.145/32 brd 192.168.43.145 scope global eth0:1

    inet6 fe80::20c:29ff:feb2:6880/64 scope link

      valid_lft forever preferred_lft forever


五、总结

优点:

开源,用Perl编写。

方案成熟,故障切换时,MHA会做日志补齐操作,尽可能减少数据丢失,保证数据一。

部署不需要改变现有架构。

限制:

各个节点要打通SSH信任,有一定的安全隐患。

没有Slave的高可用。

自带的脚本不足,例如虚IP配置需要自己写命令或者依赖其他软件。

需要手动清理中继日志。

总的来说,MHA是一套非常优秀而且使用比较广的高可用程序,它可以自动补齐日志使得一致性有保证,部署的时候不需要改变原有架构就可以使用。

但是使用起来还是有一点复杂的,因为MHA不接管VIP,所以要自己写脚本实现,而且只保证Master高可用,没有Slave高可用,还有就是中继日志要自己设定时任务来清理。

不管怎么说,在没有更好的方案下,MHA还是值得使用的。


参阅:

https://blog.51cto.com/14154700/2472806

https://github.com/yoshinorim/mha4mysql-manager/wiki/Installation#installing-mha-node

https://segmentfault.com/a/1190000017486693


附:master_ip_failover文件内容(vip为192.168.43.145)

[root@mha-manager masterha]# cat /scripts/master_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL => 'all';

use Getopt::Long;

use MHA::DBHelper;

my (

  $command,        $ssh_user,        $orig_master_host,

  $orig_master_ip, $orig_master_port, $new_master_host,

  $new_master_ip,  $new_master_port,  $new_master_user,

  $new_master_password

);

my $vip = '192.168.43.145';

my $key = '1';

my $ssh_start_vip = "sudo /sbin/ifconfig eth0:$key $vip netmask 255.255.255.255";

my $ssh_stop_vip = "sudo /sbin/ifconfig eth0:$key down";

GetOptions(

  'command=s'            => \$command,

  'ssh_user=s'            => \$ssh_user,

  'orig_master_host=s'    => \$orig_master_host,

  'orig_master_ip=s'      => \$orig_master_ip,

  'orig_master_port=i'    => \$orig_master_port,

  'new_master_host=s'    => \$new_master_host,

  'new_master_ip=s'      => \$new_master_ip,

  'new_master_port=i'    => \$new_master_port,

  'new_master_user=s'    => \$new_master_user,

  'new_master_password=s' => \$new_master_password,

);

exit &main();

sub main {

  print "\n\n VIP Command: start=$ssh_start_vip stop=$ssh_stop_vip\n\n";

  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

    # If you manage master ip address at global catalog database,

    # invalidate orig_master_ip here.

    my $exit_code = 1;

    eval {

      print "Disabling the VIP on old master: $orig_master_host \n";

      &stop_vip();

      # updating global catalog, etc

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

    # all arguments are passed.

    # If you manage master ip address at global catalog database,

    # activate new_master_ip here.

    # You can also grant write access (create user, set read_only=0, etc) here.

    my $exit_code = 10;

    eval {

      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master

      $new_master_handler->disable_log_bin_local();

      print "Set read_only=0 on the new master.\n";

      $new_master_handler->disable_read_only();

      $new_master_handler->disconnect();

      print "Enabling the VIP - $vip on the new master - $new_master_host \n";

      &start_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn $@;

      # If you want to continue failover, exit 10.

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

    print "Check script.. OK \n";

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

sub start_vip() {

    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

sub stop_vip() {

    return 0  unless  ($ssh_user);

    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub usage {

  print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

[root@mha-manager masterha]#

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
禁止转载,如需转载请通过简信或评论联系作者。
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 194,390评论 5 459
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 81,821评论 2 371
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 141,632评论 0 319
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 52,170评论 1 263
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 61,033评论 4 355
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 46,098评论 1 272
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 36,511评论 3 381
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 35,204评论 0 253
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 39,479评论 1 290
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 34,572评论 2 309
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 36,341评论 1 326
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 32,213评论 3 312
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 37,576评论 3 298
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 28,893评论 0 17
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 30,171评论 1 250
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 41,486评论 2 341
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 40,676评论 2 335