主从切换
主从切换分为两种情况,一种是原master 正常关闭后,提升slave 为 新maseter;另一种是原master 没有关闭,直接提升slave 为新master,这两种方式在切换原master 为 slave时有区别
先关闭原master
关闭原master
pg_ctl -D /data/pg_data stop
提升slave
pg_ctl promote -D /data/pg_data
切换master为slave
-- 在数据目录添加standby.signal文件
touch standby.signal
-- 确保有访问新主pg_hba.conf权限
host replication replic 172.0.0.1/32 md5
-- 修改postgresql.conf 参数文件
primary_conninfo = 'host=172.0.0.2 port=5432 user=replic password=replic application_name=slave2'
#primary_slot_name = 'second_slave'
hot_standby = on
启动
pg_ctl -D /data/pg_data start
先提升slave
提升slave
pg_ctl promote -D /data/pg_data
切换master为slave
-- 在数据目录添加standby.signal文件
touch standby.signal
-- 确保有访问新主pg_hba.conf权限
host replication replic 172.0.0.1/32 md5
-- 修改postgresql.conf 参数文件
primary_conninfo = 'host=172.0.0.2 port=5432 user=replic password=replic application_name=slave2'
#primary_slot_name = 'second_slave'
hot_standby = on
启动
pg_ctl -D /data/pg_data start
修复报错
这时后会发现,slave 没有切换成功,日志里有报错
ERROR: requested starting point 0/4000000 on timeline 1 is not in this server's history
DETAIL: This server's history forked from timeline 1 at 0/3003688.
原因是主从数据不一致,需要将人工参与:
要么将数据全部删除,重新做slave,如果数据库很大的话,非常麻烦
要么借助工具pg_rewind,它会将有差异的文件或数据复制一份,对于相同的数据文件,不做改变
这里用pg_rewind 工具
要求:
1、有访问主库的权限
2、主库 full_page_writes = on
3、从库 wal_log_hints = on
4、目标数据库关闭
pg_rewind -D /data1/pg_data --source-server='host=172.0.0.2 port=5432 user=postgres password=postgres dbname=postgres' -P
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/3003688 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/3003568 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 91 MB (total source directory size is 111 MB)
93734/93734 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
这时数据目录与主库保持一致了,需要在重新设置一遍postgresql.conf参数,重建standby.signal文件,添加pg_hba.conf访问权限,最后再重启,就可以了