Redis Lesson 22: Cluster Failover

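I. Failure Detection

Nodes detect failures by exchanging ping/pong messages; no Sentinel is needed. Besides slot ownership (see the earlier chapters), ping/pong messages also carry master/replica status and node failure information.

1. Subjective offline

Definition: a single node considers another node unavailable. This is only that node's own "biased" view.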
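
The definition above can be sketched as a timeout check. This is a minimal illustration, not Redis source code: `PeerState` and `update_pfail` are hypothetical names, and the threshold is assumed to be `cluster-node-timeout` (15000 ms by default).

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """What one node remembers about one peer (hypothetical structure)."""
    last_pong_ms: int      # time the last pong from this peer arrived
    pfail: bool = False    # this node's own, subjective verdict


def update_pfail(peer: PeerState, now_ms: int, node_timeout_ms: int = 15000) -> bool:
    """Flag the peer as subjectively offline once it has been silent too long."""
    peer.pfail = (now_ms - peer.last_pong_ms) > node_timeout_ms
    return peer.pfail


peer = PeerState(last_pong_ms=0)
print(update_pfail(peer, now_ms=10_000))  # False: still within the timeout
print(update_pfail(peer, now_ms=16_000))  # True: silent for more than 15 s
```

The verdict is local to the node that computed it; other nodes only learn about it through later ping/pong messages.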

[Figure: subjective offline flow]

2. Objective offline

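Definition: a node is objectively offline once more than half of the slot-holding masters have marked it subjectively offline.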

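[Figure: objective offline logic flow]

When the attempt to mark a node objectively offline succeeds:
  • all nodes in the cluster are told to mark the failed node as objectively offline
  • the failed node's replicas are told to start the failover process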
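
The majority rule can be written down in a few lines. This is a sketch under assumptions: `is_objectively_offline` is a hypothetical helper, nodes are identified by port for readability, and the failed master is assumed to still count toward the number of slot-holding masters.

```python
from typing import Set


def is_objectively_offline(reporters: Set[str], slot_masters: Set[str]) -> bool:
    """True when more than half of the slot-holding masters report the node as down.

    Reports from replicas or from masters without slots are ignored.
    """
    valid_reports = reporters & slot_masters
    return len(valid_reports) > len(slot_masters) // 2


# The three-master cluster used later in this lesson; 7002 is the failed node.
masters = {"7000", "7001", "7002"}
print(is_objectively_offline({"7000"}, masters))          # False: 1 of 3
print(is_objectively_offline({"7000", "7001"}, masters))  # True: 2 of 3
print(is_objectively_offline({"7003", "7004"}, masters))  # False: replicas do not count
```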


II. Failure Recovery

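1. Eligibility check
  • Each replica checks how long it has been disconnected from the failed master.
  • A replica disconnected for longer than (cluster-node-timeout * cluster-slave-validity-factor) is disqualified.
  • cluster-slave-validity-factor defaults to 10 (since Redis 5 the option is named cluster-replica-validity-factor; the old name still works as an alias).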
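
As a worked example of the check above, assuming the default cluster-node-timeout of 15000 ms and the default factor of 10, the cutoff is 150 seconds. `is_eligible` is a hypothetical helper that follows the simplified formula in the text, not the exact Redis implementation.

```python
def is_eligible(disconnect_ms: int,
                node_timeout_ms: int = 15000,
                validity_factor: int = 10) -> bool:
    """A replica stays eligible while its disconnect time is within the cutoff."""
    return disconnect_ms <= node_timeout_ms * validity_factor


print(15000 * 10)            # 150000 ms, i.e. 150 s with the defaults
print(is_eligible(17_000))   # True: disconnected for 17 s
print(is_eligible(200_000))  # False: disconnected for 200 s
```
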
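2. Election delay
[Figure: election delay]

  Each eligible replica delays the start of its election according to its replication offset. A larger offset means the replica's data is closer to the failed master's, so it gets a shorter delay, reaches its election time first, and is therefore more likely to collect the most votes and become the new master. This keeps data loss as small as possible.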
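
The delay can be sketched with the formula given in the Redis Cluster specification: a fixed 500 ms, a random part below 500 ms, and 1000 ms per rank, where rank 0 is the replica with the largest offset. `election_delay_ms` is a hypothetical helper name; the real implementation may apply further adjustments.

```python
import random


def election_delay_ms(rank: int, rng: random.Random) -> int:
    """Delay before a replica starts its election; rank 0 has the largest offset."""
    fixed_part = 500                 # lets the FAIL state propagate first
    random_part = rng.randrange(500) # 0..499 ms, avoids simultaneous elections
    return fixed_part + random_part + rank * 1000


rng = random.Random()
for rank in range(3):
    print(rank, election_delay_ms(rank, rng))
```

With this formula a rank 0 replica always waits between 500 and 999 ms, which is consistent with the "delayed for 925 milliseconds (rank #0 ...)" line in the log later in this lesson.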

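3. Election voting
[Figure: election voting]

4. Replacing the master

1) The replica stops replicating and becomes a master (the equivalent of slaveof no one).
2) It runs clusterDelSlot to revoke the slots owned by the failed master, then clusterAddSlot to assign those slots to itself.
3) It broadcasts its own pong message to the cluster, announcing that it has replaced the failed master.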
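
The slot takeover in the replace-master steps can be pictured as rewriting a slot-to-owner table. This is only a toy model: `take_over_slots` is a hypothetical function, nodes are identified by port, and the slot ranges are the ones shown by `cluster slots` in the drill later in this lesson.

```python
from typing import Dict, List


def take_over_slots(slot_owner: Dict[int, str], failed: str, new_master: str) -> List[int]:
    """Reassign every slot owned by the failed master to the promoted replica."""
    moved = [slot for slot, owner in slot_owner.items() if owner == failed]
    for slot in moved:
        slot_owner[slot] = new_master  # "del slot" from old owner, "add slot" to new one
    return moved


# Slot layout before the failover: 7000, 7001 and 7002 are the masters.
slot_owner = {}
for slot in range(0, 5461):
    slot_owner[slot] = "7000"
for slot in range(5461, 10923):
    slot_owner[slot] = "7001"
for slot in range(10923, 16384):
    slot_owner[slot] = "7002"

moved = take_over_slots(slot_owner, failed="7002", new_master="7005")
print(len(moved), moved[0], moved[-1])  # 5461 10923 16383
```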


III. Failover in Practice

[Figure: failover drill walkthrough]

1) Kill a master node

# Query cluster node information
$ redis-cli --cluster info localhost:7000
localhost:7000 (1ac9fbbf...) -> 2 keys | 5461 slots | 1 slaves.
127.0.0.1:7001 (a3c0d3b4...) -> 2 keys | 5462 slots | 1 slaves.
127.0.0.1:7002 (a89a427b...) -> 1 keys | 5461 slots | 1 slaves. # the master about to be killed
# Look up the node's process ID
$ redis-cli -p 7002 info Server | grep process_id
process_id:4386
# Kill it; a client program querying the cluster in a loop reports errors, then recovers on its own after a while
$ kill 4386
$ redis-cli --cluster info localhost:7000
Could not connect to Redis at 127.0.0.1:7002: Connection refused
localhost:7000 (1ac9fbbf...) -> 2 keys | 5461 slots | 1 slaves.
127.0.0.1:7005 (09792d31...) -> 1 keys | 5461 slots | 0 slaves. # new master
127.0.0.1:7001 (a3c0d3b4...) -> 2 keys | 5462 slots | 1 slaves.
$ redis-cli -p 7000 cluster slots
1) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 7005
      3) "09792d31e728ad714a5a90bc7639f277d817fb4e"
2) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "127.0.0.1"
      2) (integer) 7001
      3) "a3c0d3b42da023dc402faf439d4f93a1cb44d402"
   4) 1) "127.0.0.1"
      2) (integer) 7004
      3) "5a4f085dee8400093f45ce2cfa42cbd206167f73"
3) 1) (integer) 0
   2) (integer) 5460
   3) 1) "127.0.0.1"
      2) (integer) 7000
      3) "1ac9fbbfe11362e151204132e3d110b18139a1d9"
   4) 1) "127.0.0.1"
      2) (integer) 7003
      3) "2d19dda2a8a790d5636a664fe3ed54aa3dd7677c"
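
After the failover the new master 7005 has no replica, so a second failure in that shard would not be recoverable automatically. A small sketch for spotting this in the `redis-cli --cluster info` output shown above: `masters_without_replicas` is a hypothetical helper, and the regular expression assumes the line format printed by the Redis 5.0.4 tool in this lesson, which other versions may word differently.

```python
import re
from typing import List

# Matches lines such as:
# 127.0.0.1:7005 (09792d31...) -> 1 keys | 5461 slots | 0 slaves.
LINE_RE = re.compile(
    r"^(?P<addr>\S+) \((?P<id>[0-9a-f]+)\.\.\.\) -> (?P<keys>\d+) keys \| "
    r"(?P<slots>\d+) slots \| (?P<replicas>\d+) slaves\."
)


def masters_without_replicas(output: str) -> List[str]:
    """Return the addresses of masters that report zero replicas."""
    result = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and int(match.group("replicas")) == 0:
            result.append(match.group("addr"))
    return result


sample = """\
Could not connect to Redis at 127.0.0.1:7002: Connection refused
localhost:7000 (1ac9fbbf...) -> 2 keys | 5461 slots | 1 slaves.
127.0.0.1:7005 (09792d31...) -> 1 keys | 5461 slots | 0 slaves.
127.0.0.1:7001 (a3c0d3b4...) -> 2 keys | 5462 slots | 1 slaves.
"""
print(masters_without_replicas(sample))  # ['127.0.0.1:7005']
```
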
2) Log of the newly promoted master: redis-cluster-7005.log (formerly the replica of the killed master)
4394:S 31 May 2019 12:09:53.401 # Connection with master lost.
4394:S 31 May 2019 12:09:53.404 * Caching the disconnected master state.
4394:S 31 May 2019 12:09:53.971 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:09:53.972 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:09:53.973 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:09:54.987 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:09:54.988 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:09:54.989 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:09:56.000 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:09:56.001 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:09:56.002 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:09:57.010 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:09:57.011 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:09:57.012 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:09:58.025 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:09:58.026 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:09:58.027 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:09:59.038 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:09:59.039 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:09:59.040 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:00.051 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:00.051 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:00.053 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:01.063 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:01.064 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:01.065 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:02.076 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:02.077 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:02.078 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:03.089 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:03.090 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:03.091 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:04.099 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:04.100 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:04.101 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:05.111 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:05.111 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:05.112 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:06.121 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:06.121 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:06.122 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:07.135 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:07.136 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:07.137 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:08.149 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:08.149 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:08.150 # Error condition on socket for SYNC: Connection refused
4394:S 31 May 2019 12:10:09.157 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:09.158 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:09.159 # Error condition on socket for SYNC: Connection refused
# FAIL message received from 7001 (a3c0d3b4...): 7002 (a89a427b...) is now objectively offline
4394:S 31 May 2019 12:10:09.532 * FAIL message received from a3c0d3b42da023dc402faf439d4f93a1cb44d402 about a89a427b5fe8b2b0ef07ac8c6252dc3c8efa1f77
4394:S 31 May 2019 12:10:09.565 # Start of election delayed for 925 milliseconds (rank #0, offset 249926).
4394:S 31 May 2019 12:10:10.173 * Connecting to MASTER 127.0.0.1:7002
4394:S 31 May 2019 12:10:10.173 * MASTER <-> REPLICA sync started
4394:S 31 May 2019 12:10:10.174 # Error condition on socket for SYNC: Connection refused
# A new election starts
4394:S 31 May 2019 12:10:10.578 # Starting a failover election for epoch 13.
# Election won: this node is the new master
4394:S 31 May 2019 12:10:10.591 # Failover election won: I'm the new master.
4394:S 31 May 2019 12:10:10.591 # configEpoch set to 13 after successful failover
4394:M 31 May 2019 12:10:10.592 # Setting secondary replication ID to 27803313625ab7581c806b2a8343d1aff567354b, valid up to offset: 249927. New replication ID is 7083e19600c686aece101102f81bede77a55e6dc
4394:M 31 May 2019 12:10:10.593 * Discarding previously cached master state.

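Failover time = subjective offline time + objective offline time + election time

  In this run it took a little under 20 seconds. If you cannot tolerate that, you can lower cluster-node-timeout. However, this parameter also determines how often messages are exchanged between nodes, so a smaller value means more frequent ping/pong traffic and higher bandwidth consumption. In practice the value is chosen by weighing these factors against your actual situation.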
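
The "a little under 20 seconds" figure can be checked against the timestamps in the redis-cluster-7005.log excerpt above. The sketch below only does the arithmetic on three log lines; the variable names are illustrative, and because the replica's log does not separate the subjective and objective phases, they are lumped together as "detection".

```python
from datetime import datetime

# Redis log timestamp layout; %b assumes an English locale for month names.
FMT = "%d %b %Y %H:%M:%S.%f"


def ts(text: str) -> datetime:
    return datetime.strptime(text, FMT)


master_lost = ts("31 May 2019 12:09:53.401")    # "Connection with master lost."
fail_received = ts("31 May 2019 12:10:09.532")  # "FAIL message received ..."
election_won = ts("31 May 2019 12:10:10.591")   # "Failover election won ..."

detection = (fail_received - master_lost).total_seconds()
election = (election_won - fail_received).total_seconds()
total = (election_won - master_lost).total_seconds()
print(f"detection {detection:.3f}s, election {election:.3f}s, total {total:.3f}s")
# detection 16.131s, election 1.059s, total 17.190s
```

Almost all of the time is spent waiting for the failure to be detected and agreed on, which is why the node timeout dominates the failover time.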

3) Restart the killed master
$ redis-server ../etc/cluster/redis-7002.conf

# The killed node 7002 is now a replica of 7005
$ redis-cli -p 7000 cluster slots
1) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 7005
      3) "09792d31e728ad714a5a90bc7639f277d817fb4e"
   4) 1) "127.0.0.1"
      2) (integer) 7002
      3) "a89a427b5fe8b2b0ef07ac8c6252dc3c8efa1f77"
2) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "127.0.0.1"
      2) (integer) 7001
      3) "a3c0d3b42da023dc402faf439d4f93a1cb44d402"
   4) 1) "127.0.0.1"
      2) (integer) 7004
      3) "5a4f085dee8400093f45ce2cfa42cbd206167f73"
3) 1) (integer) 0
   2) (integer) 5460
   3) 1) "127.0.0.1"
      2) (integer) 7000
      3) "1ac9fbbfe11362e151204132e3d110b18139a1d9"
   4) 1) "127.0.0.1"
      2) (integer) 7003
      3) "2d19dda2a8a790d5636a664fe3ed54aa3dd7677c"

redis-cluster-7002.log

$ tail -30 redis-cluster-7002.log
28746:C 31 May 2019 20:53:03.405 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
28746:C 31 May 2019 20:53:03.407 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=28746, just started
28746:C 31 May 2019 20:53:03.407 # Configuration loaded
28747:M 31 May 2019 20:53:03.410 * Increased maximum number of open files to 10032 (it was originally set to 256).
28747:M 31 May 2019 20:53:03.412 * Node configuration loaded, I'm a89a427b5fe8b2b0ef07ac8c6252dc3c8efa1f77
28747:M 31 May 2019 20:53:03.413 * Running mode=cluster, port=7002.
28747:M 31 May 2019 20:53:03.414 # Server initialized
28747:M 31 May 2019 20:53:03.415 * DB loaded from disk: 0.001 seconds
28747:M 31 May 2019 20:53:03.416 * Ready to accept connections

# Reconfigures itself as a replica of node 09792d31... (7005)
28747:M 31 May 2019 20:53:03.419 # Configuration change detected. Reconfiguring myself as a replica of 09792d31e728ad714a5a90bc7639f277d817fb4e
28747:S 31 May 2019 20:53:03.419 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
28747:S 31 May 2019 20:53:03.420 # Cluster state changed: ok
# Connects to master 7005
28747:S 31 May 2019 20:53:04.430 * Connecting to MASTER 127.0.0.1:7005
# Starts master-replica data synchronization
28747:S 31 May 2019 20:53:04.431 * MASTER <-> REPLICA sync started
28747:S 31 May 2019 20:53:04.431 * Non blocking connect for SYNC fired the event.
28747:S 31 May 2019 20:53:04.432 * Master replied to PING, replication can continue...
28747:S 31 May 2019 20:53:04.433 * Trying a partial resynchronization (request 8931dcb4de60e18b8f9835b25f828cebf564c1cf:1).
28747:S 31 May 2019 20:53:04.441 * Full resync from master: 7318c71d3e107b0896c561f9f1c5294d43619178:249926
28747:S 31 May 2019 20:53:04.441 * Discarding previously cached master state.
28747:S 31 May 2019 20:53:04.513 * MASTER <-> REPLICA sync: receiving 192 bytes from master
28747:S 31 May 2019 20:53:04.515 * MASTER <-> REPLICA sync: Flushing old data
28747:S 31 May 2019 20:53:04.516 * MASTER <-> REPLICA sync: Loading DB in memory
28747:S 31 May 2019 20:53:04.516 * MASTER <-> REPLICA sync: Finished with success

redis-cluster-7005.log

4394:M 31 May 2019 13:10:11.573 * Replication backlog freed after 3600 seconds without connected replicas.
4394:M 31 May 2019 20:53:03.500 * Clear FAIL state for node a89a427b5fe8b2b0ef07ac8c6252dc3c8efa1f77: master without slots is reachable again.
4394:M 31 May 2019 20:53:04.434 * Replica 127.0.0.1:7002 asks for synchronization
4394:M 31 May 2019 20:53:04.434 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '8931dcb4de60e18b8f9835b25f828cebf564c1cf', my replication IDs are 'd1547b3a6d4eb61969a5cd19f55f907e2f18b10c' and '0000000000000000000000000000000000000000')
4394:M 31 May 2019 20:53:04.436 * Starting BGSAVE for SYNC with target: disk
4394:M 31 May 2019 20:53:04.440 * Background saving started by pid 28748
28748:C 31 May 2019 20:53:04.448 * DB saved on disk
4394:M 31 May 2019 20:53:04.511 * Background saving terminated with success
4394:M 31 May 2019 20:53:04.513 * Synchronization with replica 127.0.0.1:7002 succeeded