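Preface
I have been reading the Raft paper on and off for a long time, and etcd's raft implementation on and off for just as long. I recently came back to it and found I had forgotten many of the details again, so I am keeping some brief notes. With a poor memory and no willingness to write things down, I would be in trouble.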
progress
progress is the per-follower state that the leader maintains. There are three states: probe, replicate, and snapshot. The internal state machine transitions as follows:
                            +--------------------------------------------------------+
                            |                  send snapshot                         |
                            |                                                        |
                  +---------+----------+                                  +----------v---------+
              +--->       probe        |                                  |      snapshot      |
              |   |  max inflight = 1  <----------------------------------+  max inflight = 0  |
              |   +---------+----------+                                  +--------------------+
              |             |            1. snapshot success
              |             |               (next=snapshot.index + 1)
              |             |            2. snapshot failure
              |             |               (no change)
              |             |            3. receives msgAppResp(rej=false&&index>lastsnap.index)
              |             |               (match=m.index,next=match+1)
receives msgAppResp(rej=true)
(next=match+1)|             |
              |             |
              |             |
              |             |   receives msgAppResp(rej=false&&index>match)
              |             |   (match=m.index,next=match+1)
              |             |
              |             |
              |             |
              |   +---------v----------+
              |   |     replicate      |
              +---+  max inflight = n  |
                  +--------------------+
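To make the transitions easier to recall, here is a minimal Go sketch of the state machine as drawn. It is not etcd's actual code: the Progress type, its fields, and the method names OnAppendResponse, SendSnapshot, OnSnapshotResult and MaxInflight are all hypothetical, and only the transitions shown in the diagram are modeled (for example, how a rejection is handled while already probing is left out).

```go
package main

import "fmt"

// State is the replication state the leader tracks for one follower.
type State int

const (
	Probe State = iota
	Replicate
	Snapshot
)

func (s State) String() string {
	return [...]string{"probe", "replicate", "snapshot"}[s]
}

// Progress is a simplified, hypothetical per-follower record.
type Progress struct {
	State           State
	Match, Next     uint64
	PendingSnapshot uint64 // index of the snapshot in flight, 0 if none
}

// OnAppendResponse applies the msgAppResp transitions from the diagram.
func (p *Progress) OnAppendResponse(reject bool, index uint64) {
	switch {
	case reject && p.State == Replicate:
		// replicate -> probe: retry from just after the last known match.
		p.State, p.Next = Probe, p.Match+1
	case !reject && p.State == Probe && index > p.Match:
		// probe -> replicate: the follower's log position is now known.
		p.Match, p.Next = index, index+1
		p.State = Replicate
	case !reject && p.State == Snapshot && index > p.PendingSnapshot:
		// snapshot -> probe (case 3): the follower is already past the snapshot.
		p.Match, p.Next = index, index+1
		p.State, p.PendingSnapshot = Probe, 0
	}
}

// SendSnapshot models probe -> snapshot.
func (p *Progress) SendSnapshot(snapIndex uint64) {
	if p.State == Probe {
		p.State, p.PendingSnapshot = Snapshot, snapIndex
	}
}

// OnSnapshotResult models snapshot -> probe (cases 1 and 2).
func (p *Progress) OnSnapshotResult(ok bool) {
	if p.State != Snapshot {
		return
	}
	if ok {
		p.Next = p.PendingSnapshot + 1 // success: continue after the snapshot
	} // failure: Next is left unchanged
	p.State, p.PendingSnapshot = Probe, 0
}

// MaxInflight returns the message window per state; n is the replicate window.
func (p *Progress) MaxInflight(n int) int {
	switch p.State {
	case Probe:
		return 1
	case Snapshot:
		return 0
	default:
		return n
	}
}

func main() {
	p := &Progress{State: Probe, Next: 1}
	p.OnAppendResponse(false, 5)
	fmt.Println(p.State, p.Match, p.Next) // replicate 5 6
	p.OnAppendResponse(true, 0)
	fmt.Println(p.State, p.Match, p.Next) // probe 5 6
	p.SendSnapshot(20)
	p.OnSnapshotResult(true)
	fmt.Println(p.State, p.Match, p.Next) // probe 5 21
}
```

Keeping the window size as a function of the state mirrors the "max inflight" labels in the diagram: one message while probing, none while a snapshot is pending, and n while replicating.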
A small difference in Raft membership changes
etcd's implementation
The key invariant that membership changes happen one node at a time is preserved, but in our implementation the membership change takes effect when its entry is applied, not when it is added to the log (so the entry is committed under the old membership instead of the new)
The Raft paper's statement
Once a given server adds the latest configuration entry to its log, it uses the latest configuration in its log, regardless of whether the entry is committed.... This means that the leader will use the rules of C_old,new to determine when the log entry for C_old,new is committed
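The practical effect of the two rules can be illustrated with a small Go sketch. It is a deliberate simplification: it considers only a single-node change and asks how many acknowledgements are needed to commit the configuration entry itself. Joint consensus, where separate majorities of both the old and new configurations are required, is not modeled. The names Policy, OnApply, OnAppend, majority and quorumForConfEntry are hypothetical and do not come from etcd.

```go
package main

import "fmt"

// Policy says when a configuration entry starts to take effect.
type Policy int

const (
	OnApply  Policy = iota // etcd: effective once the entry is applied
	OnAppend               // Raft paper: effective once the entry is in the log
)

// majority returns the smallest number of voters that forms a quorum.
func majority(voters int) int {
	return voters/2 + 1
}

// quorumForConfEntry returns how many acknowledgements are needed to commit
// the configuration entry itself, for a single-node membership change.
func quorumForConfEntry(oldVoters, newVoters int, p Policy) int {
	if p == OnAppend {
		// The new configuration is already in use when the entry is replicated.
		return majority(newVoters)
	}
	// The old configuration stays in use until the entry has been applied.
	return majority(oldVoters)
}

func main() {
	// Growing a cluster from 3 voters to 4.
	fmt.Println(quorumForConfEntry(3, 4, OnApply))  // 2
	fmt.Println(quorumForConfEntry(3, 4, OnAppend)) // 3
}
```

For a change from 3 to 4 voters, the etcd rule commits the entry with 2 acknowledgements (a majority of the old 3), while the append-time rule needs 3 (a majority of the new 4).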