EtcdRaft Source Code Analysis (Leadership Transfer)

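Sometimes a node in the cluster is better suited to be Leader than the current one. In that case we need a way to get a specific node elected. How is that implemented? Let's walk through the code together.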

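First, a change like this is always driven from the outside, by the application that uses Raft. As the saying goes, the onlooker sees most clearly.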

Client

type Node interface { 
    ...
   // TransferLeadership attempts to transfer leadership to the given transferee.
   TransferLeadership(ctx context.Context, lead, transferee uint64)
    ...
}
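To make the call more concrete, here is a minimal sketch of how such a request could be turned into a message. It assumes that the transferee id travels in the message's From field and the current leader's id in To, which matches the leader code further down (`leadTransferee := m.From`). The `Message` type and the `newTransferMsg` helper are hypothetical simplifications for illustration, not etcd's actual types.

```go
package main

import "fmt"

// Message is a simplified stand-in for a Raft message.
// Assumption: only the fields needed for this illustration are modeled.
type Message struct {
	Type string
	From uint64
	To   uint64
}

// newTransferMsg (hypothetical helper) builds the transfer request.
// The transferee goes into From, so the leader can later read the
// transfer target from m.From.
func newTransferMsg(lead, transferee uint64) Message {
	return Message{Type: "MsgTransferLeader", From: transferee, To: lead}
}

func main() {
	m := newTransferMsg(1, 3)
	fmt.Printf("%s from=%d to=%d\n", m.Type, m.From, m.To)
	// prints: MsgTransferLeader from=3 to=1
}
```

The point of the sketch is only that From does not mean "whoever sent the request" here; it names the node that should become the next leader.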

Follower

case pb.MsgTransferLeader:
   if r.lead == None {
      r.logger.Infof("%x no leader at term %d; dropping leader transfer msg", r.id, r.Term)
      return nil
   }
   m.To = r.lead
   r.send(m)

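A Follower that receives this message cannot do anything with it on its own, so it simply forwards it to the Leader. If it does not currently know of a Leader, it drops the message.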

Leader

case pb.MsgTransferLeader:
   if pr.IsLearner {
      r.logger.Debugf("%x is learner. Ignored transferring leadership", r.id)
      return nil
   }
   leadTransferee := m.From
   lastLeadTransferee := r.leadTransferee
   if lastLeadTransferee != None {
      if lastLeadTransferee == leadTransferee {
         r.logger.Infof("%x [term %d] transfer leadership to %x is in progress, ignores request to same node %x",
            r.id, r.Term, leadTransferee, leadTransferee)
         return nil
      }
      r.abortLeaderTransfer()
      r.logger.Infof("%x [term %d] abort previous transferring leadership to %x", r.id, r.Term, lastLeadTransferee)
   }
   if leadTransferee == r.id {
      r.logger.Debugf("%x is already leader. Ignored transferring leadership to self", r.id)
      return nil
   }
   // Transfer leadership to third party.
   r.logger.Infof("%x [term %d] starts to transfer leadership to %x", r.id, r.Term, leadTransferee)
   // Transfer leadership should be finished in one electionTimeout, so reset r.electionElapsed.
   r.electionElapsed = 0
   r.leadTransferee = leadTransferee
   if pr.Match == r.raftLog.lastIndex() {
      r.sendTimeoutNow(leadTransferee)
      r.logger.Infof("%x sends MsgTimeoutNow to %x immediately as %x already has up-to-date log", r.id, leadTransferee, leadTransferee)
   } else {
      r.sendAppend(leadTransferee)
   }
}
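  • A learner cannot become leader, so a request that targets a learner is ignored right away.

  • The leadTransferee stored on the node is the target of the most recent leadership transfer that is still in flight. Note that the target of the new request is read from m.From.

  • If the previous target is the same as the new one, the request is ignored, because that transfer is already in progress.

  • If they differ, the previous transfer is aborted and the latest request wins.

  • If the target is the leader itself, there is nothing to do: it is already the leader.

  • Since a transfer is about to happen, the election timer is reset to zero first. The transfer is expected to finish within one electionTimeout. If it does not, the leader does not start a new election; in etcd's tick handling it aborts the transfer and carries on as leader.

  • The target of this transfer is saved in r.leadTransferee.

  • If the target's replication progress (pr.Match) equals the leader's last log index, the leader sends it MsgTimeoutNow right away.

    func (r *raft) sendTimeoutNow(to uint64) {
       r.send(pb.Message{To: to, Type: pb.MsgTimeoutNow})
    }
    
  • If the target has not caught up yet, the leader sends MsgApp so that it catches up quickly. Once the target's append response shows that its log matches the leader's, the leader sends MsgTimeoutNow at that point.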
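The decision steps in the list above can be sketched as a small, self-contained model. This is only an illustration under simplifying assumptions: the `leader` struct, its fields and the string return values are hypothetical, and real message sending, logging and timers are left out. The order of the checks follows the excerpt shown earlier.

```go
package main

import "fmt"

// None marks "no transfer in progress" in this toy model.
const None uint64 = 0

// leader is a hypothetical, simplified model of the leader-side state.
type leader struct {
	id              uint64
	leadTransferee  uint64
	lastIndex       uint64
	electionElapsed int
	match           map[uint64]uint64 // node id -> highest replicated index
	learners        map[uint64]bool
}

// handleTransfer returns a description of what the leader would do
// for a transfer request that names target as the new leader.
func (l *leader) handleTransfer(target uint64) string {
	if l.learners[target] {
		return "ignore: learner"
	}
	if l.leadTransferee != None {
		if l.leadTransferee == target {
			return "ignore: in progress"
		}
		l.leadTransferee = None // abort the previous transfer
	}
	if target == l.id {
		return "ignore: self"
	}
	l.electionElapsed = 0 // the transfer should finish within one election timeout
	l.leadTransferee = target
	if l.match[target] == l.lastIndex {
		return "send MsgTimeoutNow"
	}
	return "send MsgApp"
}

func main() {
	l := &leader{id: 1, lastIndex: 10, match: map[uint64]uint64{2: 10, 3: 7}}
	fmt.Println(l.handleTransfer(3)) // node 3 lags behind: send MsgApp
	fmt.Println(l.handleTransfer(3)) // same target again: ignore: in progress
	fmt.Println(l.handleTransfer(2)) // aborts transfer to 3; node 2 is caught up: send MsgTimeoutNow
	fmt.Println(l.handleTransfer(1)) // target is the leader itself: ignore: self
}
```

One detail the model makes visible: because the abort happens before the "is it me?" check, a request that names the leader itself still cancels a transfer that was in progress.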

Follower

case pb.MsgTimeoutNow:
   if r.promotable() {
      r.logger.Infof("%x [term %d] received MsgTimeoutNow from %x and starts an election to get leadership.", r.id, r.Term, m.From)
      // Leadership transfers never use pre-vote even if r.preVote is true; we
      // know we are not recovering from a partition so there is no need for the
      // extra round trip.
      r.campaign(campaignTransfer)
   } else {
      r.logger.Infof("%x received MsgTimeoutNow from %x but is not promotable", r.id, m.From)
   }

Think about it: this Follower's log matches the Leader's, and it starts an election ahead of everyone else, so winning is practically effortless.
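Why is winning so easy? A rough sketch of the standard Raft voting rule helps: a voter grants its vote only if the candidate's log is at least as up to date as its own, comparing the last entry's term first and its index second. The model below is a simplification and an assumption on my part about what matters here; it ignores term bumps, the one-vote-per-term rule and network effects, and `logPos`, `upToDate` and `wins` are hypothetical names.

```go
package main

import "fmt"

// logPos is the (term, index) of the last entry in a node's log.
type logPos struct {
	term  uint64
	index uint64
}

// upToDate reports whether the candidate's log is at least as
// up to date as the voter's, following the Raft comparison rule.
func upToDate(cand, voter logPos) bool {
	return cand.term > voter.term ||
		(cand.term == voter.term && cand.index >= voter.index)
}

// wins counts the candidate's own vote plus every voter whose log
// is not ahead of the candidate's, and checks for a majority.
func wins(cand logPos, others []logPos) bool {
	votes := 1 // the candidate votes for itself
	for _, o := range others {
		if upToDate(cand, o) {
			votes++
		}
	}
	return votes > (len(others)+1)/2
}

func main() {
	// The transfer target has the same log as the old leader.
	target := logPos{term: 2, index: 10}
	others := []logPos{{term: 2, index: 10}, {term: 2, index: 8}}
	fmt.Println(wins(target, others)) // true

	// A lagging node would not get a majority against the same peers.
	lagging := logPos{term: 2, index: 7}
	fmt.Println(wins(lagging, others)) // false
}
```

Because the leader only sends MsgTimeoutNow once the target's log matches its own, no other node can have a more up-to-date log, so every reachable voter can grant the vote.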
