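Every half tickTime, the leader sends a PING request to all of its followers.
On the same pass, it checks whether every follower is still in the synced state with it.
A follower counts as synced when both of the following hold:
- Its LearnerHandler thread has not exited.
- The current tick has not yet passed tickOfNextAckDeadline. By default, tickOfNextAckDeadline is the tick at which the leader last received a request from that follower, plus syncLimit.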
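The two conditions above can be sketched as a small standalone class. This is a simplified illustration, not ZooKeeper's actual LearnerHandler: the class name SyncedCheckSketch, the methods onRequestReceived and onHandlerExit, and the inclusive boundary comparison are all assumptions made for the example.

```java
// Simplified stand-in for the per-follower state a LearnerHandler keeps.
// All names except tickOfNextAckDeadline and syncLimit are hypothetical.
final class SyncedCheckSketch {
    private final int syncLimit;
    private boolean handlerAlive = true;
    private long tickOfNextAckDeadline;

    SyncedCheckSketch(int syncLimit, long currentTick) {
        this.syncLimit = syncLimit;
        this.tickOfNextAckDeadline = currentTick + syncLimit;
    }

    // A request arrived from the follower: push the deadline forward.
    synchronized void onRequestReceived(long currentTick) {
        tickOfNextAckDeadline = currentTick + syncLimit;
    }

    // The handler thread exited (for example after a socket timeout).
    synchronized void onHandlerExit() {
        handlerAlive = false;
    }

    // Both conditions must hold. The inclusive comparison is an assumption.
    synchronized boolean synced(long currentTick) {
        return handlerAlive && currentTick <= tickOfNextAckDeadline;
    }

    public static void main(String[] args) {
        SyncedCheckSketch follower = new SyncedCheckSketch(5, 10); // deadline: tick 15
        System.out.println(follower.synced(12)); // true
        System.out.println(follower.synced(16)); // false, deadline passed
        follower.onRequestReceived(16);          // deadline moves to tick 21
        System.out.println(follower.synced(20)); // true
    }
}
```

Keeping the deadline in ticks rather than wall-clock time means the check costs only one integer comparison each time the leader walks its followers.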
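When does a LearnerHandler exit? It exits when its socket times out. By default the socket timeout is self.tickTime * self.syncLimit. The socket starts out with a timeout of self.tickTime * self.initLimit, but once the initial synchronization has finished, the timeout is changed to the syncLimit-based value.
In other words, if the leader receives no request at all from a follower within the time allowed by syncLimit, that follower's LearnerHandler exits.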
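To make the two timeouts concrete, here is a minimal sketch of the arithmetic and of switching a socket from one timeout to the other. The class and method names are hypothetical, and the values tickTime=2000, initLimit=10, syncLimit=5 are example settings chosen for illustration only; Socket.setSoTimeout is the standard JDK call.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper, not ZooKeeper code: computes the two read timeouts
// described above and applies the one that matches the current phase.
final class LearnerSocketTimeoutSketch {
    // Timeout used while the follower is still doing its initial sync.
    static int initPhaseTimeoutMs(int tickTimeMs, int initLimit) {
        return tickTimeMs * initLimit;
    }

    // Timeout used once the initial sync has completed.
    static int syncedPhaseTimeoutMs(int tickTimeMs, int syncLimit) {
        return tickTimeMs * syncLimit;
    }

    // Sets SO_TIMEOUT so that a blocking read fails after the chosen interval.
    static void applyTimeout(Socket sock, boolean initialSyncDone,
                             int tickTimeMs, int initLimit, int syncLimit)
            throws SocketException {
        int timeoutMs = initialSyncDone
                ? syncedPhaseTimeoutMs(tickTimeMs, syncLimit)
                : initPhaseTimeoutMs(tickTimeMs, initLimit);
        sock.setSoTimeout(timeoutMs);
    }

    public static void main(String[] args) {
        // Example values only.
        System.out.println(initPhaseTimeoutMs(2000, 10));  // 20000
        System.out.println(syncedPhaseTimeoutMs(2000, 5)); // 10000
    }
}
```

With these example values, a follower gets 20 seconds of silence while it is catching up, but only 10 seconds once it is in sync.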
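There are two exit scenarios. The first involves PING messages: if a PING gets no response within the timeout, the LearnerHandler exits.
For the code, see the run method of the LearnerHandler class: while it waits for a message from the follower, the socket read is subject to a timeout, and once that timeout fires the LearnerHandler exits. The socket timeout is self.tickTime * syncLimit.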
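The shape of such a read loop can be sketched as follows. This is not the LearnerHandler source: HandlerLoopSketch, ExitReason, and the one-byte "packets" are hypothetical simplifications. A blocking read on a socket with SO_TIMEOUT set throws java.net.SocketTimeoutException, which is what ends the loop here; the demo simulates that with a fake stream so that it runs without a network.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketTimeoutException;

// Hypothetical, heavily simplified handler loop.
final class HandlerLoopSketch {
    enum ExitReason { READ_TIMEOUT, STREAM_CLOSED, IO_ERROR }

    private int packetsHandled = 0;

    int packetsHandled() {
        return packetsHandled;
    }

    // Treats every byte as one "packet" and stops on timeout, EOF, or error.
    ExitReason run(InputStream fromFollower) {
        try {
            while (fromFollower.read() != -1) {
                packetsHandled++;
            }
            return ExitReason.STREAM_CLOSED;
        } catch (SocketTimeoutException e) {
            // The follower stayed silent longer than the socket timeout.
            return ExitReason.READ_TIMEOUT;
        } catch (IOException e) {
            return ExitReason.IO_ERROR;
        }
    }

    public static void main(String[] args) {
        // Fake follower: sends three packets, then goes silent.
        InputStream silentAfterThree = new InputStream() {
            private int sent = 0;

            @Override
            public int read() throws IOException {
                if (sent++ < 3) {
                    return 1;
                }
                throw new SocketTimeoutException("simulated read timeout");
            }
        };
        HandlerLoopSketch handler = new HandlerLoopSketch();
        ExitReason reason = handler.run(silentAfterThree);
        System.out.println(reason + " after " + handler.packetsHandled() + " packets");
        // prints "READ_TIMEOUT after 3 packets"
    }
}
```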
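The second scenario: if the leader issues a transaction (a proposal) and gets no acknowledgement for it within the timeout, the LearnerHandler also exits. A follower has to persist a proposal to its transaction log before it sends the ACK, so if the disk ZooKeeper runs on is too slow and fsync takes too long, ZooKeeper may end up switching leaders frequently. This is controlled by the SyncLimitCheck class; the relevant code is: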
/**
 * This class controls the time that the Leader has been
 * waiting for acknowledgement of a proposal from this Learner.
 * If the time is above syncLimit, the connection will be closed.
 * It keeps track of only one proposal at a time, when the ACK for
 * that proposal arrives, it switches to the last proposal received
 * or clears the value if there is no pending proposal.
 */
private class SyncLimitCheck {
    private boolean started = false;
    private long currentZxid = 0;
    private long currentTime = 0;
    private long nextZxid = 0;
    private long nextTime = 0;

    public synchronized void start() {
        started = true;
    }

    public synchronized void updateProposal(long zxid, long time) {
        if (!started) {
            return;
        }
        if (currentTime == 0) {
            currentTime = time;
            currentZxid = zxid;
        } else {
            nextTime = time;
            nextZxid = zxid;
        }
    }

    public synchronized void updateAck(long zxid) {
        if (currentZxid == zxid) {
            currentTime = nextTime;
            currentZxid = nextZxid;
            nextTime = 0;
            nextZxid = 0;
        } else if (nextZxid == zxid) {
            LOG.warn("ACK for " + zxid + " received before ACK for " + currentZxid + "!!!!");
            nextTime = 0;
            nextZxid = 0;
        }
    }

    public synchronized boolean check(long time) {
        if (currentTime == 0) {
            return true;
        } else {
            long msDelay = (time - currentTime) / 1000000;
            return (msDelay < (leader.self.tickTime * leader.self.syncLimit));
        }
    }
};
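
To see how this check behaves when an ACK is slow, the logic can be reproduced in a standalone class and driven with hand-picked timestamps. This is a sketch under assumptions, not the ZooKeeper class: ProposalAckTimeoutSketch is a hypothetical name, the started flag and logging are omitted, tickTime and syncLimit are passed in directly, and the values 2000 and 5 are example settings. Timestamps are in nanoseconds, matching the division by 1000000 in check.

```java
// Hypothetical standalone copy of the SyncLimitCheck logic for experimenting.
final class ProposalAckTimeoutSketch {
    private final long limitMs;
    private long currentZxid = 0;
    private long currentTimeNs = 0; // 0 means "no proposal is being tracked"
    private long nextZxid = 0;
    private long nextTimeNs = 0;

    ProposalAckTimeoutSketch(int tickTimeMs, int syncLimit) {
        this.limitMs = (long) tickTimeMs * syncLimit;
    }

    // Track the oldest unacknowledged proposal; remember only the latest one after it.
    synchronized void updateProposal(long zxid, long timeNs) {
        if (currentTimeNs == 0) {
            currentTimeNs = timeNs;
            currentZxid = zxid;
        } else {
            nextTimeNs = timeNs;
            nextZxid = zxid;
        }
    }

    // On ACK of the tracked proposal, switch to the pending one (or clear).
    synchronized void updateAck(long zxid) {
        if (currentZxid == zxid) {
            currentTimeNs = nextTimeNs;
            currentZxid = nextZxid;
            nextTimeNs = 0;
            nextZxid = 0;
        } else if (nextZxid == zxid) {
            nextTimeNs = 0;
            nextZxid = 0;
        }
    }

    // True while the tracked proposal has waited less than tickTime * syncLimit.
    synchronized boolean check(long nowNs) {
        if (currentTimeNs == 0) {
            return true;
        }
        long msDelay = (nowNs - currentTimeNs) / 1_000_000;
        return msDelay < limitMs;
    }

    public static void main(String[] args) {
        ProposalAckTimeoutSketch check = new ProposalAckTimeoutSketch(2000, 5); // 10000 ms limit
        long t0 = 1_000_000_000L;                 // proposal sent at t = 1 s
        check.updateProposal(1L, t0);
        System.out.println(check.check(t0 + 9_999L * 1_000_000L));  // true
        System.out.println(check.check(t0 + 10_000L * 1_000_000L)); // false, ACK is overdue
        check.updateAck(1L);                      // the slow ACK finally arrives
        System.out.println(check.check(t0 + 10_000L * 1_000_000L)); // true again
    }
}
```

The second check is the slow-fsync case: the follower is still connected and may even be answering PINGs, but a single proposal has gone unacknowledged for the full tickTime * syncLimit window.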