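Network problems can disconnect a ZooKeeper (ZK) client from the server. If the disconnection lasts longer than sessionTimeout, the server clears the session, and the session cannot be recovered even after the connection comes back. From then on, the client can no longer communicate with the ZK server. This post shows how to restore the connection by listening for the expiry event and creating a new ZK client.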
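The problem
In our project, some servers lose their connection to the ZooKeeper server (their ephemeral nodes disappear). The client typically logs the following: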
21:16:31 [ main-SendThread(192.168.58.100:2181):4000645 ] - [ WARN ] Client session timed out, have not heard from server in 15526ms for sessionid 0x16797e426b8000e
21:16:31 [ main-SendThread(192.168.58.100:2181):4000645 ] - [ INFO ] Client session timed out, have not heard from server in 15526ms for sessionid 0x16797e426b8000e, closing socket connection and attempting reconnect
21:16:31 [ main-SendThread(192.168.58.100:2181):4000905 ] - [ INFO ] Socket connection established to 192.168.58.100/192.168.58.100:2181, initiating session
21:16:31 [ main-SendThread(192.168.58.100:2181):4000906 ] - [ WARN ] Unable to reconnect to ZooKeeper service, session 0x16797e426b8000e has expired
21:16:31 [ main-SendThread(192.168.58.100:2181):4000906 ] - [ INFO ] Unable to reconnect to ZooKeeper service, session 0x16797e426b8000e has expired, closing socket connection
21:16:31 [ main-EventThread:4000906 ] - [ INFO ] EventThread shut down for session: 0x16797e426b8000e
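Root cause analysis
A ZK client can be disconnected from the server by network jitter or similar causes. If it reconnects within sessionTimeout, the session continues and the client state is CONNECTED. If the disconnection lasts longer than sessionTimeout, however, the server cleans up the session. A client that only reconnects after that point receives a WatchedEvent whose state is Expired, and its connection to the server is closed.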
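The rule above can be written down as a tiny decision function. This is only a simplified model for illustration, not how the ZooKeeper server tracks sessions internally; the class name SessionOutcome, the Outcome enum, and the method afterReconnect are hypothetical names, and treating a disconnection of exactly sessionTimeout as "resumed" is an assumption.

```java
/** Simplified model of the session rule described above; not ZooKeeper's actual implementation. */
public class SessionOutcome {

    /** Hypothetical result type: what the client observes once the connection is back. */
    public enum Outcome { SESSION_RESUMED, SESSION_EXPIRED }

    /**
     * @param disconnectedMs   how long the client was out of contact with the server
     * @param sessionTimeoutMs the session timeout in effect for the session
     */
    public static Outcome afterReconnect(long disconnectedMs, long sessionTimeoutMs) {
        // Within the timeout the server still holds the session, so it continues.
        // Past the timeout the server has already cleaned the session up.
        return disconnectedMs <= sessionTimeoutMs
                ? Outcome.SESSION_RESUMED
                : Outcome.SESSION_EXPIRED;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 3000;
        System.out.println(afterReconnect(1200, sessionTimeoutMs));   // SESSION_RESUMED
        System.out.println(afterReconnect(15526, sessionTimeoutMs));  // SESSION_EXPIRED
    }
}
```

The second call uses the 15526 ms silence from the log above together with the 3000 ms timeout used in this post, which is the situation that ends in the Expired event.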
Solution
When the Watcher receives the Expired event, create a new ZooKeeper client, as follows:
@Override
public void process(WatchedEvent watchedEvent) {
    // An expired session cannot be resumed, so create a new client (and with it a new session).
    if (watchedEvent.getState() == Event.KeeperState.Expired) {
        try {
            zooKeeper = new ZooKeeper("192.168.58.100:2181", 3000, this);
        } catch (IOException e) {
            log.warn("fail to connect to zoo keeper", e);
        }
    }
}
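One gap in the handler above: if new ZooKeeper(...) throws an IOException, the failure is only logged and nothing tries again. Below is a minimal sketch of a retry loop with exponential backoff that could wrap the client creation. RetryingConnector and connectWithRetry are hypothetical names, not part of the ZooKeeper API, and the sketch uses only the JDK so it can be tried without a ZK server.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries a client factory until it succeeds or the attempts run out. */
public class RetryingConnector {

    public static <T> T connectWithRetry(Callable<T> factory, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return factory.call();          // e.g. create the new client here
            } catch (Exception e) {
                last = e;                       // remember the failure and try again
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);      // wait before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
                delayMs *= 2;                   // exponential backoff
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

Inside process(), the factory would be something like () -> new ZooKeeper("192.168.58.100:2181", 3000, this). Because the loop sleeps, it is probably better run on a separate thread than directly on the event thread; that choice is left open here.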
The complete code:
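import lombok.extern.slf4j.Slf4j;  // Lombok generates the "log" field
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

import java.io.IOException;

@Slf4j
public class App implements Watcher {
    // Reassigned from the event thread on expiry, so keep it volatile.
    private static volatile ZooKeeper zooKeeper = null;

    public static void main(String[] args) throws IOException, InterruptedException {
        final App app = new App();
        zooKeeper = new ZooKeeper("192.168.58.100:2181", 3000, app);
        // Keep the process alive so that session events can be observed.
        for (int i = 0; i < 15000; i++) {
            Thread.sleep(1000);
        }
    }

    @Override
    public void process(WatchedEvent watchedEvent) {
        // An expired session cannot be resumed, so create a new client (and with it a new session).
        if (watchedEvent.getState() == Event.KeeperState.Expired) {
            try {
                zooKeeper = new ZooKeeper("192.168.58.100:2181", 3000, this);
            } catch (IOException e) {
                log.warn("fail to connect to zoo keeper", e);
            }
        }
    }
}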
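A new client starts a new session, so the ephemeral nodes that disappeared with the old session do not come back on their own. One possible way to handle this is to remember what was registered and replay it once the new session is connected. The sketch below is JDK-only; EphemeralRegistry, remember, and replay are hypothetical names, and the node paths in the usage note are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper that remembers ephemeral nodes so they can be recreated in a new session. */
public class EphemeralRegistry {
    // Insertion order is kept so nodes are recreated in the order they were registered.
    private final Map<String, byte[]> nodes = new LinkedHashMap<>();

    /** Records a node that should exist for as long as this process is alive. */
    public synchronized void remember(String path, byte[] data) {
        nodes.put(path, data.clone());
    }

    /** Hands every remembered node to the creator callback and returns how many were replayed. */
    public synchronized int replay(BiConsumer<String, byte[]> creator) {
        nodes.forEach((path, data) -> creator.accept(path, data.clone()));
        return nodes.size();
    }
}
```

With a real client, the callback passed to replay would call zooKeeper.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL) and handle its checked KeeperException and InterruptedException. For example, after registry.remember("/services/app-1", data), a call to registry.replay(...) following the first SyncConnected event of the new session would recreate that node.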