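ZooKeeper node failure: Unable to load database on disk

Preface

While testing on my own virtual machines today, I found that one of the three ZooKeeper nodes would not start. The details are as follows:

Failed node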

$ZK_HOME/bin/zkServer.sh start 

ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

However, jps shows no ZooKeeper process on this machine, while the other two nodes are running normally.
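
As this case shows, the STARTED message alone does not prove that the server process stayed up. The check can be scripted as a small sketch like the one below. The function name is_zk_running is made up for illustration; it only assumes that a running ZooKeeper server appears in jps output under its main class QuorumPeerMain, the class seen in the stack trace further down.

```shell
# Read `jps` output on stdin and succeed only if a ZooKeeper server JVM
# (main class QuorumPeerMain) is listed.
# is_zk_running is an illustrative name, not part of ZooKeeper.
is_zk_running() {
    grep -q 'QuorumPeerMain'
}

# Usage (not run here):
#   jps | is_zk_running && echo "ZooKeeper is up" || echo "ZooKeeper is NOT running"
```

Reading from stdin keeps the helper independent of where the jps output comes from, so it can also be fed output collected from the other nodes.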

The ZooKeeper log shows:

2021-05-10 11:34:38,908 [myid:1] - INFO  [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.380001e0ca len = 0 byte = 0
2021-05-10 11:34:38,908 [myid:1] - INFO  [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.380000f211 len = 0 byte = 0
2021-05-10 11:34:38,909 [myid:1] - INFO  [main:Util@190] - Invalid snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.3500034cd6 len = 0 byte = 0
2021-05-10 11:34:38,909 [myid:1] - INFO  [main:FileSnap@83] - Reading snapshot /opt/module/zookeeper-3.4.10/tmp/version-2/snapshot.3500021022
2021-05-10 11:34:38,917 [myid:1] - ERROR [main:QuorumPeer@648] - Unable to load database on disk
java.io.IOException: 输入/输出错误
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at org.apache.zookeeper.server.persistence.FileTxnLog$PositionInputStream.read(FileTxnLog.java:452)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:585)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:604)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:570)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:552)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:531)
    at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:358)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:140)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
    at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:601)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:591)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:164)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
2021-05-10 11:34:38,918 [myid:1] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server

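The log shows that several snapshot files on the failed machine are invalid (len = 0, meaning zero length), and that the database cannot be loaded from disk. The exception message 输入/输出错误 is the localized form of "Input/output error", and the stack trace shows it being thrown while ZooKeeper reads the transaction log (FileTxnLog) during restore.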
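
To see which snapshot files are empty without reading the log, something like the following sketch can help. The helper name find_empty_snapshots is hypothetical; the directory passed to it is assumed to be the version-2 directory from the log above.

```shell
# Print every zero-length snapshot file under the given directory.
# find_empty_snapshots is an illustrative name, not a ZooKeeper tool.
find_empty_snapshots() {
    dir="$1"
    # -size 0 matches only files that occupy zero blocks, i.e. empty files.
    find "$dir" -type f -name 'snapshot.*' -size 0
}

# Usage (not run here):
#   find_empty_snapshots "$ZK_HOME/tmp/version-2"
```

This only detects empty files. A snapshot that has content but is corrupted, or a file sitting on a failing disk, would not be reported.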

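So how do we fix it?

Solution

Because the other two machines are healthy, we can move the failed machine's ZooKeeper data directory aside as a backup and let the node copy a snapshot from one of the running nodes when it rejoins the ensemble.

Steps: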

mv $ZK_HOME/tmp/version-2 $ZK_HOME/tmp/version-2.bak

$ZK_HOME/bin/zkServer.sh start 

Use jps to check that the process has started, and check that a new snapshot has been synced into $ZK_HOME/tmp/version-2.
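
The backup and the final check can be wrapped in two small helpers, sketched below. Both function names (backup_version_dir, has_valid_snapshot) are hypothetical, and the timestamped backup name is a choice of this sketch rather than something from the steps above. It guards against one pitfall of a plain mv: if version-2.bak already exists as a directory, mv places version-2 inside it instead of renaming it.

```shell
# Move <data_dir>/version-2 aside under a timestamped name and print the
# backup path. backup_version_dir is a hypothetical helper, not a ZooKeeper tool.
backup_version_dir() {
    data_dir="$1"
    src="$data_dir/version-2"
    dest="$src.bak.$(date +%Y%m%d%H%M%S)"
    if [ ! -d "$src" ]; then
        echo "missing directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "backup target already exists: $dest" >&2
        return 1
    fi
    mv "$src" "$dest" && echo "$dest"
}

# Succeed if <data_dir>/version-2 holds at least one non-empty snapshot file.
has_valid_snapshot() {
    data_dir="$1"
    [ -d "$data_dir/version-2" ] || return 1
    # -size +0 matches files that occupy more than zero blocks (non-empty).
    found=$(find "$data_dir/version-2" -type f -name 'snapshot.*' -size +0)
    [ -n "$found" ]
}

# Usage (not run here), with the data directory used in this article:
#   backup_version_dir "$ZK_HOME/tmp" && "$ZK_HOME/bin/zkServer.sh" start
#   has_valid_snapshot "$ZK_HOME/tmp" && echo "snapshot synced"
```

Only version-2 is moved, so the myid file, which sits directly in the data directory, stays in place and the node keeps its identity when it restarts. The snapshot may take a moment to appear after startup, so has_valid_snapshot may need to be retried.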

--by 俩只猴
