2.1 Exception analysis:
2021-03-19 14:32:29,708 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop1/192.168.111.111:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-19 14:32:29,718 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop2/192.168.111.112:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-19 14:32:29,735 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop3/192.168.111.113:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-19 14:32:29,737 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.111.111:8485, 192.168.111.112:8485, 192.168.111.113:8485], stream=null))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.111.111:8485: Call From hadoop1/192.168.111.111 to hadoop1:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
192.168.111.113:8485: Call From hadoop1/192.168.111.111 to hadoop3:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
192.168.111.112:8485: Call From hadoop1/192.168.111.111 to hadoop2:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
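Cause of failure:
The NameNode acts as a client of the JournalNodes and opens the connections to them. Every attempt to reach hadoop1, hadoop2 and hadoop3 on port 8485 was refused, and the NameNode kept retrying until it hit the maximum retry count (maxRetries=10 with sleepTime=1000 ms in the log, so roughly 10 seconds in total). The JournalNodes were not yet ready to accept connections by the time the NameNode had used up all of its retries. With none of the three JournalNodes reachable, the required quorum of 2 out of 3 could not be reached, so recoverUnfinalizedSegments failed with a FATAL error.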
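To confirm this diagnosis, it can help to check whether the JournalNode ports accept TCP connections at all. The following is only a minimal sketch: the helper names `is_listening` and `has_quorum` are hypothetical (they are not part of Hadoop), and a successful TCP connect only shows that something is listening on the port, not that the JournalNode is healthy.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and name resolution failures.
        return False


def has_quorum(endpoints, timeout=2.0):
    """Return True if a strict majority of the (host, port) pairs are reachable."""
    reachable = sum(is_listening(host, port, timeout) for host, port in endpoints)
    return reachable >= len(endpoints) // 2 + 1
```

For the cluster in the log above, `has_quorum([("hadoop1", 8485), ("hadoop2", 8485), ("hadoop3", 8485)])` would need to return True before a NameNode start has a chance of succeeding.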
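Fix:
Open $HADOOP_HOME/etc/hadoop/core-site.xml (for example with vi) and add the following properties inside the <configuration> element. They raise the retry count from 10 to 100 and the interval between retries from 1000 ms to 10000 ms, which gives the JournalNodes more time to come up: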
<property>
<name>ipc.client.connect.max.retries</name>
<value>100</value>
</property>
<property>
<name>ipc.client.connect.retry.interval</name>
<value>10000</value>
</property>
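As a rough check of what these values buy, the retry window can be estimated as the retry count multiplied by the retry interval. This is only a back-of-the-envelope sketch: the function name `retry_window_seconds` is hypothetical, and the estimate ignores the time spent on each connection attempt itself.

```python
def retry_window_seconds(max_retries, retry_interval_ms):
    """Approximate total time the client keeps retrying, in seconds."""
    return max_retries * retry_interval_ms / 1000.0


# Values shown in the log (maxRetries=10, sleepTime=1000 ms).
before = retry_window_seconds(10, 1000)

# Values from the core-site.xml snippet above.
after = retry_window_seconds(100, 10000)

print(f"before: {before:.0f} s, after: {after:.0f} s")
```

Under these assumptions the NameNode waits about 10 seconds with the original settings and about 1000 seconds (a little under 17 minutes) with the new ones.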