11. HDFS DataNode startup failure: FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed...


Inspect the DataNode log:

        # vi /var/log/hadoop-hdfs/hadoop-hdfs-datanode-chefserver.log

2020-06-14 01:41:15,857 INFO org.apache.hadoop.hdfs.server.common.Storage: Using 1 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=1, dataDirs=1)

2020-06-14 01:41:15,873 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoop/tmp/dfs/data/in_use.lock acquired by nodename 15373@node2.hadoop.com

2020-06-14 01:41:15,877 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/home/hadoop/tmp/dfs/data/

java.io.IOException: Incompatible clusterIDs in /home/hadoop/tmp/dfs/data: namenode clusterID = CID-c301ae20-232d-4115-a475-bd70fcec69f4; datanode clusterID = CID-d9d3ee37-5414-4f39-89ff-14ba02f7b7ec

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:779)

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:302)

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:418)

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:397)

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:575)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1570)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1530)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:354)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:674)

        at java.lang.Thread.run(Thread.java:748)

2020-06-14 01:41:15,889 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:

java.lang.Exception

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:190)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.hasBlockPoolId(BPOfferService.java:200)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shouldRetryInit(BPOfferService.java:799)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.shouldRetryInit(BPServiceActor.java:713)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:679)

        at java.lang.Thread.run(Thread.java:748)

2020-06-14 01:41:15,890 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid 61c3b7eb-d387-4d7b-93ef-927043960018) service to node1.hadoop.com/172.26.37.245:8020. Exiting.

java.io.IOException: All specified directories are failed to load.

        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:576)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1570)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1530)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:354)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:219)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:674)

        at java.lang.Thread.run(Thread.java:748)

2020-06-14 01:41:15,890 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid 61c3b7eb-d387-4d7b-93ef-927043960018) service to node1.hadoop.com/172.26.37.245:8020

2020-06-14 01:41:15,994 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:

java.lang.Exception

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:190)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.hasBlockPoolId(BPOfferService.java:200)

        at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.remove(BlockPoolManager.java:91)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1485)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:437)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:457)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:708)

        at java.lang.Thread.run(Thread.java:748)

2020-06-14 01:41:15,995 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid 61c3b7eb-d387-4d7b-93ef-927043960018)

2020-06-14 01:41:15,997 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN, trace:

java.lang.Exception

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:190)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.hasBlockPoolId(BPOfferService.java:200)

        at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:1486)

        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:437)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:457)

        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:708)

        at java.lang.Thread.run(Thread.java:748)

2020-06-14 01:41:17,998 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode

2020-06-14 01:41:18,004 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0

2020-06-14 01:41:18,009 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down DataNode at node2.hadoop.com/172.26.37.246

************************************************************/

1. System environment:

OS: CentOS Linux release 7.5.1804 (Core)

CPU: 2 cores

Memory: 1 GB

Run-as user: root

JDK version: 1.8.0_252

Hadoop version: CDH 5.16.2

2. Root cause

        Every `hdfs namenode -format` generates a fresh clusterID (and namespaceID) on the NameNode. After repeated formats, the clusterID held by the NameNode no longer matches the one recorded in each DataNode's VERSION file. On the next startup the DataNode refuses to load its existing current directory ("Incompatible clusterIDs", as in the log above), the DataNode process exits, and the cluster fails to start fully. Deleting the stale data directory (including current) on each DataNode and letting it re-register resolves the problem.
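The mismatch reported in the log can be confirmed directly by comparing the clusterID lines of the two VERSION files. A minimal sketch, assuming the DataNode path from the log and a typical `dfs.namenode.name.dir` layout (both paths are assumptions; adjust them to your configuration):

```shell
#!/bin/sh
# Compare the NameNode's and a DataNode's clusterID.
# Both paths are assumptions: the DataNode path comes from the log above,
# the NameNode path depends on your dfs.namenode.name.dir setting.
NN_VERSION=/home/hadoop/tmp/dfs/name/current/VERSION
DN_VERSION=/home/hadoop/tmp/dfs/data/current/VERSION

# VERSION is a properties file; clusterID is one "key=value" line.
nn_cid=$(grep '^clusterID=' "$NN_VERSION" 2>/dev/null | cut -d= -f2)
dn_cid=$(grep '^clusterID=' "$DN_VERSION" 2>/dev/null | cut -d= -f2)

if [ "$nn_cid" = "$dn_cid" ]; then
    echo "clusterIDs match: $nn_cid"
else
    echo "clusterID mismatch: namenode=$nn_cid datanode=$dn_cid"
fi
```

If the two IDs differ, the DataNode will log exactly the "Incompatible clusterIDs" exception shown above on every start attempt.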

3. Resolution steps

On each DataNode, delete the data directory

        # cd /home/hadoop/tmp

        # rm -rf *
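With more than a couple of DataNodes, the wipe can be driven from one host. A hedged sketch, assuming passwordless root SSH and that every node uses the same data path (the node list and path below are placeholders, not values from this cluster):

```shell
#!/bin/sh
# Assumed node list and data path -- adjust to your cluster.
NODES="node2.hadoop.com node3.hadoop.com"
DATA_DIR=/home/hadoop/tmp

for node in $NODES; do
    # ${DATA_DIR:?} aborts the command if the variable is ever empty,
    # so an unset path can never expand to "rm -rf /*".
    ssh -o BatchMode=yes root@"$node" "rm -rf ${DATA_DIR:?}/*" \
        || echo "cleanup failed on $node" >&2
done
```

`BatchMode=yes` makes ssh fail fast instead of prompting for a password, so a misconfigured node is reported rather than hanging the loop.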

On each DataNode, start the DataNode service (with an empty data directory the DataNode simply keeps retrying its handshake with the NameNode, so it is safe to start it before the format below)

        # systemctl start hadoop-hdfs-datanode.service

On the NameNode, format HDFS (warning: this wipes all existing HDFS metadata)

        # sudo -u hdfs hdfs namenode -format

On the NameNode, start and check the NameNode service

        # systemctl start hadoop-hdfs-namenode

        # systemctl status hadoop-hdfs-namenode

On the NameNode, test HDFS

        # sudo -u hdfs hadoop fs -mkdir /tmp          #### create the /tmp directory

        # sudo -u hdfs hadoop fs -chmod -R 1777 /tmp  #### set permissions (sticky bit)

        # sudo -u hdfs hadoop fs -ls /                #### list the root directory
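Once the commands above succeed, it is worth confirming that the DataNodes actually re-registered. A hedged sketch: in Hadoop 2.x / CDH 5.x, `hdfs dfsadmin -report` prints one `Name: host:port (hostname)` line per live DataNode (an assumption about the report format), so counting those lines gives the live-node total:

```shell
#!/bin/sh
# Count live DataNodes by parsing the dfsadmin report (format assumption:
# one "Name: host:port (hostname)" line per live node, as in Hadoop 2.x).
report=$(sudo -u hdfs hdfs dfsadmin -report 2>/dev/null)
live=$(printf '%s\n' "$report" | grep -c '^Name:')
echo "live datanodes: $live"
```

If the count is lower than expected, re-check the DataNode log on the missing nodes for a fresh "Incompatible clusterIDs" message.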
