- After starting HDFS with `./start-dfs.sh`, `jps` shows no DataNode process.
- Check the DataNode log:
2018-02-27 13:54:27,918 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2018-02-27 13:54:29,140 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/hadoop/app/tmp/dfs/data/in_use.lock acquired by nodename 2873@hadoop000
2018-02-27 13:54:29,161 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /home/hadoop/app/tmp/dfs/data: namenode clusterID = CID-d92efb85-6a10-4a65-abe8-451f89eb845c; datanode clusterID = CID-06da8613-d7f4-4bfa-ac2e-55104c7a265f
2018-02-27 13:54:29,162 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to hadoop000/192.168.92.128:8020. Exiting.
java.io.IOException: All specified directories are failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1394)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1355)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:228)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
at java.lang.Thread.run(Thread.java:748)
2018-02-27 13:54:29,180 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to hadoop000/192.168.92.128:8020
2018-02-27 13:54:29,288 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2018-02-27 13:54:31,290 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2018-02-27 13:54:31,292 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2018-02-27 13:54:31,297 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at hadoop000/192.168.92.128
Key line:
2018-02-27 13:54:29,161 WARN org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: Incompatible clusterIDs in /home/hadoop/app/tmp/dfs/data: namenode clusterID = CID-d92efb85-6a10-4a65-abe8-451f89eb845c; datanode clusterID = CID-06da8613-d7f4-4bfa-ac2e-55104c7a265f
- The log shows that the DataNode's clusterID does not match the NameNode's clusterID.
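To confirm the mismatch without reading the whole log by eye, the two IDs can be pulled out of the WARN line quoted above. This is a minimal sketch: the function name `find_cluster_id_mismatch` is hypothetical, and the regular expression assumes the message layout seen in this particular log, which may differ in other Hadoop versions.

```python
import re

# Pattern modeled on the WARN line in this log only (assumption:
# other Hadoop versions may word the message differently).
MISMATCH = re.compile(
    r"Incompatible clusterIDs in (?P<path>\S+): "
    r"namenode clusterID = (?P<namenode>[\w-]+); "
    r"datanode clusterID = (?P<datanode>[\w-]+)"
)


def find_cluster_id_mismatch(log_text):
    """Return (storage path, NameNode clusterID, DataNode clusterID), or None.

    Only the first matching line in log_text is reported.
    """
    match = MISMATCH.search(log_text)
    if match is None:
        return None
    return match.group("path"), match.group("namenode"), match.group("datanode")
```

Passing the contents of the DataNode log file to `find_cluster_id_mismatch` returns the storage directory and both IDs, which is everything the fix needs.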
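Solution:
Use the path from the log, `/home/hadoop/app/tmp/dfs/data`, as a guide: the master holds the `name` directory and the slave holds the `data` directory. Copy the `clusterID` from the `VERSION` file under `name/current` into the `VERSION` file under `data/current`, overwriting the old `clusterID` so that the two match. Then restart HDFS, and the `DataNode` process comes up on the slave.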
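The manual copy described above can also be scripted. The sketch below is one possible way to do it, not an official Hadoop tool: `read_cluster_id` and `sync_cluster_id` are hypothetical helper names, and it assumes both `VERSION` files are reachable from the machine running the script (for example on a single-node setup, or after copying the NameNode's `VERSION` file over to the slave). It relies on `VERSION` being a plain `key=value` properties file with a `clusterID=` line.

```python
import re
from pathlib import Path

# Matches the clusterID entry of a key=value style VERSION file.
CLUSTER_ID_LINE = re.compile(r"^clusterID=(.*)$", re.MULTILINE)


def read_cluster_id(version_file):
    """Return the clusterID stored in a VERSION file."""
    text = Path(version_file).read_text()
    match = CLUSTER_ID_LINE.search(text)
    if match is None:
        raise ValueError(f"no clusterID entry in {version_file}")
    return match.group(1).strip()


def sync_cluster_id(namenode_version, datanode_version):
    """Overwrite the DataNode's clusterID with the NameNode's.

    A copy of the original DataNode file is saved next to it with a
    .bak suffix before anything is changed.
    """
    cluster_id = read_cluster_id(namenode_version)
    target = Path(datanode_version)
    original = target.read_text()
    if CLUSTER_ID_LINE.search(original) is None:
        raise ValueError(f"no clusterID entry in {datanode_version}")
    target.with_name(target.name + ".bak").write_text(original)
    # A function is used as the replacement so the ID is inserted literally.
    updated = CLUSTER_ID_LINE.sub(lambda _: f"clusterID={cluster_id}", original)
    target.write_text(updated)
    return cluster_id
```

With the layout from this post, the two arguments would be the `VERSION` files under `name/current` and `data/current`. Stop the DataNode before running it, then restart HDFS as described above.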
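Cause of the problem:
After DFS was formatted for the first time, Hadoop was started and used. Later the format command `hdfs namenode -format` was run again. Reformatting generates a new `clusterID` for the NameNode, while the DataNode's `clusterID` stays unchanged.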