1. NameNode format fails
18/11/16 15:05:05 WARN namenode.NameNode: Encountered exception during format:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 successful responses:
192.168.100.132:8485: false
192.168.100.133:8485: false
1 exceptions thrown:
192.168.100.131:8485: Call From hadoop01/192.168.100.128 to hadoop02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:899)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:941)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1512)
18/11/16 15:05:05 FATAL namenode.NameNode: Failed to start namenode.
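Cause: the JournalNode was not running on one of the hosts, so the connection to port 8485 was refused and the NameNode could not check whether the JournalNodes were ready for formatting. Start the JournalNode on every JournalNode host, then run the format again: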
hadoop-daemon.sh start journalnode
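To see which JournalNodes need to be started, the failing addresses can be pulled out of the format log. The following is a minimal sketch, not part of Hadoop: the function name `unreachable_journalnodes` is made up for illustration, and it assumes the log lines look like the "Call From ... failed on connection exception" line shown above.

```shell
# Print the host:port of every JournalNode that refused the connection.
# Reads a NameNode format log on stdin and matches lines such as:
#   192.168.100.131:8485: Call From ... failed on connection exception: ...
# (hypothetical helper; the line format is assumed from the log above)
unreachable_journalnodes() {
  grep 'failed on connection exception' \
    | sed -E 's/^[[:space:]]*([^[:space:]]+:[0-9]+): Call From.*/\1/' \
    | sort -u
}
```

If the format output was saved to a file (say format.log, a name chosen here only as an example), `unreachable_journalnodes < format.log` lists the addresses; run `hadoop-daemon.sh start journalnode` on each of those hosts before formatting again.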
2. ZooKeeper fails to start
Problem: after starting ZooKeeper with zkServer.sh start, checking its status shows that it did not actually start.
ZooKeeper JMX enabled by default
Using config: /opt/software/zookeeper-3.4.10/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
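Most likely a ZooKeeper process from the previous run was not killed cleanly and is still holding the port.
Check which process is listening: netstat -ntpl
Kill the process that occupies port 2181, then start ZooKeeper again.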
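The lookup step can be scripted. The sketch below is only an illustration: `pid_on_port` is a hypothetical helper name, and it assumes the usual Linux `netstat -ntpl` column layout (local address in column 4, state in column 6, PID/program in column 7).

```shell
# Print the PID(s) listening on the given TCP port.
# Reads `netstat -ntpl` output on stdin (hypothetical helper).
pid_on_port() {
  local port="$1"
  awk -v port="$port" '
    # keep listening sockets whose local address ends in ":<port>"
    $6 == "LISTEN" && $4 ~ (":" port "$") {
      split($7, parts, "/")                      # "4321/java" -> "4321"
      if (parts[1] ~ /^[0-9]+$/) print parts[1]  # skip "-" (PID hidden)
    }' | sort -u
}
```

Typical use would be `netstat -ntpl | pid_on_port 2181`, followed by `kill <pid>` and `zkServer.sh start`. When netstat is not run as root, the PID column may show "-" and the function prints nothing.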
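3. NameNode fails to start
Problem: after a power outage the cluster had to be restarted. One NameNode failed to start while the other started normally. The log of the failed node showed the following errors: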
java.io.IOException: Premature EOF from inputStream
java.io.IOException: Failed to load an FSImage file!
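Cause: the power loss left the fsimage file on NameNode A missing or incomplete, so the NameNode could not load it and failed to start.
Solution: on the live NameNode B, locate the XX/dfs/name directory and copy its entire current directory to NameNode A. Then restart the cluster.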
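One cautious way to do the swap on NameNode A is to keep the damaged directory as a backup instead of overwriting it. This is a sketch under assumptions, not a Hadoop tool: `restore_current` is a hypothetical helper, it expects the copy from NameNode B to have been transferred already (for example with scp -r) into a staging directory, and it assumes NameNode A is stopped while it runs.

```shell
# Swap a copy of the healthy "current" directory into place on NameNode A,
# keeping the damaged one as a timestamped backup (hypothetical helper).
#   $1: NameNode storage directory on node A (the XX/dfs/name directory)
#   $2: staging directory holding the copy of node B's "current" directory
restore_current() {
  local name_dir="$1"
  local staged="$2"
  # A NameNode "current" directory always contains a VERSION file;
  # refuse to continue if the staged copy does not look like one.
  [ -f "$staged/VERSION" ] || { echo "no VERSION file in $staged" >&2; return 1; }
  if [ -d "$name_dir/current" ]; then
    mv "$name_dir/current" "$name_dir/current.bak.$(date +%Y%m%d%H%M%S)" || return 1
  fi
  mv "$staged" "$name_dir/current"
}
```

After the swap, check that the files are owned by the user that runs the NameNode, then restart the cluster as described above.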