Fixing the Hadoop "Incompatible namespaceIDs" Error: A Worked Example

2019-03-28 13:18 | Source: Web

I suddenly found that -put could no longer upload data to HDFS and threw a pile of errors, so I ran bin/hadoop dfsadmin -report to check the cluster status:

admin@ www.linuxidc.com:/home/admin/joe.wangh/hadoop-0.19.2>bin/hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: ?%

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)


Shut down Hadoop with bin/stop-all.sh:


admin@ www.linuxidc.com:/home/admin/joe.wangh/hadoop-0.19.2>bin/stop-all.sh
stopping jobtracker
172.16.197.192: stopping tasktracker
172.16.197.193: stopping tasktracker
stopping namenode
172.16.197.193: no datanode to stop
172.16.197.192: no datanode to stop
172.16.197.191: stopping secondarynamenode


There it is: the DataNodes were never started at all. Let's look at the log on one of the DataNodes:

admin@adw2:/home/admin/joe.wangh/hadoop-0.19.2/logs>vi hadoop-admin-datanode-adw2.hst.ali.dw.alidc.net.log

************************************************************/
2010-07-21 10:12:11,987 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/admin/joe.wangh/hadoop/data/dfs.data.dir: namenode namespaceID = 898136669; datanode namespaceID = 2127444065
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:288)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:206)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1239)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1194)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1202)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1324)
......


The error says the namenode and datanode namespaceIDs do not match.
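To confirm the mismatch directly, you can compare the two VERSION files. A minimal check, using the datanode path from the log above and <dfs.name.dir> as a placeholder for your namenode metadata directory (substitute your own setting):

# on the namenode (replace <dfs.name.dir> with your own dfs.name.dir)
grep namespaceID <dfs.name.dir>/current/VERSION

# on the problematic datanode (path taken from the log above)
grep namespaceID /home/admin/joe.wangh/hadoop/data/dfs.data.dir/current/VERSION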


Two workarounds are given below; I used the second one.


Workaround 1: Start from scratch

I can testify that the following steps solve this error, but the side effects won't make you happy (they didn't make me happy either). The crude workaround I have found is to (a shell sketch of these steps follows the list):

1.    stop the cluster

2.    delete the data directory on the problematic datanode: the directory is specified by dfs.data.dir in conf/hdfs-site.xml; if you followed this tutorial, the relevant directory is /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data

3.    reformat the namenode (NOTE: all HDFS data is lost during this process!)

4.    restart the cluster
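For reference, here is a shell sketch of the four steps, run from the Hadoop home directory. The data-directory path is the one assumed by this tutorial, so substitute your own dfs.data.dir:

bin/stop-all.sh                                                 # 1. stop the cluster
rm -rf /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/*     # 2. on each problematic datanode
bin/hadoop namenode -format                                     # 3. reformat (ALL HDFS data is lost!)
bin/start-all.sh                                                # 4. restart the cluster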

If deleting all the HDFS data and starting from scratch does not sound like a good idea (it might be OK during the initial setup/testing), you might give the second approach a try.

Workaround 2: Updating namespaceID of problematic datanodes

Big thanks to Jared Stehler for the following suggestion. I have not tested it myself yet, but feel free to try it out and send me your feedback. This workaround is "minimally invasive" as you only have to edit one file on the problematic datanodes:

1.    stop the datanode

2.    edit the value of namespaceID in <dfs.data.dir>/current/VERSION to match the value of the current namenode

3.    restart the datanode

If you followed the instructions in my tutorials, the full path of the relevant file is /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/current/VERSION (background: dfs.data.dir is by default set to ${hadoop.tmp.dir}/dfs/data, and we set hadoop.tmp.dir to /usr/local/hadoop-datastore/hadoop-hadoop).

If you wonder what the contents of VERSION look like, here's one of mine:

#contents of <dfs.data.dir>/current/VERSION
namespaceID=393514426
storageID=DS-1706792599-10.10.10.1-50010-1204306713481
cTime=1215607609074
storageType=DATA_NODE
layoutVersion=-13
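Under the same path assumptions, a minimal sketch of Workaround 2 on a problematic datanode. The sed command is just one way to edit the file, and 898136669 is the namenode namespaceID reported in the error message above; use the value from your own namenode:

bin/hadoop-daemon.sh stop datanode                              # 1. stop the datanode

# 2. point the datanode's VERSION at the namenode's namespaceID
sed -i "s/^namespaceID=.*/namespaceID=898136669/" \
    /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/current/VERSION

bin/hadoop-daemon.sh start datanode                             # 3. restart the datanode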

Cause: every namenode format generates a new namespaceID, but tmp/dfs/data still contains the ID from the previous format. Formatting the namenode wipes the namenode's data but does not wipe the datanodes' data, so the datanodes fail to start. What you need to do is clear all directories under tmp before each format.
