Namenode HA (UnknownHostException: nameservice1)
We enabled Namenode High Availability through Cloudera Manager, using
Cloudera Manager >> HDFS >> Actions >> Enable High Availability >> selected the Standby NameNode and JournalNodes, then set the nameservice name to nameservice1.
Once the whole process completed, we deployed the client configuration.
We tested from a client machine by listing HDFS directories (hadoop fs -ls /), then manually failed over to the standby NameNode and listed the HDFS directories again (hadoop fs -ls /). This test worked perfectly.
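For reference, a failover test like this can also be driven from the shell with the haadmin tool. This is only a sketch: it needs a live HA cluster, and the NameNode IDs namenode1/namenode2 are assumptions (use whatever IDs dfs.ha.namenodes.nameservice1 defines in your deployment):

```shell
# Sketch only -- requires a running HA cluster; namenode1/namenode2 are assumed IDs.
sudo -u hdfs hdfs haadmin -getServiceState namenode1   # prints "active" or "standby"
sudo -u hdfs hdfs haadmin -getServiceState namenode2

# Manual failover from namenode1 to namenode2:
sudo -u hdfs hdfs haadmin -failover namenode1 namenode2

# Client access should keep working against the new active NameNode:
hadoop fs -ls /
```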
But when I ran a Hadoop sleep job using the following command, it failed:
$ hadoop jar /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop-0.20-mapreduce/hadoop-examples.jar sleep -m 1 -r 0
java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:448)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:128)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2308)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:87)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2342)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2324)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:980)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:974)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:974)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:948)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1410)
        at org.apache.hadoop.examples.SleepJob.run(SleepJob.java:174)
        at org.apache.hadoop.examples.SleepJob.run(SleepJob.java:237)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.SleepJob.main(SleepJob.java:165)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.net.UnknownHostException: nameservice1
        ... 37 more
I don't know why it is unable to resolve nameservice1 even after deploying the client configuration.
When I googled this issue, I found only one suggested solution:
Add the entries below to the configuration to fix the issue:

dfs.nameservices=nameservice1
dfs.ha.namenodes.nameservice1=namenode1,namenode2
dfs.namenode.rpc-address.nameservice1.namenode1=ip-10-118-137-215.ec2.internal:8020
dfs.namenode.rpc-address.nameservice1.namenode2=ip-10-12-122-210.ec2.internal:8020
dfs.client.failover.proxy.provider.nameservice1=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
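A quick way to tell whether a given client hdfs-site.xml actually carries these HA entries is to grep for the property names. This is only a sketch; the default path below is an assumption about where the deployed client config lives:

```shell
# Sketch: check a client-side hdfs-site.xml for the HA properties a client
# needs in order to resolve the logical nameservice. Pass a path as $1, or
# it falls back to the (assumed) deployed client config location.
check_ha_conf() {
  conf="$1"
  for prop in dfs.nameservices \
              dfs.ha.namenodes.nameservice1 \
              dfs.client.failover.proxy.provider.nameservice1; do
    if grep -q "$prop" "$conf" 2>/dev/null; then
      echo "present: $prop"
    else
      echo "MISSING: $prop"
    fi
  done
}

check_ha_conf "${1:-/etc/hadoop/conf/hdfs-site.xml}"
```

If any of these print MISSING for the config directory the client actually reads, the UnknownHostException on nameservice1 is expected, because the client has no mapping from the logical name to real NameNode addresses.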
My impression was that Cloudera Manager takes care of this. I checked the client for this configuration, and it was there (/var/run/cloudera-scm-agent/process/1998-deploy-client-config/hadoop-conf/hdfs-site.xml).
Here are some more details from the config files:
[11:22:37 root@datasci01.dev:~]# ls -l /etc/hadoop/conf.cloudera.*
/etc/hadoop/conf.cloudera.hdfs:
total 16
-rw-r--r-- 1 root root  943 Jul 31 09:33 core-site.xml
-rw-r--r-- 1 root root 2546 Jul 31 09:33 hadoop-env.sh
-rw-r--r-- 1 root root 1577 Jul 31 09:33 hdfs-site.xml
-rw-r--r-- 1 root root  314 Jul 31 09:33 log4j.properties

/etc/hadoop/conf.cloudera.hdfs1:
total 20
-rwxr-xr-x 1 root root  233 Sep  5  2013 container-executor.cfg
-rw-r--r-- 1 root root 1890 May 21 15:48 core-site.xml
-rw-r--r-- 1 root root 2546 May 21 15:48 hadoop-env.sh
-rw-r--r-- 1 root root 1577 May 21 15:48 hdfs-site.xml
-rw-r--r-- 1 root root  314 May 21 15:48 log4j.properties

/etc/hadoop/conf.cloudera.mapreduce:
total 20
-rw-r--r-- 1 root root 1032 Jul 31 09:33 core-site.xml
-rw-r--r-- 1 root root 2775 Jul 31 09:33 hadoop-env.sh
-rw-r--r-- 1 root root 1450 Jul 31 09:33 hdfs-site.xml
-rw-r--r-- 1 root root  314 Jul 31 09:33 log4j.properties
-rw-r--r-- 1 root root 2446 Jul 31 09:33 mapred-site.xml

/etc/hadoop/conf.cloudera.mapreduce1:
total 24
-rwxr-xr-x 1 root root  233 Sep  5  2013 container-executor.cfg
-rw-r--r-- 1 root root 1979 May 16 12:20 core-site.xml
-rw-r--r-- 1 root root 2775 May 16 12:20 hadoop-env.sh
-rw-r--r-- 1 root root 1450 May 16 12:20 hdfs-site.xml
-rw-r--r-- 1 root root  314 May 16 12:20 log4j.properties
-rw-r--r-- 1 root root 2446 May 16 12:20 mapred-site.xml
[11:23:12 root@datasci01.dev:~]#
I suspect the issue is the old configuration in /etc/hadoop/conf.cloudera.hdfs1 and /etc/hadoop/conf.cloudera.mapreduce1, but I'm not sure.
It looks like /etc/hadoop/conf/* never got updated:
# ls -l /etc/hadoop/conf/
total 24
-rwxr-xr-x 1 root root  233 Sep  5  2013 container-executor.cfg
-rw-r--r-- 1 root root 1979 May 16 12:20 core-site.xml
-rw-r--r-- 1 root root 2775 May 16 12:20 hadoop-env.sh
-rw-r--r-- 1 root root 1450 May 16 12:20 hdfs-site.xml
-rw-r--r-- 1 root root  314 May 16 12:20 log4j.properties
-rw-r--r-- 1 root root 2446 May 16 12:20 mapred-site.xml
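On CM-managed CDH hosts, /etc/hadoop/conf is normally a symlink managed by the Linux alternatives system, so a stale /etc/hadoop/conf usually means the alternative still points at an old directory. The sketch below shows how one could inspect and repoint it; the alternative name hadoop-conf and the target directory are assumptions for this setup, and the tool is named alternatives on RHEL-style hosts and update-alternatives on Debian-style ones:

```shell
# Sketch: see which config directory clients actually resolve.
ls -ld /etc/hadoop/conf 2>/dev/null || echo "/etc/hadoop/conf not present on this host"

# "hadoop-conf" is the conventional CDH alternative name (an assumption here).
ALT=alternatives
command -v "$ALT" >/dev/null 2>&1 || ALT=update-alternatives
"$ALT" --display hadoop-conf 2>/dev/null \
  || echo "no hadoop-conf alternative registered on this host"

# If it points at a stale directory, repoint it as root, e.g.:
#   alternatives --set hadoop-conf /etc/hadoop/conf.cloudera.hdfs
```

After repointing, re-check that the hdfs-site.xml in the resolved directory contains the nameservice1 HA properties.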
Does anyone have any idea about this issue?
Source: https://stackoverflow.com/questions/25062788
Best answer
After some further testing and a look at the Avro source, it seems that the toString() method on GenericRecord is implemented by GenericData.Record.toString(), which calls GenericData.toString(). The javadoc on this method states that it's supposed to provide a valid JSON representation of the record, which it mostly does.
However, it differs in its implementation from the JsonEncoder, in that the JsonEncoder makes use of the Jackson libraries, and pays closer attention to the Avro schema. The GenericRecord.toString() method simply walks the record and builds the JSON representation using a StringBuilder, and doesn't pay such close attention to the Avro schema.
This means there are cases when calling toString() will produce a JSON representation that can't be deserialized using the JSONDecoder, for example in cases where the schema contains unions.
Based on this, it looks like the toString() method is a simple and convenient way to get a human-readable representation of the record, but it is unreliable as a way to serialize data according to the schema.
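When schema-faithful JSON is the goal, one option is to dump records with Avro's own tooling rather than toString(); the avro-tools tojson command goes through the schema-aware JSON encoding. This is only a sketch, and the jar filename/version and input file names are assumptions for this environment:

```shell
# Sketch (avro-tools jar path/version and file names are assumptions):
# "tojson" reads the writer schema embedded in the data file and emits
# JSON via the schema-aware encoder, so it round-trips where toString()
# output (e.g. for unions) may not.
java -jar avro-tools-1.7.7.jar tojson records.avro > records.json

# Inspect the schema the file was written with:
java -jar avro-tools-1.7.7.jar getschema records.avro
```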