BigInsights on cloud - Class org.apache.oozie.action.hadoop.SparkMain not found

I'm trying to run the Spark Oozie example on the oozie_spark branch against a BigInsights for Apache Hadoop basic cluster.

The workflow.xml looks like this:

<workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkWordCount'>
 <start to='spark-node' />
  <action name='spark-node'>
   <spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>${master}</master>
    <name>Spark-Wordcount</name>
    <class>org.apache.spark.examples.WordCount</class>
    <jar>${hdfsSparkAssyJar},${hdfsWordCountJar}</jar>
    <spark-opts>--conf spark.driver.extraJavaOptions=-Diop.version=4.2.0.0</spark-opts>
    <arg>${inputDir}/FILE</arg>
    <arg>${outputDir}</arg>
   </spark>
   <ok to="end" />
   <error to="fail" />
  </action>
  <kill name="fail">
   <message>Workflow failed, error
    message[${wf:errorMessage(wf:lastErrorNode())}]
   </message>
  </kill>
 <end name='end' />
</workflow-app>

The configuration.xml:

<configuration>
    <property>
        <name>master</name>
        <value>local</value>
    </property>
    <property>
        <name>queueName</name>
        <value>default</value>
    </property>
    <property>
        <name>user.name</name>
        <value>default</value>
    </property>
    <property>
        <name>nameNode</name>
        <value>default</value>
    </property>
    <property>
        <name>jobTracker</name>
        <value>default</value>
    </property>
    <property>
        <name>jobDir</name>
        <value>/user/snowch/test</value>
    </property>
    <property>
        <name>inputDir</name>
        <value>/user/snowch/test/input</value>
    </property>
    <property>
        <name>outputDir</name>
        <value>/user/snowch/test/output</value>
    </property>
    <property>
        <name>hdfsWordCountJar</name>
        <value>/user/snowch/test/lib/OozieWorkflowSparkGroovy.jar</value>
    </property>
    <property>
        <name>oozie.wf.application.path</name>
        <value>/user/snowch/test</value>
    </property>
    <property>
        <name>hdfsSparkAssyJar</name>
        <value>/iop/apps/4.2.0.0/spark/jars/spark-assembly.jar</value>
    </property>
</configuration>

However, the error I see in the Yarn logs is:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], exception invoking main(), java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SparkMain not found
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SparkMain not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:234)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:380)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:301)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SparkMain not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    ... 13 more

I've looked for SparkMain in spark-assembly:

$ hdfs dfs -get /iop/apps/4.2.0.0/spark/jars/spark-assembly.jar
$ jar tf spark-assembly.jar | grep -i SparkMain

And here:

$ jar tf /usr/iop/4.2.0.0/spark/lib/spark-examples-1.6.1_IBM_4-hadoop2.7.2-IBM-12.jar | grep SparkMain
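The two `jar tf ... | grep` checks above can be repeated across many jars at once. The following is a minimal sketch, not part of the original question: the function name `find_class_in_jars` is hypothetical, and it assumes the jars have already been copied to the local filesystem (for example with `hdfs dfs -get`). Matching is case-insensitive, like the `grep -i` call.

```python
import sys
import zipfile


def find_class_in_jars(jar_paths, class_name):
    """Return {jar_path: [matching entries]} for jars that contain class_name.

    class_name may be a simple name ("SparkMain") or a fully qualified
    one ("org.apache.oozie.action.hadoop.SparkMain").
    """
    # Jar entries use '/' separators, e.g. org/apache/.../SparkMain.class
    needle = class_name.replace(".", "/").lower()
    hits = {}
    for jar in jar_paths:
        jar = str(jar)
        if not zipfile.is_zipfile(jar):
            continue  # skip missing files and files that are not jars
        with zipfile.ZipFile(jar) as zf:
            matches = [name for name in zf.namelist() if needle in name.lower()]
        if matches:
            hits[jar] = matches
    return hits


if __name__ == "__main__":
    # Usage: python find_class.py SparkMain first.jar second.jar ...
    if len(sys.argv) >= 3:
        found = find_class_in_jars(sys.argv[2:], sys.argv[1])
        for jar, entries in found.items():
            for entry in entries:
                print(f"{jar}: {entry}")
```

In stock Apache Oozie, `SparkMain` ships in the Spark sharelib jar (`oozie-sharelib-spark`) rather than in the Spark assembly, so the Oozie sharelib directory may be another place worth scanning. Whether that layout applies to BigInsights on cloud is an assumption here.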

I've seen another question similar to this one, but this question is specifically about BigInsights on cloud.


Source: https://stackoverflow.com/questions/38899561
Updated: 2023-01-23 21:01
