Understanding Spark alongside Hadoop

In my setup, Hadoop and Spark run on the same network but on different nodes. Spark can run alongside an existing Hadoop cluster simply by launching it as a separate service. Will that bring any improvement in performance?
I have thousands of files, around 10 GB, loaded into HDFS.
I have 8 nodes for Hadoop, and 1 master and 5 workers for Spark.

Source: https://stackoverflow.com/questions/26867910

Updated: 2023-06-13 22:06

Best answer

Ok, sorry, the above actually works fine; it was a bug in my program.

Related questions

How to query SOLR for empty fields? [2022-03-15]

Try this: ?q=-id:["" TO *]
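If you are sending that query from code rather than pasting it into a browser, the brackets, quotes and spaces need URL-encoding. A minimal sketch in Python, using only the standard library; the helper name `empty_field_query` is made up for illustration and is not part of any Solr client:

```python
from urllib.parse import urlencode, parse_qs


def empty_field_query(field: str) -> str:
    """Build a URL-encoded query string for the negated range query above.

    `field` is the Solr field to test; the helper name is hypothetical.
    """
    # -field:["" TO *] excludes every document that has any value in `field`.
    return urlencode({"q": f'-{field}:["" TO *]'})


query_string = empty_field_query("id")
print(query_string)

# Decoding the string gives back the query exactly as written in the answer.
print(parse_qs(query_string)["q"][0])
```

The encoded string can be appended after the `?` of whatever select URL your Solr core exposes.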
Rally Lookback API for C# [2021-09-28]

There is currently no built-in support for the Lookback API in the .NET REST and Java REST toolkits. See also this post for an imperfect workaround, and this post that mentions parsing revisions. You can query stories, fetch RevisionHistory, Revisions and Description, and iterate over the results, parsing each revision's description for the "SCHEDULE STATE changed" string.
Lookback API: queries for empty date fields [2022-12-20]

Ok, sorry, the above actually works fine; it was a bug in my program.
How to query Rally's Lookback API based on Release Name or OID? [2023-08-12]

Oops, simple mistake on my end, but I'll post it in case anyone else is trying to do the same thing: the project OID has to be the same as the project of the release you are trying to query. Or better yet, just remove the project parameter!
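Lookback API remove unauthorized snapshots [2024-01-06]

You are right. removeUnauthorizedSnapshots filters the result set of the current page (pagesize), which means it can actually return a page with 0 results when all of the results are, or once were, associated with projects the user is not allowed to access. I am not sure about the results when you get more results. The additional filter should only restrict the number of results, and when I use similar code I see a further reduction. But I would suggest a syntax change to the filter on the Parent attribute. Nulls are not stored in the Lookback API at all, so any != null or == null query is a bit misleading. In your code it works, but with Parent = ...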
Java Rally api lookback api authentication [2022-11-13]

The Lookback toolkit is an experimental toolkit and is not fully supported. https://github.com/RallyTools/Rally-Lookback-Toolkit I doubt it includes support for API keys. According to the documentation, you should just use setCredentials: https://github.com/RallyTools/Rally-Lookback-Toolkit#rally-lookback-api-toolkit .setCredentials("username", "password")
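Lookback API: Error 403 Access Is Denied [2022-12-01]

The error means the query touched projects the user is not permitted to view. Adding _ProjectHierarchy to the criteria changes the scope and restricts the results to projects in that hierarchy. For example, a query such as: { "PlanEstimate" : 5 } will try to return all snapshots in the workspace with a plan estimate of 5. Changing it to: { "PlanEstimate" : 5, "_ProjectHierarchy" : 1234 } changes the result to all snapshots in project 1234, or one of its children, with a plan estimate of 5, which may be a very different set of results. If the user has access to all projects in the workspace, then they should not get ...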
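To make the scoping change concrete, here is a small sketch in Python that takes a workspace-wide find clause and adds the _ProjectHierarchy restriction. It only builds the JSON criteria; the helper name `scope_to_project` is hypothetical, and sending the request (endpoint, authentication) is deliberately left out since those details are not given here.

```python
import json


def scope_to_project(find: dict, project_oid: int) -> dict:
    """Return a copy of `find` restricted to one project hierarchy.

    Hypothetical helper: it only edits the criteria, it does not call the API.
    """
    scoped = dict(find)  # copy, so the caller's original criteria stay untouched
    scoped["_ProjectHierarchy"] = project_oid
    return scoped


# Workspace-wide criteria from the answer: every snapshot with PlanEstimate 5.
workspace_wide = {"PlanEstimate": 5}

# Same criteria, limited to project 1234 and its children.
scoped = scope_to_project(workspace_wide, 1234)
print(json.dumps(scoped))
```

Keeping the original dict unmodified makes it easy to run both versions and compare how many snapshots each returns.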
TFS API get TestResults in a specific date range [2023-04-17]

The correct format is: var testresults = teamProject.TestResults.Query("SELECT * FROM TestResult WHERE DateCompleted < '2017-05-24 07:41:44.137'"); The field is DateCompleted, and the time format is 2017-05-24 00:00:00.000. Note: DateCompleted is stored in the database in UTC, so when you run the query you should convert your local time to UTC to get more accurate results.
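The UTC conversion and the millisecond format are the two easy places to slip up. The answer's code is C#; the sketch below shows the same conversion in Python purely as an illustration. The +02:00 offset and the helper name `to_tfs_utc` are assumptions chosen so the result matches the timestamp in the answer.

```python
from datetime import datetime, timedelta, timezone


def to_tfs_utc(local_dt: datetime) -> str:
    """Format a timezone-aware datetime as 'YYYY-MM-DD HH:MM:SS.mmm' in UTC."""
    if local_dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = local_dt.astimezone(timezone.utc)
    # strftime has no millisecond directive, so derive it from the microseconds.
    return utc.strftime("%Y-%m-%d %H:%M:%S.") + f"{utc.microsecond // 1000:03d}"


# Assumed local time: 09:41:44.137 at UTC+02:00 (the offset is illustrative).
local = datetime(2017, 5, 24, 9, 41, 44, 137000, tzinfo=timezone(timedelta(hours=2)))
cutoff = to_tfs_utc(local)

# The query text itself is the one quoted in the answer.
query = f"SELECT * FROM TestResult WHERE DateCompleted < '{cutoff}'"
print(query)
```

Rejecting naive datetimes is a deliberate choice here: silently treating them as local time is exactly the kind of mismatch the note about UTC warns against.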
Lookback API: Is there a way to include more than one artifact type in a single query? [2022-03-13]

Sure, just add _Type:{$in:["Defect","HierarchicalRequirement"]} to the query. All of the work item types are stored in the same collection. You can also get descendant tasks and test tasks.
Not getting results field after querying task by ID in Lookback API [2022-03-06]

Alejandro, you are asking for the changes in the fields, not the values of the fields. This is a common misunderstanding of the Lookback API. There is a special way to get the current values, shown in the help pages available in Agile Central. Any information returned is actually held in the objects under "Original" and "Data". If no "change" was made to those fields when the snapshot was taken, each of them may contain no value.

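Hadoop vs Spark performance comparison

Why Spark is so sought after compared with Hadoop

Notes on problems reading HDFS when connecting Spark to Hadoop

Why use Spark?

Apache Spark source code walkthrough, part 8: Spark on Yarn

A first look at the Spark ecosystem and Spark Streaming

Big data processing: Hadoop, HBase, ElasticSearch, Storm, Kafka, Spark

Spark, a fast alternative for data analytics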
