Hadoop: The Definitive Guide [PDF edition]

2019-03-28 14:12 | Source: Web

Description

  Apache Hadoop is ideal for organizations with a growing need to store and process massive application datasets. Hadoop: The Definitive Guide is a comprehensive resource for using Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters. The book includes case studies that illustrate how Hadoop solves specific problems.

  Organizations large and small are adopting Apache Hadoop to deal with huge application datasets. Hadoop: The Definitive Guide provides you with the key for unlocking the wealth this data holds. Hadoop is ideal for storing and processing massive amounts of data, but until now, information on this open-source project has been lacking -- especially with regard to best practices. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters.


  With case studies that illustrate how Hadoop solves specific problems, this book helps you:

  * Learn the Hadoop Distributed File System (HDFS), including ways to use its many APIs to transfer data

  * Write distributed computations with MapReduce, Hadoop's most vital component (see the sketch after this list)

  * Become familiar with Hadoop's data and IO building blocks for compression, data integrity, serialization, and persistence

  * Learn the common pitfalls and advanced features for writing real-world MapReduce programs

  * Design, build, and administer a dedicated Hadoop cluster

  * Use HBase, Hadoop's database for structured and semi-structured data

  And more. Hadoop: The Definitive Guide is still in progress, but you can get started on this technology with the Rough Cuts edition, which lets you read the book online or download it in PDF format as the manuscript evolves.  
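  As a taste of the MapReduce item in the list above, here is a minimal word-count sketch assuming a Hadoop 2.x-style job setup with the new org.apache.hadoop.mapreduce API; it is illustrative rather than taken from the book, and the class names and whitespace tokenization are arbitrary choices:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Emits (word, 1) for every whitespace-separated token in a line of input.
      public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
              word.set(token);
              context.write(word, ONE);
            }
          }
        }
      }

      // Sums the counts emitted for each word.
      public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) {
            sum += v.get();
          }
          context.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);   // combining is safe because summing is associative
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input directory supplied by the caller
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory must not already exist
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }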

The free download is available at http://linux.linuxidc.com/

The username and password are both www.linuxidc.com

The download directory is /2012年资料/1月/10日/Hadoop:The Definitive Guide【PDF版】/

Related Q&A

  • Step by step: add the Cloudera repository to your settings.xml (under ${HOME}/.m2/settings.xml) so the hadoop dependencies can be resolved: cloudera https://repository.cloudera.com/artifactory/cloudera-repos true ... (A settings.xml sketch follows this list.)
  • That should be q ~ /[01459]/ You are missing the terminating slash.
  • The standard documentation includes perldoc perlstyle. As far as code aesthetics go, about the only thing Larry really cares about is that the closing brace of a multi-line BLOCK should line up with the keyword that started the construct. Beyond that, he has other preferences that he holds less strongly. Larry has his reasons for all of these, but he does not claim that everyone else's mind works the same way as his. If you install the Perl::Tidy module, it includes the program/tool perltidy, which will reformat your code to match Larry Wall's preferences as described in perlstyle. With the command-line argument -pbp, it will match Damian Conway's recommendations in his Perl Best Pr ...
  • I created a fiddle to provide a live example of the code under discussion. If you are using Chrome or a similarly modern browser, you can right-click the paragraphs in the example and choose "Inspect Element" to see how the padding and margins interact. I agree, that paragraph is awful. It first states that "horizontal margins never collapse" and then provides an example with no adjacent horizontal margins with which to demonstrate the claim. The example does have adjacent vertical margins (the bottom margin of the first paragraph and the top margin of the second), and we can see that they do collapse (they occupy the same space: inspect one element and then the other to see this). However, this is unrelated to the non-collapsing offset the example actually demonstrates, namely that the paragraphs' non-adjacent margins are offset from the padding edge of the containing div ...
  • Looks like you're using a back level version of hadoop. Check the version of hadoop that your version of graph builder needs and make sure that's the version you're running.
  • There are several active release series. The 1.x release series is a continuation of the 0.20 release series. A few weeks after 0.23 was released, the 0.20 branch (previously known as 0.20.205) was renumbered as 1.0. There is virtually no functional difference between 0.20.205 and 1.0; it was just a renumbering. 0.23 includes several major new features, including a new MapReduce runtime called MapReduce 2, implemented on a new system called YARN (Yet Another Resource Negotiator), a general-purpose resource management system for running distributed applications. Likewise, the 2.x release series is a continuation of the 0.23 release series. ...
  • If you have an implementation of Mapper where an invocation of map() may take a long time (like more than a few minutes), then you can periodically call progress() on the supplied context object to let Hadoop know your code isn't hung. That is what "reporting progress explicitly" means: it works when you use the framework-supplied objects that implement Progressable, and it obviously won't work that way when you write your own Progressable implementation. (A mapper sketch follows this list.)
  • The syntax of the command is a little bit different: hadoop fs -cat hdfs:///user/tom/quangle.txt Do you have hadoop home in your path? Can you call hadoop without any arguments? (A FileSystem API sketch follows this list.)
  • That is my understanding as well. If you read it carefully, the important thing to keep in mind is: note that this does not change the number of rounds; it is only an optimization that minimizes the amount of data written to disk, because the final round always merges directly into the reduce. With or without the optimization, the number of merge rounds stays the same (5 in the first case and 4 in the second). First case: 50 files are merged down to a final 5, which then feed directly into the "reduce" phase (total rounds: 5 + 1 = 6). Second case: 34 files are merged down to a final 4, and the remaining 6 files are read directly from memory and fed into the "reduce" phase (total rounds: 4 + 1 = 5). In both cases the number of merge rounds is determined by the configuration ma ... (A configuration sketch follows this list.)
  • That is probably because you are using the old mapred sequence file classes. Instead of using -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat try using -inFormat org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat; -outFormat org.apac ... (A new-API job setup sketch follows this list.)
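  Following up on the Maven answer above: a minimal sketch of what that repository entry might look like in ${HOME}/.m2/settings.xml, assuming the standard profile layout; only the URL comes from the answer, and the profile and repository ids are illustrative:

    <settings>
      <profiles>
        <profile>
          <id>cloudera-repo</id>  <!-- illustrative profile id -->
          <repositories>
            <repository>
              <id>cloudera</id>  <!-- illustrative repository id -->
              <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
              <releases>
                <enabled>true</enabled>
              </releases>
            </repository>
          </repositories>
        </profile>
      </profiles>
      <activeProfiles>
        <activeProfile>cloudera-repo</activeProfile>
      </activeProfiles>
    </settings>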
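  Following up on the progress-reporting answer: a minimal sketch of a long-running mapper that calls context.progress(), assuming the new org.apache.hadoop.mapreduce API; the class name and the per-record work are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Illustrative mapper whose map() call can run for several minutes per record.
    public class SlowMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        for (int i = 0; i < 1000000; i++) {
          expensiveStep(value, i);      // hypothetical long-running per-record work
          if (i % 10000 == 0) {
            context.progress();         // tell the framework the task is still alive
          }
        }
        context.write(value, NullWritable.get());
      }

      private void expensiveStep(Text value, int i) {
        // placeholder for the real work
      }
    }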
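  Following up on the hadoop fs -cat answer: a minimal sketch of the programmatic equivalent using the FileSystem API; the path is the one from the answer, and error handling is kept minimal:

    import java.io.InputStream;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    // Copies an HDFS file to stdout, roughly what `hadoop fs -cat` does.
    public class CatHdfsFile {
      public static void main(String[] args) throws Exception {
        String uri = "hdfs:///user/tom/quangle.txt";  // path taken from the answer above
        Configuration conf = new Configuration();      // picks up core-site.xml etc. from the classpath
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
          in = fs.open(new Path(uri));
          IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
          IOUtils.closeStream(in);
        }
      }
    }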
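  Following up on the merge-rounds answer: the truncated property is presumably the merge factor; a minimal sketch of setting it on a job, assuming the Hadoop 2.x name mapreduce.task.io.sort.factor (io.sort.factor in older releases):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class MergeFactorExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The merge factor caps how many spill files/segments are combined in one
        // merge round on the map and reduce sides; 10 is the historical default.
        conf.setInt("mapreduce.task.io.sort.factor", 10);
        Job job = Job.getInstance(conf, "merge-factor-example");  // illustrative job name
        // ... the rest of the job setup would go here
      }
    }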
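  Following up on the sequence-file answer: a minimal sketch of wiring the new-API (org.apache.hadoop.mapreduce) sequence file formats into a job programmatically; the job name, the Text/IntWritable key and value types, and the paths are illustrative assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public class SequenceFileJob {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "sequence-file-example");
        job.setJarByClass(SequenceFileJob.class);

        // New-API sequence file formats (note the mapreduce.lib packages, not mapred).
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // Assumes the input sequence files hold Text/IntWritable pairs and the
        // default (identity) mapper and reducer are used.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));    // input path supplied by the caller
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output path supplied by the caller
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }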