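2. Configure passwordless SSH between the hosts. In essence, this means adding the local .ssh/id_rsa.pub public key to .ssh/authorized_keys on both the local host and the remote hosts.
2.1 Configure passwordless login from the master to the other hosts. In theory, this step alone is enough.
[hadoop@linux1 ~]$ ssh-keygen -t rsa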
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
/home/hadoop/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
24:be:da:90:5e:e3:ff:be:d1:4a:ce:f0:3c:55:01:3b hadoop@linux1
[hadoop@linux1 ~]$ cd .ssh
[hadoop@linux1 .ssh]$ ls
authorized_keys id_dsa id_dsa.pub id_rsa id_rsa.pub known_hosts
[hadoop@linux1 .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@linux1 .ssh]$ ssh linux1
The authenticity of host 'linux1 (172.16.251.11)' can't be established.
RSA key fingerprint is ed:1a:0b:46:f2:08:75:c6:e5:05:25:d0:7b:25:c6:61.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'linux1,172.16.251.11' (RSA) to the list of known hosts.
Last login: Mon Dec 17 09:21:37 2012 from dtydb6

scp authorized_keys linux2:/home/hadoop/.ssh/
scp authorized_keys linux3:/home/hadoop/.ssh/

With the configuration above, SSH logins from linux1 to linux2 and linux3 no longer prompt for a password.
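The transcript above uses cp, which replaces any existing authorized_keys. As a minimal sketch of an alternative, the function below appends a key only if it is not already present and tightens the permissions that sshd usually expects. The name install_pubkey and its two arguments are hypothetical helpers for illustration, not part of OpenSSH.

```shell
# install_pubkey is a hypothetical helper (not an OpenSSH command).
# It appends a public key to authorized_keys only when the key is missing,
# then restricts permissions on the directory and the file.
install_pubkey() {
    pubkey_file=$1      # for example ~/.ssh/id_rsa.pub
    ssh_dir=$2          # for example ~/.ssh
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    touch "$auth_file"

    key=$(cat "$pubkey_file")
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF -e "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi

    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

On linux1 this could be called as install_pubkey ~/.ssh/id_rsa.pub ~/.ssh. Where the ssh-copy-id tool is installed, running ssh-copy-id hadoop@linux2 is another way to push the key to a remote host.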
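2.2 Configure login from the other hosts. Run the following commands on linux2 and linux3 respectively.
2.2.1 Run ssh-keygen -t rsa to generate id_rsa.pub
2.2.2 Copy the public key to linux1 under a host-specific name (on linux3, use the suffix _linux3)
scp id_rsa.pub linux1:/home/hadoop/.ssh/id_rsa.pub_linux2
2.2.3 On linux1, append the collected keys to authorized_keys
cat id_rsa.pub_linux2 >> authorized_keys
cat id_rsa.pub_linux3 >> authorized_keys
2.2.4 Copy authorized_keys from linux1 to all other hosts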
scp authorized_keys linux2:/home/hadoop/.ssh/
scp authorized_keys linux3:/home/hadoop/.ssh/
2.3 Verify that passwordless SSH works
ssh linux2 date
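The single check above can be extended to every host. The sketch below runs date on each host with BatchMode=yes, so ssh fails immediately instead of falling back to a password prompt. The function name check_ssh is hypothetical; the host names are the ones used in this article.

```shell
# check_ssh is a hypothetical helper. It reports, for each host given as an
# argument, whether a non-interactive ssh login succeeds.
check_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Running check_ssh linux1 linux2 linux3 on each of the three machines in turn covers every direction; a FAIL line points at the host whose authorized_keys still lacks the caller's key.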

3. Install Hadoop
tar -zxvf hadoop-1.0.4.tar.gz
Set the environment variables
export JAVA_HOME=/usr/java/jdk1.7.0_07
export PATH=$PATH:$HOME/bin:/monitor/apache-flume-1.2.0/bin:/hadoop/hadoop-1.0.4/bin
Default parameter values are defined in src/core/core-default.xml, src/hdfs/hdfs-default.xml, and src/mapred/mapred-default.xml; site-specific overrides go in the corresponding files under the conf directory.
3.1 conf/hadoop-env.sh: runtime parameters for the Hadoop daemons
Set JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.7.0_07
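A wrong JAVA_HOME usually shows up only when the daemons fail to start, so it can help to check the path first. The following is a small sketch; java_home_ok is a hypothetical helper name, and the only assumption is that a usable JDK has an executable bin/java under its root.

```shell
# java_home_ok is a hypothetical helper. It succeeds when the directory
# passed as $1 is non-empty and contains an executable bin/java.
java_home_ok() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example use (commented out so the snippet only defines the function):
# java_home_ok "$JAVA_HOME" || echo "JAVA_HOME is not a usable JDK: $JAVA_HOME" >&2
```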
3.2 conf/core-site.xml: set the NameNode URI
[hadoop@linux1 hadoop-1.0.4]$ vi conf/core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://linux1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-1.0.4/var</value>
</property>
</configuration>
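After editing a site file, it can be useful to read a value back and confirm what the file actually sets. The sketch below is line-oriented and assumes the one-tag-per-line layout shown above; it is not a general XML parser. The function name get_prop is hypothetical.

```shell
# get_prop is a hypothetical helper. It prints the <value> that follows a
# matching <name> in a Hadoop *-site.xml file.
#   $1 = path to the site file, $2 = property name
get_prop() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}
```

For the file above, get_prop conf/core-site.xml fs.default.name prints hdfs://linux1:9000. A property that is absent from the file produces no output.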

3.3 conf/mapred-site.xml: JobTracker settings

[hadoop@linux1 hadoop-1.0.4]$ vi conf/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>linux1:9001</value>
</property>
</configuration>
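3.4 conf/hdfs-site.xml: HDFS settings, mainly the storage paths for NameNode metadata (dfs.name.dir) and DataNode blocks (dfs.data.dir)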
[hadoop@linux1 hadoop-1.0.4]$ vi conf/hdfs-site.xml

<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/name1,/home/hadoop/name2</value>
<description> </description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/data1,/home/hadoop/data2</value>
<description> </description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
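The directories named in dfs.name.dir and dfs.data.dir can be created ahead of time on each host. The sketch below splits a comma-separated list, trims stray spaces around each entry, and creates the directories. The function name make_storage_dirs is hypothetical. Setting mode 755 reflects the commonly reported default of dfs.datanode.data.dir.perm in Hadoop 1.x; treat that as an assumption to verify against your version.

```shell
# make_storage_dirs is a hypothetical helper.
#   $1 = comma-separated list of directories, as written in hdfs-site.xml
make_storage_dirs() {
    echo "$1" | tr ',' '\n' | while IFS= read -r dir; do
        # Strip leading and trailing whitespace from each entry.
        dir=$(printf '%s' "$dir" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        [ -n "$dir" ] || continue
        # Assumed mode: 755, see the note above.
        mkdir -p "$dir" && chmod 755 "$dir"
    done
}
```

On a DataNode host, this could be called as make_storage_dirs "/home/hadoop/data1,/home/hadoop/data2".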

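3.5 Configure the masters and slaves files. In Hadoop 1.x, conf/masters lists the host that runs the SecondaryNameNode, and conf/slaves lists the hosts that run a DataNode and TaskTracker.
[hadoop@linux1 conf]$ vi masters

linux1

[hadoop@linux1 conf]$ vi slaves

linux2
linux3

3.6 Distribute the configuration files and the software to the other hosts
[hadoop@linux1 conf]$ scp * linux2:/home/hadoop/hadoop-1.0.4/conf
[hadoop@linux1 conf]$ scp * linux3:/home/hadoop/hadoop-1.0.4/conf
Alternatively, package the entire Hadoop directory and copy it to the other hosts.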
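With more than a couple of hosts, repeating scp by hand becomes error-prone. As a sketch, the function below reads the host list from the slaves file and copies the conf directory to the same path on each host. The name push_conf is hypothetical, and the sketch assumes Hadoop is unpacked at the same path on every host and that passwordless SSH is already working.

```shell
# push_conf is a hypothetical helper.
#   $1 = local conf directory (also used as the remote destination path)
#   $2 = slaves file listing one host per line
push_conf() {
    conf_dir=$1
    slaves_file=$2
    # Skip blank lines and comment lines in the slaves file.
    for host in $(grep -v -e '^[[:space:]]*$' -e '^#' "$slaves_file"); do
        echo "copying $conf_dir to $host"
        scp -q "$conf_dir"/* "$host:$conf_dir/" || return 1
    done
}
```

On linux1 this could be called as push_conf /home/hadoop/hadoop-1.0.4/conf /home/hadoop/hadoop-1.0.4/conf/slaves. The function stops at the first host for which scp fails.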
