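1. Install Java and ssh
On Ubuntu, apt-get makes it easy to install both the JDK and ssh. Ubuntu normally ships with the ssh client but not the server; running "apt-get install ssh" installs the server, and "/etc/init.d/ssh start" then starts it.
2. Create the hadoop group and the hadoop user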
#addgroup hadoop
#adduser --ingroup hadoop hadoop
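3. Configure ssh
Switch to the hadoop user:
#su - hadoop
Generate a key pair with an empty passphrase:
hadoop@ubuntu:~$ ssh-keygen -t rsa -P ""
Add the public key to the server's list of authorized keys (the server here is the local machine):
hadoop@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys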
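Re-running the append above adds a duplicate entry each time, and sshd may refuse to use authorized_keys if the file or its directory is writable by other users. A minimal sketch of a repeatable version follows; the function name authorize_key is hypothetical, and the permission values assume sshd is running with its default strict mode checks.

```shell
# authorize_key PUBKEY_FILE AUTH_FILE
# Appends the public key to AUTH_FILE only if it is not already listed,
# then tightens permissions on the file and its directory.
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    mkdir -p "$auth_dir"
    touch "$auth_file"
    # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$auth_dir"
    chmod 600 "$auth_file"
}

# Usage (as the hadoop user):
#   authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
#   ssh localhost   # should no longer ask for a password
```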
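4. Install Hadoop
Hadoop needs no installer; it is ready to use once unpacked. Run the following commands as root (they assume hadoop-0.20.0.tar.gz has already been placed in /usr/local).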
#cd /usr/local
#tar xzf hadoop-0.20.0.tar.gz
#mv hadoop-0.20.0 hadoop
#chown -R hadoop:hadoop hadoop

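5. Configure Hadoop
Open conf/hadoop-env.sh; only one line needs to change. Replace "#export JAVA_HOME=/usr/lib/j2sdk1.5-sun" with "export JAVA_HOME=/usr/lib/jvm/java-6-sun". The exact path depends on the installed Java version; the Java in the Ubuntu 9.10 repositories is 1.6.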
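The JAVA_HOME edit can also be scripted. A minimal sketch, assuming the line in hadoop-env.sh is the commented-out "export JAVA_HOME=..." form quoted above; the function name set_java_home is hypothetical, and the JDK path passed in must match what is actually installed.

```shell
# set_java_home ENV_FILE JAVA_HOME_DIR
# Rewrites the (possibly commented-out) JAVA_HOME line in hadoop-env.sh.
set_java_home() {
    env_file=$1
    java_home=$2
    # Match optional leading '#' characters and spaces, then replace the whole line.
    # '|' is the sed delimiter, so the slashes in the path need no escaping
    # (the path itself must not contain '|' or '&').
    sed "s|^#* *export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$env_file.tmp" &&
        mv "$env_file.tmp" "$env_file"
}

# Usage:
#   set_java_home /usr/local/hadoop/conf/hadoop-env.sh /usr/lib/jvm/java-6-sun
```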
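Next, edit core-site.xml. The directory /usr/local/hadoop-datastore/hadoop-hadoop must exist and be owned by the hadoop user; ${user.name} is the Java system property holding the name of the user running Hadoop, which Hadoop substitutes when it reads the configuration. Fill in the following: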
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>



<configuration>

<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-datastore/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>

</configuration>
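The directory that hadoop.tmp.dir points to has to be created by hand before Hadoop is started. A minimal sketch follows; the function name make_datastore is hypothetical, and the chown step assumes a group with the same name as the user exists, as created in step 2.

```shell
# make_datastore BASE_DIR USER
# Creates BASE_DIR/hadoop-USER, the directory hadoop.tmp.dir resolves to
# once ${user.name} is expanded, and prints its path.
make_datastore() {
    base_dir=$1
    user=$2
    target="$base_dir/hadoop-$user"

    mkdir -p "$target" || return 1
    # Changing ownership to another user needs root, so the step is
    # skipped when not running as root or when USER does not exist.
    if [ "$(id -u)" -eq 0 ] && id "$user" >/dev/null 2>&1; then
        chown -R "$user:$user" "$base_dir"
    fi
    echo "$target"
}

# Usage (as root):
#   make_datastore /usr/local/hadoop-datastore hadoop
```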
Then edit mapred-site.xml and enter the following:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>

</configuration>
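The original article appears to put both blocks in hadoop-site.xml. Hadoop 0.20.0 no longer uses that file; its settings are split across core-site.xml, hdfs-site.xml, and mapred-site.xml. Putting everything into core-site.xml causes trouble: the TaskTracker and JobTracker fail to start, and the log records this error:
2009-10-31 21:43:28,399 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.RuntimeException: Not a host:port pair: local
This error had me stuck for a long time. By chance I found a Taiwanese web page saying that the second block must go into mapred-site.xml, and with that change everything worked ^_^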
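One way to catch this kind of misplacement before starting the daemons is to check which file defines each property. A minimal sketch, assuming the properties are written one element per line as in the listings above; the function name find_property is hypothetical.

```shell
# find_property CONF_DIR PROPERTY_NAME
# Prints the name of every *-site.xml file in CONF_DIR that defines PROPERTY_NAME.
find_property() {
    conf_dir=$1
    prop=$2
    for f in "$conf_dir"/*-site.xml; do
        [ -f "$f" ] || continue
        # -F: fixed string, -q: quiet; look for the exact <name> element
        if grep -qF "<name>$prop</name>" "$f"; then
            basename "$f"
        fi
    done
}

# Usage:
#   find_property /usr/local/hadoop/conf mapred.job.tracker   # should list mapred-site.xml
#   find_property /usr/local/hadoop/conf fs.default.name      # should list core-site.xml
```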
