First install Java as described in the Hadoop standalone-mode guide, and set up SSH as described in the Hadoop pseudo-distributed-mode guide.
Then install Hadoop itself by the steps below (the difference from those guides is that Hadoop is installed under the user's home directory ~ to avoid permission problems).
Preparation
hostname | user   | IP address
master   | hadoop | 192.168.1.121
slave1   | hadoop | 192.168.1.122
Create the hadoop group and user
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop
sudo gedit /etc/sudoers
Below the line root ALL=(ALL:ALL) ALL, add:
hadoop ALL=(ALL:ALL) ALL
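To check that the entry took effect, sudo can list the user's privileges:
sudo -l -U hadoop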
Change the hostname (to master and slave1 on the corresponding machines)
sudo vi /etc/hostname
Edit the hosts file
sudo vi /etc/hosts
127.0.0.1      localhost
192.168.1.121  master
192.168.1.122  slave1
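After editing, it is worth checking that the names resolve from each machine:
ping -c 3 master
ping -c 3 slave1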
Disable the firewall (takes effect after a reboot)
sudo ufw disable
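You can confirm the firewall state afterwards:
sudo ufw status
It should report Status: inactive.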
SSH
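These steps assume master already has an RSA key pair and an authorized_keys file from the pseudo-distributed setup; if they are missing, a minimal sketch to recreate them on master:
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys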
On master, enter the .ssh directory
cd ~/.ssh
scp authorized_keys hadoop@slave1:~/.ssh/authorized_keys_from_cloud001
On slave1, enter the .ssh directory
cat authorized_keys_from_cloud001 >> authorized_keys
At this point, passwordless login from master should work: ssh hadoop@slave1
Hadoop installation and configuration (reboot first, then log in as the hadoop user)
Extract the archive and create a symlink (useful for future version upgrades)
cd ~/setupEnv
tar zxvf hadoop-2.2.0.tar.gz -C ~
ln -s ~/hadoop-2.2.0 ~/hadoop
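This symlink is what makes upgrades cheap: extract a newer release alongside the old one and repoint the link. A sketch, using a hypothetical 2.4.0 release:
tar zxvf hadoop-2.4.0.tar.gz -C ~
ln -sfn ~/hadoop-2.4.0 ~/hadoop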
Edit the system environment variables
sudo gedit /etc/profile
Append the following at the end:
#hadoop
export HADOOP_HOME=~/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Load the updated profile:
source /etc/profile
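To confirm the variables took effect:
echo $HADOOP_HOME
hadoop version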
Create Hadoop's data storage directory
cd ~/hadoop
mkdir tmp
Use the backed-up configuration files
cd ~/setupEnv/hadoop_distribute_mode_setting
sudo cp core-site.xml ~/hadoop/etc/hadoop
sudo cp hadoop-env.sh ~/hadoop/etc/hadoop
sudo cp hdfs-site.xml ~/hadoop/etc/hadoop
sudo cp mapred-site.xml ~/hadoop/etc/hadoop
sudo cp yarn-site.xml ~/hadoop/etc/hadoop
(Manual configuration) Enter the Hadoop configuration directory. Note that every <property> block below goes inside the file's <configuration> element.
cd ~/hadoop/etc/hadoop
sudo gedit core-site.xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/home/hadoop/hadoop/tmp</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
sudo gedit hdfs-site.xml
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master:9001</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
This file does not exist by default and must be created:
sudo cp mapred-site.xml.template mapred-site.xml
sudo gedit mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>master:19888</value>
</property>
sudo gedit yarn-site.xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>master:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>master:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>master:8088</value>
</property>
Edit slaves: add slave1
sudo gedit slaves
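With a single worker, the slaves file contains exactly one hostname per line, i.e. just:
slave1
(master is not listed, so it runs only the NameNode, SecondaryNameNode, and ResourceManager.)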
Copy the installation to slave1
cp2slave.sh:
#!/bin/bash
scp -r /home/hadoop/hadoop-2.2.0 hadoop@slave1:~/
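To run it, assuming the script sits in the current directory:
chmod +x cp2slave.sh
./cp2slave.sh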
On slave1, change the hostname to slave1
Test
Start Hadoop on master
hdfs namenode -format
start-dfs.sh
start-yarn.sh
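The JobHistory server configured in mapred-site.xml above is not started by these scripts; in Hadoop 2.2 it is launched separately:
mr-jobhistory-daemon.sh start historyserver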
Running jps on master should show NameNode, SecondaryNameNode, and ResourceManager.
Running jps on slave1 should show DataNode and NodeManager.
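Beyond jps, a few further sanity checks (the example-jar path follows the 2.2.0 directory layout):
hdfs dfsadmin -report
yarn node -list
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 10
The ResourceManager web UI is at http://master:8088 (set in yarn-site.xml above); the NameNode UI defaults to http://master:50070.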