Hadoop download address:
https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/
Upload hadoop-3.2.1.tar.gz to /home/offcn/software.
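If the tarball is first fetched on a local machine, one possible way to download and push it to the node (assuming wget is available locally and SSH access to bd-offcn-01 as the offcn user) is:
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
scp hadoop-3.2.1.tar.gz offcn@bd-offcn-01:/home/offcn/software/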
After the upload succeeds, extract it into the apps directory:
tar -zxvf hadoop-3.2.1.tar.gz -C /home/offcn/apps/
Configure the Hadoop environment variables
Open the system profile for editing: sudo vim /etc/profile, then append:
export HADOOP_HOME=/home/offcn/apps/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
To make the new variables take effect, refresh the environment:
source /etc/profile
Check whether the installation succeeded (note: do not put a dash in front of version):
hadoop version
# Note: if the hadoop command is not available, you can log in again or reboot so the updated /etc/profile is picked up.
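If everything is in place, the variables resolve and the version banner matches the installed release; roughly (exact build details vary):
echo $HADOOP_HOME        # expected: /home/offcn/apps/hadoop-3.2.1
hadoop version           # first line should read: Hadoop 3.2.1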
Modify the configuration files
(1) Configure the Java environment used by Hadoop and the log storage path
In the /home/offcn/apps/hadoop-3.2.1/etc/hadoop directory, run:
vim hadoop-env.sh
Add the following settings:
export JAVA_HOME=/home/offcn/apps/jdk1.8.0_144
export HADOOP_LOG_DIR=/home/offcn/logs/hadoop-3.2.1
export HDFS_NAMENODE_USER=offcn
export HDFS_DATANODE_USER=offcn
export HDFS_SECONDARYNAMENODE_USER=offcn
export YARN_RESOURCEMANAGER_USER=offcn
export YARN_NODEMANAGER_USER=offcn
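HADOOP_LOG_DIR points outside the installation directory; if it does not exist yet, it can be created up front on every node (a precaution; the start scripts normally create it themselves as long as the parent path is writable):
mkdir -p /home/offcn/logs/hadoop-3.2.1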
(2) Core configuration file: core-site.xml
vim core-site.xml
File contents:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://bd-offcn-01:8020</value>
    </property>
    <property>
        <name>hadoop.data.dir</name>
        <value>/home/offcn/data/hadoop-3.2.1</value>
    </property>
    <property>
        <name>hadoop.proxyuser.offcn.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.offcn.groups</name>
        <value>*</value>
    </property>
</configuration>
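Note that hadoop.data.dir is not a built-in Hadoop property; it is defined here as a base path and referenced as ${hadoop.data.dir} in hdfs-site.xml below. To catch malformed XML early, Hadoop 3 also ships a conftest subcommand; a possible syntax check (assuming the environment variables above are active; this validates XML structure only, not the property values) is:
hadoop conftest -conffile /home/offcn/apps/hadoop-3.2.1/etc/hadoop/core-site.xml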
(3) HDFS configuration file: hdfs-site.xml
[offcn@bd-offcn-01 hadoop]$ vim hdfs-site.xml
File contents:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.data.dir}/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.data.dir}/data</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file://${hadoop.data.dir}/namesecondary</value>
    </property>
    <property>
        <name>dfs.client.datanode-restart.timeout</name>
        <value>30</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>bd-offcn-03:9868</value>
    </property>
    <!-- dfs.permissions is the legacy name of dfs.permissions.enabled; false disables HDFS permission checks -->
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
(4) YARN configuration file: yarn-site.xml
[offcn@bd-offcn-01 hadoop]$ vim yarn-site.xml
File contents:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bd-offcn-02</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
(5) MapReduce configuration file: mapred-site.xml
[offcn@bd-offcn-01 hadoop]$ vim mapred-site.xml
File contents:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
(6) Configure workers (remove localhost and add the following)
[offcn@bd-offcn-01 hadoop]$ vim workers
bd-offcn-01
bd-offcn-02
bd-offcn-03
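start-dfs.sh and start-yarn.sh log into every host listed in workers over SSH, so passwordless SSH from the starting node to all three hosts is assumed here; if that was not already configured in an earlier step, a minimal sketch on bd-offcn-01 (and likewise on bd-offcn-02, which will start YARN) is:
[offcn@bd-offcn-01 ~]$ ssh-keygen -t rsa
[offcn@bd-offcn-01 ~]$ ssh-copy-id bd-offcn-01
[offcn@bd-offcn-01 ~]$ ssh-copy-id bd-offcn-02
[offcn@bd-offcn-01 ~]$ ssh-copy-id bd-offcn-03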
(7) Distribute the configured Hadoop installation to the rest of the cluster with scp -r:
[offcn@bd-offcn-01 apps]$ scp -r /home/offcn/apps/hadoop-3.2.1 bd-offcn-02:$PWD
[offcn@bd-offcn-01 apps]$ scp -r /home/offcn/apps/hadoop-3.2.1 bd-offcn-03:$PWD
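The /etc/profile changes from earlier only exist on bd-offcn-01; assuming the hadoop command should also be usable on the other nodes, add the same exports there as well, e.g.:
[offcn@bd-offcn-02 ~]$ sudo vim /etc/profile     # append the same HADOOP_HOME and PATH exports
[offcn@bd-offcn-02 ~]$ source /etc/profile
(repeat on bd-offcn-03)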
Start the cluster
Start HDFS
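On a brand-new cluster, the NameNode has to be formatted once before HDFS is started for the first time (assuming there is no existing data under /home/offcn/data/hadoop-3.2.1; do not re-run this on a cluster that already holds data):
[offcn@bd-offcn-01 hadoop-3.2.1]$ bin/hdfs namenode -format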
On the first machine (bd-offcn-01, the NameNode host), run sbin/start-dfs.sh:
[offcn@bd-offcn-01 hadoop-3.2.1]$ sbin/start-dfs.sh    # start the HDFS daemons
[offcn@bd-offcn-01 hadoop-3.2.1]$ sbin/stop-dfs.sh     # stop the HDFS daemons
Start YARN
On the second machine (bd-offcn-02, the ResourceManager host), run sbin/start-yarn.sh:
[offcn@bd-offcn-02 hadoop-3.2.1]$ sbin/start-yarn.sh   # start YARN
[offcn@bd-offcn-02 hadoop-3.2.1]$ sbin/stop-yarn.sh    # stop YARN
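A quick way to confirm the daemons came up is jps on each node; with the configuration above, the expected processes are roughly as follows (a sketch; process IDs and ordering will differ):
[offcn@bd-offcn-01 ~]$ jps    # NameNode, DataNode, NodeManager
[offcn@bd-offcn-02 ~]$ jps    # ResourceManager, DataNode, NodeManager
[offcn@bd-offcn-03 ~]$ jps    # SecondaryNameNode, DataNode, NodeManager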