centos7下安装hadoop(单机版)

出处(官方文档)

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Download

下载(官方)

https://dlcdn.apache.org/hadoop/common/hadoop-3.3.1/

hadoop-3.3.1.tar.gz

配置环境(重要)

查看jdk安装目录

cat /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-1.el7_9.x86_64

设置ssh免密

ssh localhost
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys

修改host(linux)文件

说明:如果不修改的话,使用页面进行操作的时候会报错

[root@node bin]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.31.252 node node.hadoop.cn

修改本地(windows)host文件

192.168.31.252 node

安装与配置

## 创建文件夹子并且解压压缩包
cd /opt/
mkdir hadoop
cd hadoop/
ll
# hadoop-3.3.1.tar.gz
tar -zxvf hadoop-3.3.1.tar.gz 
cd hadoop-3.3.1/etc/hadoop/

全局配置

vim hadoop-env.sh
# 最后新增一行
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-1.el7_9.x86_64

后续测试使用

cd /opt/hadoop/hadoop-3.3.1
mkdir input
cp etc/hadoop/*.xml input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep input output 'dfs[a-z.]+'
cat output/*

hdfs

vim etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.31.252:9000</value>
    </property>
</configuration>
vim etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

初次启动先配初始化

bin/hdfs namenode -format

启动

sbin/start-dfs.sh

报错

Starting namenodes on [localhost]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [template.com]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.

解决

在/hadoop/sbin路径下:
将start-dfs.sh,stop-dfs.sh两个文件顶部添加以下参数

#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

还有,start-yarn.sh,stop-yarn.sh顶部也需添加以下:

#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root


# Licensed to the Apache Software Foundation (ASF) under one or more

测试

$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
$ bin/hdfs dfs -mkdir input
$ bin/hdfs dfs -put etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.1.jar grep input output 'dfs[a-z.]+'
$ bin/hdfs dfs -get output output
$ cat output/*
# 或者
$ bin/hdfs dfs -cat output/*
# 停止
$ sbin/stop-dfs.sh

web页面

http://localhost:9870/

页面操作需要赋予权限

cd /opt/hadoop/hadoop-3.3.1/bin
hadoop fs -chmod -R 777 /user

yarn

vim etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>
vim etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

启动yarn

sbin/start-yarn.sh

启动全部

sbin/start-all.sh

备注

使用页面操作时需要给路径赋予权限

cd /opt/hadoop/hadoop-3.3.1/bin
hadoop fs -chmod -R 777 /user

hadoop常见面试题

上一篇:杨森翔书法;晨练


下一篇:Hadoop常用操作