Official reference: https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/SecureMode.html
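2. Create the Hadoop system users
Enabling Kerberos for Hadoop requires a dedicated user for each group of services, and each service must be started as its corresponding user. Create the following users and group on every node.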
User:Group | Daemons |
---|---|
hdfs:hadoop | NameNode, Secondary NameNode, JournalNode, DataNode |
yarn:hadoop | ResourceManager, NodeManager |
mapred:hadoop | MapReduce JobHistory Server |
Create the hadoop group
[root@hadoop01 ~]# groupadd hadoop
[root@hadoop02 ~]# groupadd hadoop
[root@hadoop03 ~]# groupadd hadoop
Create each user and set its password (for demonstration, each password equals the user name)
[root@hadoop01 ~]# useradd hdfs -g hadoop
[root@hadoop01 ~]# echo hdfs | passwd --stdin hdfs
[root@hadoop01 ~]# useradd yarn -g hadoop
[root@hadoop01 ~]# echo yarn | passwd --stdin yarn
[root@hadoop01 ~]# useradd mapred -g hadoop
[root@hadoop01 ~]# echo mapred | passwd --stdin mapred
[root@hadoop02 ~]# useradd hdfs -g hadoop
[root@hadoop02 ~]# echo hdfs | passwd --stdin hdfs
[root@hadoop02 ~]# useradd yarn -g hadoop
[root@hadoop02 ~]# echo yarn | passwd --stdin yarn
[root@hadoop02 ~]# useradd mapred -g hadoop
[root@hadoop02 ~]# echo mapred | passwd --stdin mapred
[root@hadoop03 ~]# useradd hdfs -g hadoop
[root@hadoop03 ~]# echo hdfs | passwd --stdin hdfs
[root@hadoop03 ~]# useradd yarn -g hadoop
[root@hadoop03 ~]# echo yarn | passwd --stdin yarn
[root@hadoop03 ~]# useradd mapred -g hadoop
[root@hadoop03 ~]# echo mapred | passwd --stdin mapred
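Since the same group and users are created on every node, the commands can be generated in a loop instead of typed per host. The following is a minimal sketch, not part of the original setup: the function name `user_setup_cmds` is hypothetical, root SSH access to each node is assumed, and the commands are only printed (a dry run) so they can be reviewed first. Setting the passwords is left out on purpose.

```shell
#!/bin/sh
# Dry run: print the commands that create the hadoop group and the
# hdfs/yarn/mapred users on one host. Nothing is executed remotely.
# Assumption: root can reach every node over SSH.
HOSTS="hadoop01 hadoop02 hadoop03"
USERS="hdfs yarn mapred"

user_setup_cmds() {
    host=$1
    echo "ssh root@$host groupadd hadoop"
    for u in $USERS; do
        echo "ssh root@$host useradd $u -g hadoop"
    done
}

for h in $HOSTS; do
    user_setup_cmds "$h"
done
```

Piping the output to `sh` would run the commands once they have been checked.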
3. Hadoop Kerberos configuration
======================
3.1 Create Kerberos principals for the Hadoop services
A principal has the format ServiceName/HostName@REALM
For example: dn/hadoop01@EXAMPLE.COM
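To make the three parts of a principal concrete, the sketch below splits one with POSIX parameter expansion. The function names are hypothetical helpers for illustration only; Kerberos itself does not provide them.

```shell
#!/bin/sh
# Split a principal of the form ServiceName/HostName@REALM.
principal_service() { p=${1%%@*}; echo "${p%%/*}"; }   # text before the first "/"
principal_host()    { p=${1%%@*}; echo "${p#*/}"; }    # text between "/" and "@"
principal_realm()   { echo "${1##*@}"; }               # text after the last "@"

principal_service dn/hadoop01@EXAMPLE.COM   # prints dn
principal_host    dn/hadoop01@EXAMPLE.COM   # prints hadoop01
principal_realm   dn/hadoop01@EXAMPLE.COM   # prints EXAMPLE.COM
```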
- Principals required by each service
Environment: three nodes with the host names hadoop01, hadoop02, and hadoop03
Service | Host | Principal |
---|---|---|
NameNode | hadoop01 | nn/hadoop01 |
DataNode | hadoop01 | dn/hadoop01 |
DataNode | hadoop02 | dn/hadoop02 |
DataNode | hadoop03 | dn/hadoop03 |
Secondary NameNode | hadoop03 | sn/hadoop03 |
ResourceManager | hadoop02 | rm/hadoop02 |
NodeManager | hadoop01 | nm/hadoop01 |
NodeManager | hadoop02 | nm/hadoop02 |
NodeManager | hadoop03 | nm/hadoop03 |
JobHistory Server | hadoop01 | jhs/hadoop01 |
Web UI | hadoop01 | HTTP/hadoop01 |
Web UI | hadoop02 | HTTP/hadoop02 |
Web UI | hadoop03 | HTTP/hadoop03 |
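- How principals are created
1) Prepare the path
Service principals authenticate with a keytab (key table) file, so a secure directory is needed in which to store the keytab files.
[root@hadoop01 ~]# mkdir -p /etc/security/keytab/
[root@hadoop01 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop01 ~]# chmod 770 /etc/security/keytab/
2) Authenticate as the administrator principal
Creating principals requires logging in to the Kerberos database client. Before logging in, authenticate as the Kerberos administrator: run the following command and enter the password when prompted.
[root@hadoop01 ~]# kinit admin/admin
3) Log in to the database client
[root@hadoop01 ~]# kadmin
4) Run the statements that create a principal
kadmin: addprinc -randkey test/test
kadmin: xst -k /etc/security/keytab/test.keytab test/test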
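Notes:
(1) addprinc -randkey test/test creates a new principal
- addprinc: adds a principal
- -randkey: generates a random key; because every Hadoop service authenticates with a keytab file, no password needs to be chosen
- test/test: the principal to add
(2) xst -k /etc/security/keytab/test.keytab test/test writes the principal's key to a keytab file
- xst: writes the principal's key to a keytab file
- -k /etc/security/keytab/test.keytab: the path and file name of the keytab file
- test/test: the principal
(3) To create principals more conveniently, the same statements can be run non-interactively
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey test/test"
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/test.keytab test/test"
Notes:
-p: the principal to authenticate as
-w: the password of that principal
-q: the statement to execute
(4) For other commands that operate on principals, see the official documentation:
http://web.mit.edu/kerberos/krb5-current/doc/admin/admin_commands/kadmin_local.html#commands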
- Create the principals
1) Create the keytab directory on every node
[root@hadoop01 ~]# mkdir -p /etc/security/keytab/
[root@hadoop01 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop01 ~]# chmod 770 /etc/security/keytab/
[root@hadoop02 ~]# mkdir -p /etc/security/keytab/
[root@hadoop02 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop02 ~]# chmod 770 /etc/security/keytab/
[root@hadoop03 ~]# mkdir -p /etc/security/keytab/
[root@hadoop03 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop03 ~]# chmod 770 /etc/security/keytab/
2) Run the following commands on hadoop01
NameNode (hadoop01)
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey nn/hadoop01"
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/nn.service.keytab nn/hadoop01"
DataNode (hadoop01)
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey dn/hadoop01"
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/dn.service.keytab dn/hadoop01"
NodeManager (hadoop01)
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey nm/hadoop01"
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/nm.service.keytab nm/hadoop01"
JobHistory Server (hadoop01)
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey jhs/hadoop01"
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/jhs.service.keytab jhs/hadoop01"
Web UI (hadoop01)
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey HTTP/hadoop01"
[root@hadoop01 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/spnego.service.keytab HTTP/hadoop01"
3) Run the following commands on hadoop02
ResourceManager (hadoop02)
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey rm/hadoop02"
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/rm.service.keytab rm/hadoop02"
DataNode (hadoop02)
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey dn/hadoop02"
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/dn.service.keytab dn/hadoop02"
NodeManager (hadoop02)
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey nm/hadoop02"
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/nm.service.keytab nm/hadoop02"
Web UI (hadoop02)
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey HTTP/hadoop02"
[root@hadoop02 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/spnego.service.keytab HTTP/hadoop02"
4) Run the following commands on hadoop03
DataNode (hadoop03)
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey dn/hadoop03"
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/dn.service.keytab dn/hadoop03"
Secondary NameNode (hadoop03)
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey sn/hadoop03"
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/sn.service.keytab sn/hadoop03"
NodeManager (hadoop03)
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey nm/hadoop03"
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/nm.service.keytab nm/hadoop03"
Web UI (hadoop03)
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "addprinc -randkey HTTP/hadoop03"
[root@hadoop03 ~]# kadmin -p admin/admin -w admin -q "xst -k /etc/security/keytab/spnego.service.keytab HTTP/hadoop03"
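The per-host commands above all follow one pattern, so they can be derived from a small table. The sketch below is an illustration under assumptions: `PRINCIPAL_TABLE` and `kadmin_cmds` are hypothetical names, the table rows are copied from the principal list in this section, and the script only prints the kadmin invocations (a dry run) rather than contacting the KDC.

```shell
#!/bin/sh
# Dry run: print the addprinc/xst commands for every principal on one host.
KEYTAB_DIR=/etc/security/keytab

# One entry per line: <service prefix> <host> <keytab file prefix>
PRINCIPAL_TABLE="nn hadoop01 nn
dn hadoop01 dn
nm hadoop01 nm
jhs hadoop01 jhs
HTTP hadoop01 spnego
rm hadoop02 rm
dn hadoop02 dn
nm hadoop02 nm
HTTP hadoop02 spnego
dn hadoop03 dn
sn hadoop03 sn
nm hadoop03 nm
HTTP hadoop03 spnego"

kadmin_cmds() {
    target=$1
    echo "$PRINCIPAL_TABLE" | while read -r svc host kt; do
        # Skip rows that belong to other hosts.
        [ "$host" = "$target" ] || continue
        echo "kadmin -p admin/admin -w admin -q \"addprinc -randkey $svc/$host\""
        echo "kadmin -p admin/admin -w admin -q \"xst -k $KEYTAB_DIR/$kt.service.keytab $svc/$host\""
    done
}

kadmin_cmds hadoop03
```

Keeping the table in one place makes it easier to see that every service in the principal list has both a principal and a keytab.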
- Change the owner and permissions of the keytab files on every node
[root@hadoop01 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop01 ~]# chmod 660 /etc/security/keytab/*
[root@hadoop02 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop02 ~]# chmod 660 /etc/security/keytab/*
[root@hadoop03 ~]# chown -R root:hadoop /etc/security/keytab/
[root@hadoop03 ~]# chmod 660 /etc/security/keytab/*
3.2 Modify the Hadoop configuration files
- Official example: Hadoop Secure Mode
The required changes are listed below. After editing, distribute the modified files to the other nodes.
- core-site.xml
[root@hadoop01 ~]# vim /data/hadoop-3.1.3/etc/hadoop/core-site.xml
Add the following content
<!-- Mechanism for mapping Kerberos principals to system users -->
<property>
  <name>hadoop.security.auth_to_local.mechanism</name>
  <value>MIT</value>
</property>
<!-- Rules for mapping Kerberos principals to system users -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1/$2@$0]([ndj]n\/.*@EXAMPLE\.COM)s/.*/hdfs/
    RULE:[2:$1/$2@$0]([rn]m\/.*@EXAMPLE\.COM)s/.*/yarn/
    RULE:[2:$1/$2@$0](jhs\/.*@EXAMPLE\.COM)s/.*/mapred/
    DEFAULT
  </value>
</property>
<!-- Enable Kerberos authentication for the Hadoop cluster -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<!-- Enable service-level authorization for the Hadoop cluster -->
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<!-- Protect RPC traffic within the cluster with authentication only -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>authentication</value>
</property>
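To see which local user each service principal ends up as, the three RULE lines can be imitated with shell patterns. This is only an emulation for illustration: `map_principal` is a hypothetical function, the glob patterns stand in for the regular expressions in the rules, and the DEFAULT rule is not modeled. Hadoop's real matching is done in Java.

```shell
#!/bin/sh
# Emulate the three RULE lines of hadoop.security.auth_to_local with
# shell patterns. The DEFAULT rule is intentionally not modeled.
map_principal() {
    case $1 in
        [ndj]n/*@EXAMPLE.COM) echo hdfs ;;      # nn/, dn/, jn/
        [rn]m/*@EXAMPLE.COM)  echo yarn ;;      # rm/, nm/
        jhs/*@EXAMPLE.COM)    echo mapred ;;    # jhs/
        *)                    echo "no RULE matched" ;;
    esac
}

for p in nn/hadoop01@EXAMPLE.COM nm/hadoop02@EXAMPLE.COM \
         jhs/hadoop01@EXAMPLE.COM HTTP/hadoop01@EXAMPLE.COM; do
    echo "$p -> $(map_principal "$p")"
done
```

On a configured cluster, the effective mapping can be checked with Hadoop's own class, for example `hadoop org.apache.hadoop.security.HadoopKerberosName nn/hadoop01@EXAMPLE.COM`.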
- hdfs-site.xml
[root@hadoop01 ~]# vim /data/hadoop-3.1.3/etc/hadoop/hdfs-site.xml
Add the following content
<!-- Enable block access tokens, so access to DataNode blocks must be authorized -->
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<!-- Kerberos principal of the NameNode; _HOST is resolved to the host name the service runs on -->
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file of the NameNode -->
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/security/keytab/nn.service.keytab</value>
</property>
<!-- Keytab file of the Secondary NameNode -->
<property>
  <name>dfs.secondary.namenode.keytab.file</name>
  <value>/etc/security/keytab/sn.service.keytab</value>
</property>
<!-- Kerberos principal of the Secondary NameNode -->
<property>
  <name>dfs.secondary.namenode.kerberos.principal</name>
  <value>sn/_HOST@EXAMPLE.COM</value>
</property>
<!-- Kerberos principal of the NameNode web service -->
<property>
  <name>dfs.namenode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<!-- Kerberos principal of the WebHDFS REST service -->
<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<!-- Kerberos principal of the Secondary NameNode web UI -->
<property>
  <name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file of the Hadoop web UI -->
<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/etc/security/keytab/spnego.service.keytab</value>
</property>
<!-- Kerberos principal of the DataNode -->
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>dn/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file of the DataNode -->
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/etc/security/keytab/dn.service.keytab</value>
</property>
<!-- Serve the NameNode web UI over HTTPS only -->
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
<!-- Protect DataNode data transfer with authentication only -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
- yarn-site.xml
[root@hadoop01 ~]# vim /data/hadoop-3.1.3/etc/hadoop/yarn-site.xml
Add the following content
<!-- Kerberos principal of the ResourceManager -->
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>rm/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file of the ResourceManager -->
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/etc/security/keytab/rm.service.keytab</value>
</property>
<!-- Kerberos principal of the NodeManager -->
<property>
  <name>yarn.nodemanager.principal</name>
  <value>nm/_HOST@EXAMPLE.COM</value>
</property>
<!-- Keytab file of the NodeManager -->
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/etc/security/keytab/nm.service.keytab</value>
</property>
- mapred-site.xml
[root@hadoop01 ~]# vim /data/hadoop-3.1.3/etc/hadoop/mapred-site.xml
Add the following content
<!-- Keytab file of the JobHistory Server -->
<property>
  <name>mapreduce.jobhistory.keytab</name>
  <value>/etc/security/keytab/jhs.service.keytab</value>
</property>
<!-- Kerberos principal of the JobHistory Server -->
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>jhs/_HOST@EXAMPLE.COM</value>
</property>
- Distribute the configuration files modified above
[root@hadoop01 ~]# xsync /data/hadoop-3.1.3/etc/hadoop/core-site.xml
[root@hadoop01 ~]# xsync /data/hadoop-3.1.3/etc/hadoop/hdfs-site.xml
[root@hadoop01 ~]# xsync /data/hadoop-3.1.3/etc/hadoop/yarn-site.xml
[root@hadoop01 ~]# xsync /data/hadoop-3.1.3/etc/hadoop/mapred-site.xml
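xsync is not part of Hadoop; it is a user-written distribution script whose contents are not shown here. As a rough idea of what such a script might do, the sketch below prints one rsync command per remote host. Everything in it is an assumption: the name `sync_cmds`, the host list, and the use of rsync over passwordless SSH. It is a dry run that prints the commands without copying anything.

```shell
#!/bin/sh
# Hypothetical stand-in for xsync: print an rsync command that copies
# a file to the same directory on every other node.
SYNC_HOSTS="hadoop02 hadoop03"

sync_cmds() {
    file=$1
    dir=$(dirname "$file")
    for h in $SYNC_HOSTS; do
        echo "rsync -av $file $h:$dir/"
    done
}

sync_cmds /data/hadoop-3.1.3/etc/hadoop/core-site.xml
```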
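3.3 Configure HDFS to use HTTPS
- Generate a key pair
keytool is Java's key and certificate management tool; it lets users manage their own public/private key pairs and the associated certificates.
- -keystore: the name and location of the keystore (all generated material is stored in this file)
- -genkey (or -genkeypair): generates a key pair
- -alias: an alias for the generated key pair; defaults to mykey if omitted
- -keyalg: the key algorithm, RSA or DSA; the default is DSA in older JDK releases such as JDK 8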
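1) Generate the keystore, supplying its password and the certificate details
[root@hadoop01 ~]# keytool -keystore /etc/security/keytab/keystore -alias jetty -genkey -keyalg RSA
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:
What is the name of your organizational unit?
  [Unknown]:
What is the name of your organization?
  [Unknown]:
What is the name of your City or Locality?
  [Unknown]:
What is the name of your State or Province?
  [Unknown]:
What is the two-letter country code for this unit?
  [Unknown]:
Is CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?
  [no]:  y
Enter key password for <jetty>
        (RETURN if same as keystore password):
Re-enter new password:
Note: the public key is contained in the certificate.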
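The interactive prompts can be avoided by passing the answers as options. The sketch below builds such a command and prints it rather than running it; `keytool_cmd` is a hypothetical helper, and the common name and password shown are placeholders. Note that a password given on the command line is visible to other users in the process list, so this form suits test clusters better than production.

```shell
#!/bin/sh
# Dry run: print a non-interactive keytool command.
# Arguments: keystore path, password, certificate common name.
keytool_cmd() {
    keystore=$1
    password=$2
    cn=$3
    echo "keytool -genkeypair -alias jetty -keyalg RSA -keystore $keystore -storepass $password -keypass $password -dname \"CN=$cn\""
}

keytool_cmd /etc/security/keytab/keystore 123456 hadoop01
```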
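2) Change the owner and permissions of the keystore file
[root@hadoop01 ~]# chown -R root:hadoop /etc/security/keytab/keystore
[root@hadoop01 ~]# chmod 660 /etc/security/keytab/keystore
Notes:
(1) The keystore password must be at least 6 characters; it may consist of digits, letters, or a mix of both.
(2) Make sure the hdfs user (the user that starts HDFS) has read permission on the generated keystore file.
3) Distribute the keystore to the same path on every node in the cluster
[root@hadoop01 ~]# xsync /etc/security/keytab/keystore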
4) Rename the Hadoop configuration file ssl-server.xml.example, located in the $HADOOP_HOME/etc/hadoop directory, to ssl-server.xml
[root@hadoop01 ~]# mv $HADOOP_HOME/etc/hadoop/ssl-server.xml.example $HADOOP_HOME/etc/hadoop/ssl-server.xml
Then edit the file
[root@hadoop01 ~]# vim $HADOOP_HOME/etc/hadoop/ssl-server.xml
Modify the following parameters
<!-- Path of the SSL keystore -->
<property>
  <name>ssl.server.keystore.location</name>
  <value>/etc/security/keytab/keystore</value>
</property>
<!-- Password of the SSL keystore -->
<property>
  <name>ssl.server.keystore.password</name>
  <value>123456</value>
</property>
<!-- Path of the SSL truststore -->
<property>
  <name>ssl.server.truststore.location</name>
  <value>/etc/security/keytab/keystore</value>
</property>
<!-- Password of the key inside the SSL keystore -->
<property>
  <name>ssl.server.keystore.keypassword</name>
  <value>123456</value>
</property>
<!-- Password of the SSL truststore -->
<property>
  <name>ssl.server.truststore.password</name>
  <value>123456</value>
</property>
5) Distribute the ssl-server.xml file
[root@hadoop01 ~]# xsync $HADOOP_HOME/etc/hadoop/ssl-server.xml
3.4 Configure Yarn to use LinuxContainerExecutor
1) On every node, change the owner and permissions of container-executor: the owner must be root, the group must be hadoop (the group of the yarn user that starts the NodeManager), and the mode must be 6050. Its default location is $HADOOP_HOME/bin.
[root@hadoop01 ~]# chown root:hadoop /data/hadoop-3.1.3/bin/container-executor
[root@hadoop01 ~]# chmod 6050 /data/hadoop-3.1.3/bin/container-executor
[root@hadoop02 ~]# chown root:hadoop /data/hadoop-3.1.3/bin/container-executor
[root@hadoop02 ~]# chmod 6050 /data/hadoop-3.1.3/bin/container-executor
[root@hadoop03 ~]# chown root:hadoop /data/hadoop-3.1.3/bin/container-executor
[root@hadoop03 ~]# chmod 6050 /data/hadoop-3.1.3/bin/container-executor
2) On every node, change the owner and permissions of container-executor.cfg: the file and all of its parent directories must be owned by root with group hadoop (the group of the yarn user that starts the NodeManager), and the mode of the file must be 400. Its default location is $HADOOP_HOME/etc/hadoop.
[root@hadoop01 ~]# chown root:hadoop /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg
[root@hadoop01 ~]# chown root:hadoop /data/hadoop-3.1.3/etc/hadoop
[root@hadoop01 ~]# chown root:hadoop /data/hadoop-3.1.3/etc
[root@hadoop01 ~]# chown root:hadoop /data/hadoop-3.1.3
[root@hadoop01 ~]# chown root:hadoop /data
[root@hadoop01 ~]# chmod 400 /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg
[root@hadoop02 ~]# chown root:hadoop /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg
[root@hadoop02 ~]# chown root:hadoop /data/hadoop-3.1.3/etc/hadoop
[root@hadoop02 ~]# chown root:hadoop /data/hadoop-3.1.3/etc
[root@hadoop02 ~]# chown root:hadoop /data/hadoop-3.1.3
[root@hadoop02 ~]# chown root:hadoop /data
[root@hadoop02 ~]# chmod 400 /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg
[root@hadoop03 ~]# chown root:hadoop /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg
[root@hadoop03 ~]# chown root:hadoop /data/hadoop-3.1.3/etc/hadoop
[root@hadoop03 ~]# chown root:hadoop /data/hadoop-3.1.3/etc
[root@hadoop03 ~]# chown root:hadoop /data/hadoop-3.1.3
[root@hadoop03 ~]# chown root:hadoop /data
[root@hadoop03 ~]# chmod 400 /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg
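The chown commands in step 2 walk from the file up through each parent directory. A small sketch of that walk follows; `ancestors` is a hypothetical helper, and the script only prints the chown commands (a dry run) so the list can be compared with the commands above.

```shell
#!/bin/sh
# Print a path followed by each of its parent directories, stopping
# before the filesystem root.
ancestors() {
    path=$1
    while [ -n "$path" ] && [ "$path" != "/" ] && [ "$path" != "." ]; do
        echo "$path"
        path=$(dirname "$path")
    done
}

# Dry run: one chown per path component.
ancestors /data/hadoop-3.1.3/etc/hadoop/container-executor.cfg |
    while read -r p; do
        echo "chown root:hadoop $p"
    done
```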
3) Edit $HADOOP_HOME/etc/hadoop/container-executor.cfg
[root@hadoop01 ~]# vim $HADOOP_HOME/etc/hadoop/container-executor.cfg
Set the content as follows
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred
min.user.id=1000
allowed.system.users=
feature.tc.enabled=false
- min.user.id: the minimum user ID; IDs below 1000 belong to system users
- banned.users: users that are not allowed to run containers
- allowed.system.users: system users that are allowed to run containers
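How the three settings interact can be shown with a small model. The sketch below reflects my reading of the settings and is not the container-executor source: a user is rejected if listed in banned.users, or if its UID is below min.user.id and it is not listed in allowed.system.users. The function names and the sample users are hypothetical.

```shell
#!/bin/sh
# Simplified model of the user checks configured in container-executor.cfg.
BANNED_USERS="hdfs,yarn,mapred"
MIN_USER_ID=1000
ALLOWED_SYSTEM_USERS=""

# in_list NAME LIST: succeed if NAME is an element of the comma-separated LIST.
in_list() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# may_run_containers USER UID: succeed if the model would accept the user.
may_run_containers() {
    user=$1
    uid=$2
    if in_list "$user" "$BANNED_USERS"; then
        return 1
    fi
    if [ "$uid" -lt "$MIN_USER_ID" ] && ! in_list "$user" "$ALLOWED_SYSTEM_USERS"; then
        return 1
    fi
    return 0
}

for entry in "alice 1001" "hdfs 1002" "daemon 2"; do
    set -- $entry
    if may_run_containers "$1" "$2"; then
        echo "$1: accepted"
    else
        echo "$1: rejected"
    fi
done
```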
4) Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml
[root@hadoop01 ~]# vim $HADOOP_HOME/etc/hadoop/yarn-site.xml
Add the following content
<!-- Make the NodeManager use LinuxContainerExecutor to manage containers -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<!-- Group of the user that starts the NodeManager -->
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>
<!-- Path of the LinuxContainerExecutor binary -->
<property>
  <name>yarn.nodemanager.linux-container-executor.path</name>
  <value>/data/hadoop-3.1.3/bin/container-executor</value>
</property>
5) Distribute the container-executor.cfg and yarn-site.xml files
[root@hadoop01 ~]# xsync $HADOOP_HOME/etc/hadoop/container-executor.cfg
[root@hadoop01 ~]# xsync $HADOOP_HOME/etc/hadoop/yarn-site.xml
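4. Start the Hadoop cluster in secure mode
4.1 Change the permissions of specific local paths
Filesystem | Path | User:Group | Permissions |
---|---|---|---|
local | $HADOOP_LOG_DIR | hdfs:hadoop | drwxrwxr-x |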
local | dfs.namenode.name.dir | hdfs:hadoop | drwx------ |
local | dfs.datanode.data.dir | hdfs:hadoop | drwx------ |
local | dfs.namenode.checkpoint.dir | hdfs:hadoop | drwx------ |
local | yarn.nodemanager.local-dirs | yarn:hadoop | drwxrwxr-x |
local | yarn.nodemanager.log-dirs | yarn:hadoop | drwxrwxr-x |
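The table lists permissions in symbolic form, while chmod is usually given an octal mode. The following sketch converts one to the other. `sym_to_octal` is a hypothetical helper; it accepts both the `ls` style for the sticky bit (`drwxrwxrwt`) and the style used in the Hadoop documentation (`drwxrwxrwxt`).

```shell
#!/bin/sh
# Convert a symbolic permission string such as drwxrwxr-x to an octal mode.
sym_to_octal() {
    perms=${1#?}    # drop the leading file-type character (d, -, ...)
    result=""
    for triplet in "$(echo "$perms" | cut -c1-3)" \
                   "$(echo "$perms" | cut -c4-6)" \
                   "$(echo "$perms" | cut -c7-9)"; do
        digit=0
        case $triplet in r??)     digit=$((digit + 4)) ;; esac
        case $triplet in ?w?)     digit=$((digit + 2)) ;; esac
        case $triplet in ??[xst]) digit=$((digit + 1)) ;; esac
        result="$result$digit"
    done
    special=0
    case $perms in ??[sS]*)                 special=$((special + 4)) ;; esac  # setuid
    case $perms in ?????[sS]*)              special=$((special + 2)) ;; esac  # setgid
    case $perms in ????????[tT]|?????????t) special=$((special + 1)) ;; esac  # sticky
    if [ "$special" -gt 0 ]; then
        result="$special$result"
    fi
    echo "$result"
}

sym_to_octal drwxrwxr-x    # prints 775
sym_to_octal drwx------    # prints 700
```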
4.1.1 $HADOOP_LOG_DIR (all nodes)
This variable is set in hadoop-env.sh; its default value is ${HADOOP_HOME}/logs
[root@hadoop01 ~]# chown hdfs:hadoop /data/hadoop-3.1.3/logs/
[root@hadoop01 ~]# chmod 775 /data/hadoop-3.1.3/logs/
[root@hadoop02 ~]# chown hdfs:hadoop /data/hadoop-3.1.3/logs/
[root@hadoop02 ~]# chmod 775 /data/hadoop-3.1.3/logs/
[root@hadoop03 ~]# chown hdfs:hadoop /data/hadoop-3.1.3/logs/
[root@hadoop03 ~]# chmod 775 /data/hadoop-3.1.3/logs/
4.1.2 dfs.namenode.name.dir (NameNode node)
This parameter is set in hdfs-site.xml; its default value is file://${hadoop.tmp.dir}/dfs/name
[root@hadoop01 ~]# chown -R hdfs:hadoop /data/hadoop-3.1.3/data/dfs/name/
[root@hadoop01 ~]# chmod 700 /data/hadoop-3.1.3/data/dfs/name/
4.1.3 dfs.datanode.data.dir (DataNode nodes)
This parameter is set in hdfs-site.xml; its default value is file://${hadoop.tmp.dir}/dfs/data
[root@hadoop01 ~]# chown -R hdfs:hadoop /data/hadoop-3.1.3/data/dfs/data/
[root@hadoop01 ~]# chmod 700 /data/hadoop-3.1.3/data/dfs/data/
[root@hadoop02 ~]# chown -R hdfs:hadoop /data/hadoop-3.1.3/data/dfs/data/
[root@hadoop02 ~]# chmod 700 /data/hadoop-3.1.3/data/dfs/data/
[root@hadoop03 ~]# chown -R hdfs:hadoop /data/hadoop-3.1.3/data/dfs/data/
[root@hadoop03 ~]# chmod 700 /data/hadoop-3.1.3/data/dfs/data/
4.1.4 dfs.namenode.checkpoint.dir (SecondaryNameNode node)
This parameter is set in hdfs-site.xml; its default value is file://${hadoop.tmp.dir}/dfs/namesecondary
[root@hadoop03 ~]# chown -R hdfs:hadoop /data/hadoop-3.1.3/data/dfs/namesecondary/
[root@hadoop03 ~]# chmod 700 /data/hadoop-3.1.3/data/dfs/namesecondary/
4.1.5 yarn.nodemanager.local-dirs (NodeManager nodes)
This parameter is set in yarn-site.xml; its default value is ${hadoop.tmp.dir}/nm-local-dir
[root@hadoop01 ~]# chown -R yarn:hadoop /data/hadoop-3.1.3/data/nm-local-dir/
[root@hadoop01 ~]# chmod -R 775 /data/hadoop-3.1.3/data/nm-local-dir/
[root@hadoop02 ~]# chown -R yarn:hadoop /data/hadoop-3.1.3/data/nm-local-dir/
[root@hadoop02 ~]# chmod -R 775 /data/hadoop-3.1.3/data/nm-local-dir/
[root@hadoop03 ~]# chown -R yarn:hadoop /data/hadoop-3.1.3/data/nm-local-dir/
[root@hadoop03 ~]# chmod -R 775 /data/hadoop-3.1.3/data/nm-local-dir/
4.1.6 yarn.nodemanager.log-dirs (NodeManager nodes)
This parameter is set in yarn-site.xml; its default value is $HADOOP_LOG_DIR/userlogs
[root@hadoop01 ~]# chown yarn:hadoop /data/hadoop-3.1.3/logs/userlogs/
[root@hadoop01 ~]# chmod 775 /data/hadoop-3.1.3/logs/userlogs/
[root@hadoop02 ~]# chown yarn:hadoop /data/hadoop-3.1.3/logs/userlogs/
[root@hadoop02 ~]# chmod 775 /data/hadoop-3.1.3/logs/userlogs/
[root@hadoop03 ~]# chown yarn:hadoop /data/hadoop-3.1.3/logs/userlogs/
[root@hadoop03 ~]# chmod 775 /data/hadoop-3.1.3/logs/userlogs/
4.2 Start HDFS
Note that each service must be started as its corresponding user.
- Starting daemons individually
(1) Start the NameNode
[root@hadoop01 ~]# sudo -i -u hdfs hdfs --daemon start namenode
(2) Start the DataNodes
[root@hadoop01 ~]# sudo -i -u hdfs hdfs --daemon start datanode
[root@hadoop02 ~]# sudo -i -u hdfs hdfs --daemon start datanode
[root@hadoop03 ~]# sudo -i -u hdfs hdfs --daemon start datanode
(3) Start the SecondaryNameNode
[root@hadoop03 ~]# sudo -i -u hdfs hdfs --daemon start secondarynamenode
Notes:
- -i: runs a login shell, so the target user's environment variables are reloaded
- -u: runs the subsequent command as the specified user
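- Starting the whole cluster
1) On the master node (hadoop01), configure passwordless SSH login for the hdfs user to every node.
2) On the master node (hadoop01), edit the $HADOOP_HOME/sbin/start-dfs.sh script and add the following environment variables at the top.
[root@hadoop01 ~]# vim $HADOOP_HOME/sbin/start-dfs.sh
Add the following at the top
HDFS_DATANODE_USER=hdfs
HDFS_NAMENODE_USER=hdfs
HDFS_SECONDARYNAMENODE_USER=hdfs
Note: $HADOOP_HOME/sbin/stop-dfs.sh needs the same environment variables at the top before it can be used.
3) Run the script as root to start the HDFS cluster.
[root@hadoop01 ~]# start-dfs.sh
- View the HDFS web page
The address is https://hadoop01:9871
Because the browser has not authenticated yet, files cannot be browsed at this point.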
4.3 Change the permissions of specific HDFS paths
Filesystem | Path | User:Group | Permissions |
---|---|---|---|
hdfs | / | hdfs:hadoop | drwxr-xr-x |
hdfs | /tmp | hdfs:hadoop | drwxrwxrwxt |
hdfs | /user | hdfs:hadoop | drwxrwxr-x |
hdfs | yarn.nodemanager.remote-app-log-dir | yarn:hadoop | drwxrwxrwxt |
hdfs | mapreduce.jobhistory.intermediate-done-dir | mapred:hadoop | drwxrwxrwxt |
hdfs | mapreduce.jobhistory.done-dir | mapred:hadoop | drwxr-x--- |
Notes:
If any of the paths above does not exist, create it manually.
1) Create the hdfs/hadoop principal: run the following command and enter a password when prompted
[root@hadoop01 ~]# kadmin.local -q "addprinc hdfs/hadoop"
2) Authenticate as the hdfs/hadoop principal: run the following command and enter the password when prompted
[root@hadoop01 ~]# kinit hdfs/hadoop
3) Change the owner and permissions of the listed paths as required above
(1) Change the /, /tmp, and /user paths
[root@hadoop01 ~]# hadoop fs -chown hdfs:hadoop / /tmp /user
[root@hadoop01 ~]# hadoop fs -chmod 755 /
[root@hadoop01 ~]# hadoop fs -chmod 1777 /tmp
[root@hadoop01 ~]# hadoop fs -chmod 775 /user
(2) The parameter yarn.nodemanager.remote-app-log-dir is set in yarn-site.xml; its default value is /tmp/logs
[root@hadoop01 ~]# hadoop fs -chown yarn:hadoop /tmp/logs
[root@hadoop01 ~]# hadoop fs -chmod 1777 /tmp/logs
(3) The parameter mapreduce.jobhistory.intermediate-done-dir is set in mapred-site.xml; its default value is /tmp/hadoop-yarn/staging/history/done_intermediate. All parent directories of this path (except /tmp) must be owned by mapred with group hadoop and mode 770
[root@hadoop01 ~]# hadoop fs -chown -R mapred:hadoop /tmp/hadoop-yarn/staging/history/done_intermediate
[root@hadoop01 ~]# hadoop fs -chmod -R 1777 /tmp/hadoop-yarn/staging/history/done_intermediate
[root@hadoop01 ~]# hadoop fs -chown mapred:hadoop /tmp/hadoop-yarn/staging/history/
[root@hadoop01 ~]# hadoop fs -chown mapred:hadoop /tmp/hadoop-yarn/staging/
[root@hadoop01 ~]# hadoop fs -chown mapred:hadoop /tmp/hadoop-yarn/
[root@hadoop01 ~]# hadoop fs -chmod 770 /tmp/hadoop-yarn/staging/history/
[root@hadoop01 ~]# hadoop fs -chmod 770 /tmp/hadoop-yarn/staging/
[root@hadoop01 ~]# hadoop fs -chmod 770 /tmp/hadoop-yarn/
(4) The parameter mapreduce.jobhistory.done-dir is set in mapred-site.xml; its default value is /tmp/hadoop-yarn/staging/history/done. All parent directories of this path (except /tmp) must be owned by mapred with group hadoop and mode 770
[root@hadoop01 ~]# hadoop fs -chown -R mapred:hadoop /tmp/hadoop-yarn/staging/history/done
[root@hadoop01 ~]# hadoop fs -chmod -R 750 /tmp/hadoop-yarn/staging/history/done
[root@hadoop01 ~]# hadoop fs -chown mapred:hadoop /tmp/hadoop-yarn/staging/history/
[root@hadoop01 ~]# hadoop fs -chown mapred:hadoop /tmp/hadoop-yarn/staging/
[root@hadoop01 ~]# hadoop fs -chown mapred:hadoop /tmp/hadoop-yarn/
[root@hadoop01 ~]# hadoop fs -chmod 770 /tmp/hadoop-yarn/staging/history/
[root@hadoop01 ~]# hadoop fs -chmod 770 /tmp/hadoop-yarn/staging/
[root@hadoop01 ~]# hadoop fs -chmod 770 /tmp/hadoop-yarn/
4.4 Start Yarn
1. Starting daemons individually
Start the ResourceManager
[root@hadoop02 ~]# sudo -i -u yarn yarn --daemon start resourcemanager
Start the NodeManagers
[root@hadoop01 ~]# sudo -i -u yarn yarn --daemon start nodemanager
[root@hadoop02 ~]# sudo -i -u yarn yarn --daemon start nodemanager
[root@hadoop03 ~]# sudo -i -u yarn yarn --daemon start nodemanager
2. Starting the whole cluster
1) On the Yarn master node (hadoop02), configure passwordless SSH login for the yarn user to every node.
2) On the master node (hadoop02), edit $HADOOP_HOME/sbin/start-yarn.sh and add the following environment variables at the top.
[root@hadoop02 ~]# vim $HADOOP_HOME/sbin/start-yarn.sh
Add the following at the top
YARN_RESOURCEMANAGER_USER=yarn
YARN_NODEMANAGER_USER=yarn
Note: stop-yarn.sh needs the same environment variables at the top before it can be used.
3) Run the $HADOOP_HOME/sbin/start-yarn.sh script as root to start the Yarn cluster.
[root@hadoop02 ~]# start-yarn.sh
3. Open the Yarn web page
The address is http://hadoop02:8088
4.5 Start the HistoryServer
1. Start the history server
[root@hadoop01 ~]# sudo -i -u mapred mapred --daemon start historyserver
2. Open the history server web page
The address is http://hadoop01:19888