YARN-HA Configuration
1. How YARN-HA works
1.1 Official documentation: http://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
1.2 How YARN-HA works, as shown in Figure 3-23
2. Configure the YARN-HA cluster
2.1 Plan the cluster layout
hadoop102 | hadoop103 | hadoop104 |
---|---|---|
NameNode | NameNode | |
JournalNode | JournalNode | JournalNode |
DataNode | DataNode | DataNode |
ZK | ZK | ZK |
ResourceManager | ResourceManager | |
NodeManager | NodeManager | NodeManager |
2.2 Configuration details
2.2.1 Edit yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Declare the cluster id and the addresses of the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop102</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop103</value>
    </property>
    <!-- Address of the ZooKeeper cluster -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
    </property>
    <!-- Enable automatic recovery of ResourceManager state -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <!-- Store the ResourceManager state in the ZooKeeper cluster -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>
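Before syncing the file, it can help to confirm that the HA properties hold the values you expect. The following is a minimal sketch, not part of the original notes: `get_yarn_prop` is a hypothetical helper that reads one property from a yarn-site.xml, assuming each `<name>` and `<value>` element sits on a single line, as in the configuration here. It is not a general XML parser.

```shell
# Hypothetical helper: print the value of one property from a yarn-site.xml.
# $1: path to yarn-site.xml, $2: property name
# Assumes each <name>...</name> and <value>...</value> fits on one line.
get_yarn_prop() {
    awk -v key="$2" '
        # Remember that we have seen the requested <name> element.
        index($0, "<name>" key "</name>") { found = 1 }
        # Print the first <value> on or after that line, then stop.
        found && match($0, /<value>[^<]*<\/value>/) {
            # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}
```

For example, `get_yarn_prop etc/hadoop/yarn-site.xml yarn.resourcemanager.ha.rm-ids` should print `rm1,rm2` for the file above, assuming the usual `etc/hadoop` location under the install directory.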
2.2.2 Sync the configuration to the other machines in the cluster
xsync etc
3. Start the HDFS cluster
3.1 On the [nn1] node, run the following command to start the services:
sbin/start-dfs.sh
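3.2 Start YARN. Even with YARN-HA configured, start-yarn.sh starts a ResourceManager only on the machine where it is run; the ResourceManager on the other machine must be started manually on that machine.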
sbin/start-yarn.sh
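The notes do not show the manual start of the second ResourceManager. In Hadoop 2.x this is normally done with `sbin/yarn-daemon.sh start resourcemanager`, run on the machine that should host it (hadoop103 in this plan). A minimal sketch, where `start_resourcemanager` is a hypothetical wrapper and the install directory argument is an assumption:

```shell
# Hypothetical wrapper: start a ResourceManager daemon on the local machine.
# $1: Hadoop install directory (assumption: defaults to the current directory)
start_resourcemanager() {
    hadoop_home="${1:-.}"
    "$hadoop_home/sbin/yarn-daemon.sh" start resourcemanager
}
```

On hadoop103, from the hadoop-2.7.2 directory, calling `start_resourcemanager .` and then running `jps` should show a ResourceManager process alongside the NodeManager.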
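3.3 How do you check whether YARN-HA started successfully? There are two ways:
(1) Open hadoop103:8088 in a browser; the standby ResourceManager automatically redirects you to the active one at hadoop102:8088.
(2) Query the state of each ResourceManager from the command line: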
[simon@hadoop102 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm1
active
[simon@hadoop103 hadoop-2.7.2]$ bin/yarn rmadmin -getServiceState rm2
standby
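The two state checks above can be combined into one loop. The sketch below is illustrative only: `find_active_rm` is a hypothetical helper, and the path to the `yarn` launcher is passed in as an argument rather than assumed. It relies only on the `rmadmin -getServiceState` call shown above printing `active` or `standby`.

```shell
# Hypothetical helper: print the id of the active ResourceManager.
# $1: path to the yarn launcher (for example bin/yarn)
# remaining arguments: the rm-ids to check (rm1 rm2 in this configuration)
# Returns 1 if none of the given ids reports "active".
find_active_rm() {
    yarn_cmd="$1"
    shift
    for rm_id in "$@"; do
        # Discard stderr so warnings do not get mixed into the state string.
        state=$("$yarn_cmd" rmadmin -getServiceState "$rm_id" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$rm_id"
            return 0
        fi
    done
    return 1
}
```

Run from the hadoop-2.7.2 directory as `find_active_rm bin/yarn rm1 rm2`. Stopping the active ResourceManager and running it again is a simple way to see failover take effect.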