ResourceManager的HA它跟namenode的HA不通,没有namenode那么麻烦,它直接把ResourceManager的状态信息注册到zookeeper里面,如果直接访问standby的节点的时候,它会自动跳转到active的那台节点,当active节点宕机时, standby节点会自动变伪active
详细配置
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- reducer获取数据的方式 -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- 指定YARN的ResourceManager的地址 -->
<!-- <property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop103</value>
</property> -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://hadoop102:19888/jobhistory/logs/</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>86400</value>
</property>
<!--启用resourcemanager ha-->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!--声明两台resourcemanager的地址-->
<!-- 给两台resourceManager起个名字 -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster-yarn1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop103</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop104</value>
</property>
<!--指定zookeeper集群的地址-->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
<!--启用自动恢复-->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!--指定resourcemanager的状态信息存储在zookeeper集群-->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
配置完后,分发到103,104节点.我把两个resourcesManager配置到103,104节点了.
启动yarn
sbin/start-yarn.sh
可以看到在102节点集群集群yarn是启动不起来的,只能启动起来nodeManager
启动集群resourceManager只能在yarn配置的那台节点上进行启动,所以分别在103,104启动yarn
sbin/start-yarn.sh
进行测试,我这里其实是访问的104节点,它自动跳转到了103节点上,因为现在103是Active状态
然后把103节点resourceManager杀掉,它就自动跳转到104节点了