For HBase to be highly available, the HDFS and ZooKeeper services it depends on must also be highly available.
For Hadoop HA setup, see: https://blog.csdn.net/weixin_44455125/article/details/122524147
The hbase.rootdir parameter points HBase at HDFS; mycluster is the Hadoop cluster's HA nameservice.
HA for HBase itself is simple: on any other node of a healthy cluster, run hbase-daemon.sh start master
to start another HMaster process; it will automatically register as a backup master.
Download and install
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/2.4.9/hbase-2.4.9-bin.tar.gz --no-check-certificate
tar xf hbase-2.4.9-bin.tar.gz -C /opt/
Modify the configuration
cd /opt/hbase-2.4.9/conf/
vim regionservers
emr-worker-01
emr-worker-02
echo 'export HBASE_HOME=/opt/hbase-2.4.9/' >>/etc/profile
echo 'export PATH=${PATH}:${HBASE_HOME}/bin' >>/etc/profile
source /etc/profile
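The two echo >> /etc/profile lines above append blindly, so re-running the setup duplicates entries. A small sketch of an idempotent variant, demonstrated against a scratch file in /tmp (on a real node, PROFILE would point at /etc/profile):

```shell
# Append a line only if it is not already present, so repeated runs stay clean.
# Using a scratch file here; on the cluster, set PROFILE=/etc/profile.
PROFILE=/tmp/profile-demo
: > "$PROFILE"                       # start from an empty demo file
add_once() { grep -qxF "$1" "$PROFILE" || echo "$1" >> "$PROFILE"; }
add_once 'export HBASE_HOME=/opt/hbase-2.4.9/'
add_once 'export PATH=${PATH}:${HBASE_HOME}/bin'
add_once 'export HBASE_HOME=/opt/hbase-2.4.9/'   # duplicate call is a no-op
wc -l < "$PROFILE"                   # prints 2
```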
hbase-env.sh (add source /etc/profile near the top, just below the shebang):
#!/usr/bin/env bash
source /etc/profile
# ... (the file's existing Apache license header follows)
hbase-site.xml
<configuration>
<property>
<!-- HBase data directory on HDFS; mycluster is the HA nameservice, so no port is given -->
<name>hbase.rootdir</name>
<value>hdfs://mycluster/HBase</value>
</property>
<property>
<!-- true enables fully distributed (cluster) mode -->
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<!-- Changed after 0.89: there was no .port suffix before; the default port is 16000 -->
<name>hbase.master.port</name>
<value>16000</value>
</property>
<property>
<!-- ZooKeeper quorum to connect to -->
<name>hbase.zookeeper.quorum</name>
<value>emr-header-01,emr-header-02,emr-worker-01</value>
</property>
<property>
<!-- ZooKeeper data directory -->
<name>hbase.zookeeper.property.dataDir</name>
<value>/data/zookeeper</value>
</property>
</configuration>
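Before distributing hbase-site.xml with scp, it is worth checking that the file is still well-formed XML after hand-editing. A minimal sketch using Python's standard library, shown against a small sample file in /tmp (on the cluster, point it at /opt/hbase-2.4.9/conf/hbase-site.xml):

```shell
# Write a small sample config and verify it parses; a stray unclosed tag in the
# real hbase-site.xml would make this check fail before the file is copied out.
cat > /tmp/hbase-site-check.xml <<'EOF'
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://mycluster/HBase</value>
  </property>
</configuration>
EOF
python3 -c "import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1]); print('well-formed')" /tmp/hbase-site-check.xml
```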
scp /etc/profile emr-header-02:/etc/
scp /etc/profile emr-worker-01:/etc/
scp /etc/profile emr-worker-02:/etc/
scp -r /opt/hbase-2.4.9 emr-header-02:/opt/
scp -r /opt/hbase-2.4.9 emr-worker-01:/opt/
scp -r /opt/hbase-2.4.9 emr-worker-02:/opt/
Run on the emr-header-02 node:
source /etc/profile
hbase-daemon.sh start master
Then open the active master's web UI on emr-header-01:
http://emr-header-01:16010/master-status
By default HBase runs a single active master; for HA, simply start another HMaster on a different node and it automatically becomes a backup master.
If emr-header-02 appears under ServerName in the Backup Masters section, the setup succeeded. You can then shut down the active master and check whether the backup automatically takes over.
If you do not want to start the backup HMaster on the standby node by hand after every restart, you can add it to the startup script:
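One way to script that failover drill. check_master below is a hypothetical helper, not part of HBase; it only probes whether the HMaster info server (default port 16010) answers on a host. The stop/poll sequence is shown commented out because it must run against the live cluster:

```shell
# Hypothetical helper: succeeds and prints a message if the HMaster web UI
# responds on the given host (the info server listens on 16010 by default).
check_master() {
  curl -sf "http://$1:16010/master-status" > /dev/null && echo "master UI up on $1"
}
# On the live cluster (hostnames from this article):
# ssh emr-header-01 '/opt/hbase-2.4.9/bin/hbase-daemon.sh stop master'
# sleep 30 && check_master emr-header-02   # the backup should now serve the UI
```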
cd /opt/hbase-2.4.9/bin/
vim start-hbase.sh
Add the marked line inside the if block at the bottom of the file:
if [ "$distMode" == 'false' ]
then
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" $commandToRun master
else
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" $commandToRun zookeeper
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" $commandToRun master
# added line: at startup, ssh to emr-header-02 and start a backup HMaster there
ssh emr-header-02 "$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" $commandToRun master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_REGIONSERVERS}" $commandToRun regionserver
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" $commandToRun master-backup