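I. Problems encountered with HBase
1. The metadata table hbase:namespace was not online,
so queries failed with "Master is initializing".
2. Regions of some tables stayed stuck in the OPENING state.
3. Requests failed with "region is not online".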
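II. Building the repair tool
Git repository:
https://github.com/apache/hbase-operator-tools (clone it and open it in IDEA)
I am running HDP 3.1.78 with HBase 2.0.2, and the tool does not support this version:
the Hbck class in hbase-server 2.0.2 has no assigns method, nor the other methods the tool calls.
I built the version pulled from git as-is (it targets HBase 2.4.7); the only change needed is the dependency scope in the pom file:
remove the "provided" scope and build a fat jar. (Reason: the jars shipped with my cluster version do not have these methods, so if the command resolves the hbase-server jars from hbase classpath it is bound to fail.)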
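The scope change described above can also be scripted. This is only a minimal sketch, assuming each scope element sits on a line of its own in the pom; strip_provided_scope is a hypothetical helper name, not part of the tool. It writes the modified pom to stdout so the original file is left untouched.

```shell
# Print a copy of a pom.xml with every "<scope>provided</scope>" line removed,
# so those dependencies fall back to Maven's default (compile) scope and can be
# bundled into the fat jar.
# Assumption: each scope element is on its own line.
strip_provided_scope() {
  # $1: path to the pom.xml to read; the result goes to stdout
  grep -v '<scope>provided</scope>' "$1"
}
```

For example, strip_provided_scope pom.xml > pom.new, then diff the two files before replacing the original.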
The built jar ends up in the target directory; upload it to a machine that runs HBase.
III. Installing the tool
Running the hbase command shows that on HBase 2 the tool has to be invoked like this:
hbase hbck -j /opt/software/hbase-hbck2-1.2.0-SNAPSHOT.jar
This follows the HBCK2 README.
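To avoid retyping the jar path, the invocation above can be wrapped in a small function. This is a sketch under assumptions: hbck2_cmd and the HBCK2_JAR variable are hypothetical names, and the function only prints the command line so it can be reviewed before anything touches the cluster.

```shell
# Path to the HBCK2 jar uploaded earlier; override it to match your own location.
HBCK2_JAR="${HBCK2_JAR:-/opt/software/hbase-hbck2-1.2.0-SNAPSHOT.jar}"

# Print the full command line for an HBCK2 invocation instead of running it.
# All arguments are passed through unchanged after the jar path.
hbck2_cmd() {
  echo "hbase hbck -j $HBCK2_JAR $*"
}
```

Calling hbck2_cmd -v prints the line that would show the HBCK2 version; copy the printed line into the terminal once it looks right.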
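For convenience I added the jar directly to the hbase classpath.
On HDP the classpath is set in the hbase launcher script:
run vim /bin/hbase and append the jar path to the end of the classpath setting.
Then run hbase classpath to check that the jar was picked up.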
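The classpath output is one long colon-separated string, so checking it by eye is error-prone. A minimal sketch of an automated check, assuming the jar file name starts with hbase-hbck2 as in the path above; has_hbck2_jar is a hypothetical helper name.

```shell
# Succeed if a colon-separated classpath read from stdin contains an HBCK2 jar.
# Intended use:  hbase classpath | has_hbck2_jar && echo "jar is on the classpath"
has_hbck2_jar() {
  # Split the classpath into one entry per line, then look for the jar name.
  tr ':' '\n' | grep -q 'hbase-hbck2.*\.jar$'
}
```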
IV. Tool overview
Run hbase org.apache.hbase.HBCK2 directly.
The available options and commands are listed below:
usage: HBCK2 [OPTIONS] COMMAND <ARGS>
Options:
 -d,--debug                                      run with debug output
 -h,--help                                       output this help message
 -p,--hbase.zookeeper.property.clientPort <arg>  port of hbase ensemble
 -q,--hbase.zookeeper.quorum <arg>               hbase ensemble
 -s,--skip                                       skip hbase version check (PleaseHoldException)
 -v,--version                                    this hbck2 version
 -z,--zookeeper.znode.parent <arg>               parent znode of hbase ensemble
Command:
 addFsRegionsMissingInMeta <NAMESPACE|NAMESPACE:TABLENAME>...
# Use this when regions are missing from the hbase:meta table but the table's directories still exist on HDFS
Example command:
hbase org.apache.hbase.HBCK2 addFsRegionsMissingInMeta default:tbl_1 n1:tbl_2 n2
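According to the help text, reportMissingRegionsInMeta is a check-only command, addFsRegionsMissingInMeta does the actual re-adding, and the latter prints an assigns command to bring the regions online. The order can be sketched as follows; print_meta_recovery_steps is a hypothetical helper name, and it only prints the commands rather than executing them.

```shell
# Print, in order, the HBCK2 commands for re-adding regions missing from hbase:meta.
# Arguments: one or more NAMESPACE or NAMESPACE:TABLENAME values.
# Nothing is executed; review the lines, then run them one at a time.
print_meta_recovery_steps() {
  echo "hbase org.apache.hbase.HBCK2 reportMissingRegionsInMeta $*"
  echo "hbase org.apache.hbase.HBCK2 addFsRegionsMissingInMeta $*"
  echo "# then run the 'assigns' command printed by addFsRegionsMissingInMeta"
}
```

Running the report first shows which regions would be re-added before hbase:meta is modified.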
addFsRegionsMissingInMeta <NAMESPACE|NAMESPACE:TABLENAME>...
   Options:
    -d,--force_disable  aborts fix for table if disable fails.
   To be used when regions missing from hbase:meta but directories are present still in HDFS. Can happen if user has run _hbck1_ 'OfflineMetaRepair' against an hbase-2.x cluster. Needs hbase:meta to be online.
   For each table name passed as parameter, performs diff between regions available in hbase:meta and region dirs on HDFS. Then for dirs with no hbase:meta matches, it reads the 'regioninfo' metadata file and re-creates given region in hbase:meta.
   Regions are re-created in 'CLOSED' state in the hbase:meta table, but not in the Masters' cache, and they are not assigned either. To get these regions online, run the HBCK2 'assigns' command printed when this command-run completes.
   NOTE: If using hbase releases older than 2.3.0, a rolling restart of HMasters is needed prior to executing the set of 'assigns' output.
   An example adding missing regions for tables 'tbl_1' in the default namespace, 'tbl_2' in namespace 'n1' and for all tables from namespace 'n2':
     $ HBCK2 addFsRegionsMissingInMeta default:tbl_1 n1:tbl_2 n2
   Returns HBCK2 an 'assigns' command with all re-inserted regions.
   SEE ALSO: reportMissingRegionsInMeta
   SEE ALSO: fixMeta
assigns [OPTIONS] <ENCODED_REGIONNAME/INPUTFILES_FOR_REGIONNAMES>...
   Options:
    -o,--override    override ownership by another procedure
    -i,--inputFiles  take one or more files of encoded region names
   A 'raw' assign that can be used even during Master initialization (if the -skip flag is specified). Skirts Coprocessors. Pass one or more encoded region names.
   1588230740 is the hard-coded name for the hbase:meta region and de00010733901a05f5a2a3a382e27dd4 is an example of what a user-space encoded region name looks like. For example:
     $ HBCK2 assigns 1588230740 de00010733901a05f5a2a3a382e27dd4
   Returns the pid(s) of the created AssignProcedure(s) or -1 if none.
   If -i or --inputFiles is specified, pass one or more input file names. Each file contains encoded region names, one per line. For example:
     $ HBCK2 assigns -i fileName1 fileName2
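When many regions are stuck, the -i form is easier to work with than a long argument list. Below is a minimal sketch for preparing such an input file. It assumes an encoded region name is either the fixed hbase:meta name or a 32-character lowercase hex string, which is inferred from the example in the help text rather than from a specification; both function names are hypothetical.

```shell
# Succeed if $1 looks like an encoded region name: either the fixed hbase:meta
# name 1588230740 or a 32-character lowercase hex string (assumed format,
# based on the example de00010733901a05f5a2a3a382e27dd4).
is_encoded_region_name() {
  [ "$1" = "1588230740" ] && return 0
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{32}$'
}

# Read candidate names from stdin (one per line) and print only the ones that
# pass the check, giving a file suitable for "assigns -i".
filter_region_names() {
  while IFS= read -r name; do
    if is_encoded_region_name "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

For example, filter_region_names < candidates.txt > regions.txt, then pass regions.txt to assigns -i. The file names here are placeholders.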
bypass [OPTIONS] <PID>...
   Options:
    -o,--override   override if procedure is running/stuck
    -r,--recursive  bypass parent and its children. SLOW! EXPENSIVE!
    -w,--lockWait   milliseconds to wait before giving up; default=1
   Pass one (or more) procedure 'pid's to skip to procedure finish. Parent of bypassed procedure will also be skipped to the finish. Entities will be left in an inconsistent state and will require manual fixup. May need Master restart to clear locks still held.
   Bypass fails if procedure has children. Add 'recursive' if all you have is a parent pid to finish parent and children. This is SLOW, and dangerous so use selectively. Does not always work.
extraRegionsInMeta <NAMESPACE|NAMESPACE:TABLENAME>...
   Options:
    -f, --fix  fix meta by removing all extra regions found.
   Reports regions present on hbase:meta, but with no related directories on the file system. Needs hbase:meta to be online.
   For each table name passed as parameter, performs diff between regions available in hbase:meta and region dirs on the given file system. Extra regions would get deleted from Meta if passed the --fix option.
   NOTE: Before deciding on use the "--fix" option, it's worth check if reported extra regions are overlapping with existing valid regions. If so, then "extraRegionsInMeta --fix" is indeed the optimal solution. Otherwise, "assigns" command is the simpler solution, as it recreates regions dirs in the filesystem, if not existing.
   An example triggering extra regions report for tables 'table_1' and 'table_2', under default namespace:
     $ HBCK2 extraRegionsInMeta default:table_1 default:table_2
   An example triggering missing regions report for table 'table_1' under default namespace, and for all tables from namespace 'ns1':
     $ HBCK2 extraRegionsInMeta default:table_1 ns1
   Returns list of extra regions for each table passed as parameter, or for each table on namespaces specified as parameter.
filesystem [OPTIONS] [<TABLENAME>...]
   Options:
    -f, --fix  sideline corrupt hfiles, bad links, and references.
   Report on corrupt hfiles, references, broken links, and integrity. Pass '--fix' to sideline corrupt files and links. '--fix' does NOT fix integrity issues; i.e. 'holes' or 'orphan' regions.
   Pass one or more tablenames to narrow checkup. Default checks all tables and restores 'hbase.version' if missing.
   Interacts with the filesystem only! Modified regions need to be reopened to pick-up changes.
fixMeta
   Do a server-side fix of bad or inconsistent state in hbase:meta. Available in hbase 2.2.1/2.1.6 or newer versions.
   Master UI has matching, new 'HBCK Report' tab that dumps reports generated by most recent run of _catalogjanitor_ and a new 'HBCK Chore'.
   It is critical that hbase:meta first be made healthy before making any other repairs.
   Fixes 'holes', 'overlaps', etc., creating (empty) region directories in HDFS to match regions added to hbase:meta.
   Command is NOT the same as the old _hbck1_ command named similarily.
   Works against the reports generated by the last catalog_janitor and hbck chore runs. If nothing to fix, run is a noop. Otherwise, if 'HBCK Report' UI reports problems, a run of fixMeta will clear up hbase:meta issues. See 'HBase HBCK' UI for how to generate new execute.
   SEE ALSO: reportMissingRegionsInMeta
generateMissingTableDescriptorFile <TABLENAME>
   Trying to fix an orphan table by generating a missing table descriptor file. This command will have no effect if the table folder is missing or if the .tableinfo is present (we don't override existing table descriptors).
   This command will first check it the TableDescriptor is cached in HBase Master in which case it will recover the .tableinfo accordingly.
   If TableDescriptor is not cached in master then it will create a default .tableinfo file with the following items:
     - the table name
     - the column family list determined based on the file system
     - the default properties for both TableDescriptor and ColumnFamilyDescriptors
   If the .tableinfo file was generated using default parameters then make sure you check the table / column family properties later (and change them if needed).
   This method does not change anything in HBase, only writes the new .tableinfo file to the file system. Orphan tables can cause e.g. ServerCrashProcedures to stuck, you might need to fix these still after you generated the missing table info files.
replication [OPTIONS] [<TABLENAME>...]
   Options:
    -f, --fix  fix any replication issues found.
   Looks for undeleted replication queues and deletes them if passed the '--fix' option. Pass a table name to check for replication barrier and purge if '--fix'.
reportMissingRegionsInMeta <NAMESPACE|NAMESPACE:TABLENAME>...
   To be used when regions missing from hbase:meta but directories are present still in HDFS. Can happen if user has run _hbck1_ 'OfflineMetaRepair' against an hbase-2.x cluster.
   This is a CHECK only method, designed for reporting purposes and doesn't perform any fixes, providing a view of which regions (if any) would get re-added to hbase:meta, grouped by respective table/namespace. To effectively re-add regions in meta, run addFsRegionsMissingInMeta.
   This command needs hbase:meta to be online. For each namespace/table passed as parameter, it performs a diff between regions available in hbase:meta against existing regions dirs on HDFS. Region dirs with no matches are printed grouped under its related table name. Tables with no missing regions will show a 'no missing regions' message.
   If no namespace or table is specified, it will verify all existing regions. It accepts a combination of multiple namespace and tables. Table names should include the namespace portion, even for tables in the default namespace, otherwise it will assume as a namespace value.
   An example triggering missing regions execute for tables 'table_1' and 'table_2', under default namespace:
     $ HBCK2 reportMissingRegionsInMeta default:table_1 default:table_2
   An example triggering missing regions execute for table 'table_1' under default namespace, and for all tables from namespace 'ns1':
     $ HBCK2 reportMissingRegionsInMeta default:table_1 ns1
   Returns list of missing regions for each table passed as parameter, or for each table on namespaces specified as parameter.
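The help text warns that a table name without its namespace portion is read as a namespace. One way to guard against that mistake is to qualify table names before passing them on. This is only a sketch, and qualify_table_name is a hypothetical helper; apply it to table names only, since a genuine namespace argument must stay bare.

```shell
# Print $1 with an explicit namespace: names that already contain ':' are
# printed unchanged, bare names get the "default:" prefix.
qualify_table_name() {
  case "$1" in
    *:*) printf '%s\n' "$1" ;;
    *)   printf 'default:%s\n' "$1" ;;
  esac
}
```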
setRegionState <ENCODED_REGIONNAME> <STATE>
   Possible region states:
     OFFLINE, OPENING, OPEN, CLOSING, CLOSED, SPLITTING, SPLIT, FAILED_OPEN, FAILED_CLOSE, MERGING, MERGED, SPLITTING_NEW, MERGING_NEW, ABNORMALLY_CLOSED
   WARNING: This is a very risky option intended for use as last resort.
   Example scenarios include unassigns/assigns that can't move forward because region is in an inconsistent state in 'hbase:meta'. For example, the 'unassigns' command can only proceed if passed a region in one of the following states: SPLITTING|SPLIT|MERGING|OPEN|CLOSING
   Before manually setting a region state with this command, please certify that this region is not being handled by a running procedure, such as 'assign' or 'split'. You can get a view of running procedures in the hbase shell using the 'list_procedures' command.
   An example setting region 'de00010733901a05f5a2a3a382e27dd4' to CLOSING:
     $ HBCK2 setRegionState de00010733901a05f5a2a3a382e27dd4 CLOSING
   Returns "0" if region state changed and "1" otherwise.
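Because setRegionState is described as a last resort, it seems worth rejecting a mistyped state before the command is ever built. A minimal sketch, using only the states listed in the help text; the two function names are hypothetical, and the command is printed rather than executed.

```shell
# Succeed if $1 is one of the region states listed in the HBCK2 help text.
is_valid_region_state() {
  case "$1" in
    OFFLINE|OPENING|OPEN|CLOSING|CLOSED|SPLITTING|SPLIT|FAILED_OPEN|FAILED_CLOSE|MERGING|MERGED|SPLITTING_NEW|MERGING_NEW|ABNORMALLY_CLOSED)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Print the setRegionState command for region $1 and state $2,
# or fail with a message on stderr if the state is not recognised.
set_region_state_cmd() {
  if ! is_valid_region_state "$2"; then
    echo "unknown region state: $2" >&2
    return 1
  fi
  echo "hbase org.apache.hbase.HBCK2 setRegionState $1 $2"
}
```

The check is case-sensitive on purpose, so "closing" is rejected instead of being passed to the tool.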
setTableState <TABLENAME> <STATE>
   Possible table states: ENABLED, DISABLED, DISABLING, ENABLING
   To read current table state, in the hbase shell run:
     hbase> get 'hbase:meta', '<TABLENAME>', 'table:state'
   A value of \x08\x00 == ENABLED, \x08\x01 == DISABLED, etc.
   Can also run a 'describe "<TABLENAME>"' at the shell prompt.
   An example making table name 'user' ENABLED:
     $ HBCK2 setTableState users ENABLED
   Returns whatever the previous table state was.
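The raw table:state value is not very readable in the shell output, so a small lookup can help. The sketch below maps only the two values the help text spells out; the encodings behind "etc." are not given there, so anything else is reported as UNKNOWN rather than guessed. decode_table_state is a hypothetical helper name.

```shell
# Translate the escaped table:state value shown by the hbase shell into a state name.
# Only the two encodings documented in the help text are mapped.
decode_table_state() {
  case "$1" in
    '\x08\x00') echo ENABLED ;;
    '\x08\x01') echo DISABLED ;;
    *)          echo UNKNOWN ;;
  esac
}
```

Pass the value in single quotes so the shell keeps the backslashes literal, for example decode_table_state '\x08\x01'.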
scheduleRecoveries <SERVERNAME>...
   Schedule ServerCrashProcedure(SCP) for list of RegionServers. Format server name as '<HOSTNAME>,<PORT>,<STARTCODE>' (See HBase UI/logs).
   Example using RegionServer 'a.example.org,29100,1540348649479':
     $ HBCK2 scheduleRecoveries a.example.org,29100,1540348649479
   Returns the pid(s) of the created ServerCrashProcedure(s) or -1 if no procedure created (see master logs for why not).
   Command support added in hbase versions 2.0.3, 2.1.2, 2.2.0 or newer.
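The server name has to be given as hostname, port and start code separated by commas, and it is easy to paste a host:port pair by mistake. A minimal format check, assuming the port and start code are plain decimal numbers as in the example above; is_server_name is a hypothetical helper name.

```shell
# Succeed if $1 has the form <HOSTNAME>,<PORT>,<STARTCODE>,
# where port and start code are assumed to be decimal numbers.
is_server_name() {
  printf '%s\n' "$1" | grep -Eq '^[^,]+,[0-9]+,[0-9]+$'
}
```

A typical use is is_server_name "$name" || echo "bad server name: $name" before building the scheduleRecoveries command.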
recoverUnknown
   Schedule ServerCrashProcedure(SCP) for RegionServers that are reported as unknown.
   Returns the pid(s) of the created ServerCrashProcedure(s) or -1 if no procedure created (see master logs for why not).
   Command support added in hbase versions 2.2.7, 2.3.5, 2.4.3, 2.5.0 or newer.
unassigns <ENCODED_REGIONNAME>...
   Options:
    -o,--override  override ownership by another procedure
   A 'raw' unassign that can be used even during Master initialization (if the -skip flag is specified). Skirts Coprocessors. Pass one or more encoded region names.
   1588230740 is the hard-coded name for the hbase:meta region and de00010733901a05f5a2a3a382e27dd4 is an example of what a userspace encoded region name looks like. For example:
     $ HBCK2 unassigns 1588230740 de00010733901a05f5a2a3a382e27dd4
   Returns the pid(s) of the created UnassignProcedure(s) or -1 if none.
SEE ALSO, org.apache.hbase.hbck1.OfflineMetaRepair, the offline hbase:meta tool. See the HBCK2 README for how to use.