2021SC@SDUSC
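1. Overview
The WAL lifecycle consists of four main stages: writing, rolling, invalidation, and deletion.
2. WAL Writing
3. WAL Rolling
Rolling the WAL (switching to a new log file) keeps any single WAL file from growing too large. This simplifies later cleanup, because an expired log file can be deleted as a whole. It also helps recovery: several small log files can be parsed at the same time, which shortens recovery time.
A WAL roll is triggered in the following cases:
- After the SyncRunner thread handles a log sync, if an exception occurred it calls requestLogRoll to request a log roll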
protected final void requestLogRoll(final WALActionsListener.RollRequestReason reason) {
  // If we have already requested a roll, don't do it again
  // And only set rollRequested to true when there is a registered listener
  if (!this.listeners.isEmpty() && rollRequested.compareAndSet(false, true)) {
    for (WALActionsListener i : this.listeners) {
      i.logRollRequested(reason);
    }
  }
}
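The de-duplication in requestLogRoll can be tried out in isolation. The following is a minimal sketch, not the HBase implementation: RollRequestSketch, RollListener, and rollFinished are hypothetical names introduced only for illustration, and it assumes the flag is cleared once a roll completes.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for a WAL; not the HBase class.
final class RollRequestSketch {
  /** Hypothetical listener, loosely modeled on WALActionsListener. */
  interface RollListener {
    void logRollRequested(String reason);
  }

  private final List<RollListener> listeners = new CopyOnWriteArrayList<>();
  private final AtomicBoolean rollRequested = new AtomicBoolean(false);

  void registerListener(RollListener listener) {
    listeners.add(listener);
  }

  /** Notifies listeners only for the first request while a roll is pending. */
  void requestLogRoll(String reason) {
    if (!listeners.isEmpty() && rollRequested.compareAndSet(false, true)) {
      for (RollListener listener : listeners) {
        listener.logRollRequested(reason);
      }
    }
  }

  /** Assumption: called after the roll finishes so new requests are accepted. */
  void rollFinished() {
    rollRequested.set(false);
  }

  public static void main(String[] args) {
    RollRequestSketch wal = new RollRequestSketch();
    AtomicInteger notifications = new AtomicInteger();
    wal.registerListener(reason -> notifications.incrementAndGet());
    wal.requestLogRoll("ERROR");
    wal.requestLogRoll("SIZE"); // ignored: a roll is already pending
    wal.rollFinished();
    wal.requestLogRoll("SIZE"); // accepted again
    System.out.println(notifications.get()); // prints 2
  }
}
```

Because compareAndSet flips the flag atomically, concurrent callers cannot both win, so listeners receive at most one notification per pending roll.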
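- After the SyncRunner thread handles a log sync, it checks whether the size of the WAL currently being written exceeds {hbase.regionserver.hlog.blocksize, which defaults to the HDFS block size} * {hbase.regionserver.logroll.multiplier, default 0.95}. If it does, requestLogRoll is likewise called to request a log roll. (These defaults are the ones for the version described here and may differ between HBase releases.)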
// Configuration key for the roll-size multiplier
public static final String WAL_ROLL_MULTIPLIER = "hbase.regionserver.logroll.multiplier";

// Returns true while any live SyncRunner still holds unreleased SyncFutures
private boolean isOutstandingSyncsFromRunners() {
  // Look at SyncFutures in the SyncRunners
  for (SyncRunner syncRunner : syncRunners) {
    if (syncRunner.isAlive() && !syncRunner.areSyncFuturesReleased()) {
      return true;
    }
  }
  return false;
}
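The size-based trigger boils down to one multiplication and one comparison. Here is a small sketch of that arithmetic, assuming a 128 MB block size purely for illustration. RollSizeSketch, logRollSize, and shouldRoll are hypothetical names, not HBase methods.

```java
// Hypothetical helper illustrating the size-based roll threshold.
final class RollSizeSketch {
  /** Roll threshold = WAL block size * multiplier. */
  static long logRollSize(long blockSizeBytes, double multiplier) {
    return (long) (blockSizeBytes * multiplier);
  }

  /** True when the WAL being written has grown past the threshold. */
  static boolean shouldRoll(long currentWalLengthBytes, long logRollSizeBytes) {
    return currentWalLengthBytes > logRollSizeBytes;
  }

  public static void main(String[] args) {
    long blockSize = 128L * 1024 * 1024; // assumed block size of 128 MB
    long threshold = logRollSize(blockSize, 0.95);
    System.out.println("roll threshold in bytes: " + threshold);
    System.out.println(shouldRoll(100L * 1024 * 1024, threshold)); // prints false
    System.out.println(shouldRoll(127L * 1024 * 1024, threshold)); // prints true
  }
}
```

Keeping the multiplier below 1 means the roll is requested before the file spills into a second block.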
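- Each RegionServer runs a LogRoller thread that rolls the log periodically. The interval is controlled by {hbase.regionserver.logroll.period, default 1 hour}
In the first two cases, requestLogRoll only issues the request. The roll itself is ultimately carried out by LogRoller as well.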
protected static final String WAL_ROLL_PERIOD_KEY = "hbase.regionserver.logroll.period";
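How the periodic trigger and the on-demand requests meet in a single roller can be sketched as below. This is a simplified model and not the actual LogRoller code: LogRollerSketch and its methods are hypothetical, and it assumes that any roll, requested or periodic, restarts the period timer.

```java
// Hypothetical model of a roller that honors both a period and explicit requests.
final class LogRollerSketch {
  private final long rollPeriodMs;
  private long lastRollTimeMs;
  private boolean rollRequested;

  LogRollerSketch(long rollPeriodMs, long startTimeMs) {
    this.rollPeriodMs = rollPeriodMs;
    this.lastRollTimeMs = startTimeMs;
  }

  /** Called from the error and size triggers. */
  synchronized void requestRoll() {
    rollRequested = true;
  }

  /** Returns true if a roll happens at nowMs, and resets the roller state. */
  synchronized boolean rollIfNeeded(long nowMs) {
    boolean periodElapsed = nowMs - lastRollTimeMs >= rollPeriodMs;
    if (!periodElapsed && !rollRequested) {
      return false;
    }
    lastRollTimeMs = nowMs; // assumption: every roll restarts the period
    rollRequested = false;
    return true;
  }

  public static void main(String[] args) {
    long oneHourMs = 60L * 60 * 1000;
    LogRollerSketch roller = new LogRollerSketch(oneHourMs, 0L);
    System.out.println(roller.rollIfNeeded(10_000L)); // prints false
    roller.requestRoll();
    System.out.println(roller.rollIfNeeded(20_000L)); // prints true (requested)
    System.out.println(roller.rollIfNeeded(oneHourMs + 20_000L)); // prints true (period elapsed)
  }
}
```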
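4. WAL Invalidation
Once the data in the memstore has been flushed to HDFS, the corresponding WAL entries are no longer needed. FSHLog tracks, for each region, the oldest sequenceId still held in the memstore. If, for every region with edits in a given log file, the newest sequenceId of that region's edits in the file is smaller than the oldest unflushed sequenceId tracked for that region, the log file is no longer needed and is moved from the ./WALs directory to the ./oldWALs directory. This is handled by cleanOldLogs, which is called after a log roll completes.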
public RegionStoreSequenceIds getLastSequenceId(byte[] encodedRegionName) {
  try {
    GetLastFlushedSequenceIdRequest req =
      RequestConverter.buildGetLastFlushedSequenceIdRequest(encodedRegionName);
    RegionServerStatusService.BlockingInterface rss = rssStub;
    if (rss == null) { // Try to connect one more time
      createRegionServerStatusStub();
      rss = rssStub;
      if (rss == null) {
        // Still no luck, we tried
        LOG.warn("Unable to connect to the master to check " + "the last flushed sequence id");
        return RegionStoreSequenceIds.newBuilder().setLastFlushedSequenceId(HConstants.NO_SEQNUM)
          .build();
      }
    }
    GetLastFlushedSequenceIdResponse resp = rss.getLastFlushedSequenceId(null, req);
    return RegionStoreSequenceIds.newBuilder()
      .setLastFlushedSequenceId(resp.getLastFlushedSequenceId())
      .addAllStoreSequenceId(resp.getStoreLastFlushedSequenceIdList()).build();
  } catch (ServiceException e) {
    LOG.warn("Unable to connect to the master to check the last flushed sequence id", e);
    return RegionStoreSequenceIds.newBuilder().setLastFlushedSequenceId(HConstants.NO_SEQNUM)
      .build();
  }
}
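The rule for deciding that a WAL file is no longer needed can be written as a comparison over two per-region maps. The sketch below is a simplified illustration, not the code behind cleanOldLogs. WalArchiveSketch and canArchive are hypothetical names, and it assumes a region with no entry in the second map has nothing left unflushed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the "can this WAL file be archived?" rule.
final class WalArchiveSketch {
  /**
   * highestInFile: region -> newest sequenceId of that region's edits in the file.
   * oldestUnflushed: region -> oldest sequenceId still only in the memstore.
   * Assumption: a region missing from oldestUnflushed is fully flushed.
   */
  static boolean canArchive(Map<String, Long> highestInFile,
                            Map<String, Long> oldestUnflushed) {
    for (Map.Entry<String, Long> entry : highestInFile.entrySet()) {
      Long oldest = oldestUnflushed.get(entry.getKey());
      // The file is still needed if it may hold an edit that is not flushed yet.
      if (oldest != null && oldest <= entry.getValue()) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Long> highestInFile = new HashMap<>();
    highestInFile.put("regionA", 100L);
    highestInFile.put("regionB", 80L);

    Map<String, Long> oldestUnflushed = new HashMap<>();
    oldestUnflushed.put("regionA", 101L); // everything up to 100 is flushed
    oldestUnflushed.put("regionB", 75L);  // edits 75..80 are still unflushed
    System.out.println(canArchive(highestInFile, oldestUnflushed)); // prints false

    oldestUnflushed.put("regionB", 81L);  // regionB has now flushed past 80
    System.out.println(canArchive(highestInFile, oldestUnflushed)); // prints true
  }
}
```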
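5. WAL Deletion
Because WAL files are also used for cross-cluster replication, an invalidated WAL is not deleted immediately. It is first moved to the oldWALs directory. Deletion is the job of LogCleaner, a Chore thread in HMaster. Internally, LogCleaner uses the parameter {hbase.master.logcleaner.plugins} to load plugins that decide which log files may be deleted. The plugins currently configured are ReplicationLogCleaner, SnapshotLogCleaner, and TimeToLiveLogCleaner.
- TimeToLiveLogCleaner: a log file can be deleted once its last modification time is older than the configured {hbase.master.logcleaner.ttl, default 600 seconds}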
public static final String TTL_CONF_KEY = "hbase.master.logcleaner.ttl";
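The time-to-live rule is a single comparison on the file's modification time. Here is a minimal sketch of that check, with hypothetical names (TtlCleanerSketch, isDeletable) and all times expressed in milliseconds for illustration only.

```java
// Hypothetical illustration of a time-to-live check on an archived WAL file.
final class TtlCleanerSketch {
  /** A file is deletable once it has been unmodified for longer than the TTL. */
  static boolean isDeletable(long lastModifiedMs, long nowMs, long ttlMs) {
    return nowMs - lastModifiedMs > ttlMs;
  }

  public static void main(String[] args) {
    long ttlMs = 600L * 1000; // 600 seconds, the default quoted above
    long nowMs = 10_000_000L; // arbitrary "current time" for the example
    System.out.println(isDeletable(nowMs - 300_000L, nowMs, ttlMs)); // prints false
    System.out.println(isDeletable(nowMs - 900_000L, nowMs, ttlMs)); // prints true
  }
}
```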
- ReplicationLogCleaner: when cross-cluster replication is in use, this cleaner ensures that log files still being replicated are not deleted
import org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner;

public static void decorateMasterConfiguration(Configuration conf) {
  String plugins = conf.get(HBASE_MASTER_LOGCLEANER_PLUGINS);
  String cleanerClass = ReplicationLogCleaner.class.getCanonicalName();
  if (!plugins.contains(cleanerClass)) {
    conf.set(HBASE_MASTER_LOGCLEANER_PLUGINS, plugins + "," + cleanerClass);
  }
  if (ReplicationUtils.isReplicationForBulkLoadDataEnabled(conf)) {
    plugins = conf.get(HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS);
    cleanerClass = ReplicationHFileCleaner.class.getCanonicalName();
    if (!plugins.contains(cleanerClass)) {
      conf.set(HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS, plugins + "," + cleanerClass);
    }
  }
}
- SnapshotLogCleaner: WAL files still referenced by a table snapshot are not deleted
For details, see the snapshot implementation.
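The plugin-based filtering can be pictured as a chain of independent checks. The sketch below assumes that a file is deleted only when every configured plugin allows it, which is an assumption of this illustration and is not taken from the excerpts above. CleanerChainSketch and the file names are hypothetical, and each cleaner is reduced to a simple predicate.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical illustration of a cleaner chain; each plugin is a predicate.
final class CleanerChainSketch {
  /** Assumption: a file is deletable only if all plugins agree. */
  static boolean isDeletable(String walName, List<Predicate<String>> cleaners) {
    for (Predicate<String> cleaner : cleaners) {
      if (!cleaner.test(walName)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Set<String> stillReplicating = new HashSet<>(Arrays.asList("wal.1"));
    Set<String> usedBySnapshot = new HashSet<>(Arrays.asList("wal.2"));
    Set<String> pastTtl = new HashSet<>(Arrays.asList("wal.1", "wal.2", "wal.3"));

    List<Predicate<String>> cleaners = Arrays.asList(
        name -> !stillReplicating.contains(name), // replication-style check
        name -> !usedBySnapshot.contains(name),   // snapshot-style check
        pastTtl::contains);                       // time-to-live-style check

    System.out.println(isDeletable("wal.1", cleaners)); // prints false
    System.out.println(isDeletable("wal.2", cleaners)); // prints false
    System.out.println(isDeletable("wal.3", cleaners)); // prints true
  }
}
```

Modeling each plugin as a veto keeps them independent: adding a new cleaner can only make deletion more conservative.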