2021SC@SDUSC HBase (14) Project Code Analysis: WAL Lifecycle


I. Overview

The WAL lifecycle consists of four main stages: write, roll, invalidation, and deletion.

II. WAL Write

See: WAL Write in Detail

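III. WAL Rolling

Rolling the WAL avoids ending up with a single oversized WAL file. This simplifies later cleanup, because an expired log file can be deleted outright. It also helps recovery: several small log files can be parsed at the same time, which shortens the time recovery takes.
A WAL roll is triggered in the following scenarios: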

  1. After the SyncRunner thread handles a log sync, if an exception occurred, it calls requestLogRoll to request a log roll.
  protected final void requestLogRoll(final WALActionsListener.RollRequestReason reason) {
    // If we have already requested a roll, don't do it again
    // And only set rollRequested to true when there is a registered listener
    if (!this.listeners.isEmpty() && rollRequested.compareAndSet(false, true)) {
      for (WALActionsListener i : this.listeners) {
        i.logRollRequested(reason);
      }
    }
  }
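  2. After the SyncRunner thread handles a log sync, it checks whether the size of the WAL currently being written exceeds {hbase.regionserver.hlog.blocksize, default: the HDFS block size of the WAL directory} * {hbase.regionserver.logroll.multiplier, default 0.95 in HBase 1.x; HBase 2.x lowers it to 0.5}. If it does, requestLogRoll is likewise called to request a log roll.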
  public static final String WAL_ROLL_MULTIPLIER = "hbase.regionserver.logroll.multiplier";

  private boolean isOutstandingSyncsFromRunners() {
    // Look at SyncFutures in the SyncRunners
    for (SyncRunner syncRunner : syncRunners) {
      if (syncRunner.isAlive() && !syncRunner.areSyncFuturesReleased()) {
        return true;
      }
    }
    return false;
  }
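  3. Each RegionServer has a LogRoller thread that rolls the log periodically. The period is controlled by {hbase.regionserver.logroll.period, default 1 hour}.
    In the first two scenarios, requestLogRoll only issues the roll request; the roll itself is still carried out by LogRoller.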
protected static final String WAL_ROLL_PERIOD_KEY = "hbase.regionserver.logroll.period";
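
The size-based trigger and the "request at most once" guard described in this section can be sketched as follows. This is a minimal, self-contained illustration and not HBase's actual classes: WalRollSketch, RollListener, checkLogRoll and rollFinished are hypothetical names, and only the compareAndSet pattern mirrors the requestLogRoll snippet quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative only: mimics the roll-request pattern, not HBase's real WAL classes. */
class WalRollSketch {
    /** Hypothetical stand-in for WALActionsListener. */
    interface RollListener {
        void logRollRequested(String reason);
    }

    private final List<RollListener> listeners = new ArrayList<>();
    private final AtomicBoolean rollRequested = new AtomicBoolean(false);
    private final long logRollSize;

    WalRollSketch(long blockSize, float multiplier) {
        // Threshold = block size * multiplier, as described for scenario 2.
        this.logRollSize = (long) (blockSize * multiplier);
    }

    void addListener(RollListener listener) {
        listeners.add(listener);
    }

    /** Scenario 2: request a roll once the current file grows past the threshold. */
    boolean checkLogRoll(long currentFileLength) {
        if (currentFileLength > logRollSize) {
            return requestLogRoll("SIZE");
        }
        return false;
    }

    /** Notifies listeners only if no request is pending; returns true if one was issued. */
    boolean requestLogRoll(String reason) {
        if (!listeners.isEmpty() && rollRequested.compareAndSet(false, true)) {
            for (RollListener listener : listeners) {
                listener.logRollRequested(reason);
            }
            return true;
        }
        return false;
    }

    /** Assumed hook: the roller clears the flag after switching to a new file. */
    void rollFinished() {
        rollRequested.set(false);
    }

    public static void main(String[] args) {
        WalRollSketch wal = new WalRollSketch(128L * 1024 * 1024, 0.95f);
        wal.addListener(reason -> System.out.println("roll requested: " + reason));
        wal.checkLogRoll(100L * 1024 * 1024); // below the threshold, nothing happens
        wal.checkLogRoll(130L * 1024 * 1024); // above the threshold, listeners are notified
        wal.checkLogRoll(130L * 1024 * 1024); // request already pending, ignored
    }
}
```

The atomic flag is what lets several SyncRunner threads call the request method concurrently without flooding the roller with duplicate requests.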

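IV. WAL Invalidation

Once the data in the memstore has been flushed to HDFS, the corresponding WAL entries are no longer needed. FSHLog tracks, for each region, the oldest sequenceId that is still held only in the memstore. If, for every region that appears in a log file, the newest sequenceId in that file is smaller than the oldest unflushed sequenceId tracked for that region, the file is no longer needed and is moved from the ./WALs directory to the ./oldWALs directory. This is handled by cleanOldLogs, which is called after a log roll completes.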

  public RegionStoreSequenceIds getLastSequenceId(byte[] encodedRegionName) {
    try {
      GetLastFlushedSequenceIdRequest req =
          RequestConverter.buildGetLastFlushedSequenceIdRequest(encodedRegionName);
      RegionServerStatusService.BlockingInterface rss = rssStub;
      if (rss == null) { // Try to connect one more time
        createRegionServerStatusStub();
        rss = rssStub;
        if (rss == null) {
          // Still no luck, we tried
          LOG.warn("Unable to connect to the master to check " + "the last flushed sequence id");
          return RegionStoreSequenceIds.newBuilder().setLastFlushedSequenceId(HConstants.NO_SEQNUM)
              .build();
        }
      }
      GetLastFlushedSequenceIdResponse resp = rss.getLastFlushedSequenceId(null, req);
      return RegionStoreSequenceIds.newBuilder()
          .setLastFlushedSequenceId(resp.getLastFlushedSequenceId())
          .addAllStoreSequenceId(resp.getStoreLastFlushedSequenceIdList()).build();
    } catch (ServiceException e) {
      LOG.warn("Unable to connect to the master to check the last flushed sequence id", e);
      return RegionStoreSequenceIds.newBuilder().setLastFlushedSequenceId(HConstants.NO_SEQNUM)
          .build();
    }
  }
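
The invalidation rule from this section can be sketched as a comparison of two maps. This is an assumption-laden illustration rather than the code in cleanOldLogs: WalArchiveSketch and canArchive are hypothetical names, regions are keyed by plain strings, and a region with no entry in the unflushed map is assumed to have nothing left in its memstore.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: the "is this WAL file still needed?" rule, not HBase's implementation. */
class WalArchiveSketch {
    /**
     * @param highestInFile   region name -> newest sequenceId written to this WAL file
     * @param lowestUnflushed region name -> oldest sequenceId still only in the memstore
     * @return true if every region's edits in the file have already been flushed
     */
    static boolean canArchive(Map<String, Long> highestInFile,
                              Map<String, Long> lowestUnflushed) {
        for (Map.Entry<String, Long> entry : highestInFile.entrySet()) {
            Long oldestUnflushed = lowestUnflushed.get(entry.getKey());
            // Assumption: a missing entry means the region has no unflushed edits.
            if (oldestUnflushed != null && entry.getValue() >= oldestUnflushed) {
                return false; // the file still holds edits that exist only in the memstore
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> highestInFile = new HashMap<>();
        highestInFile.put("regionA", 120L);
        highestInFile.put("regionB", 95L);

        Map<String, Long> lowestUnflushed = new HashMap<>();
        lowestUnflushed.put("regionA", 150L);
        lowestUnflushed.put("regionB", 90L); // regionB has not flushed past this file yet

        System.out.println(canArchive(highestInFile, lowestUnflushed));
        lowestUnflushed.put("regionB", 100L); // after regionB flushes
        System.out.println(canArchive(highestInFile, lowestUnflushed));
    }
}
```

A single lagging region is enough to keep the whole file in ./WALs, which is why a rarely flushed region can pin many old log files.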

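V. WAL Deletion

Because WAL files are also used for cross-cluster replication, an invalidated WAL is not deleted immediately; it is first moved to the oldWALs directory. Deletion is handled by LogCleaner, a Chore thread in the HMaster. LogCleaner uses the plugins listed in {hbase.master.logcleaner.plugins} to select the log files that may be deleted. The plugins currently configured are ReplicationLogCleaner, SnapshotLogCleaner and TimeToLiveLogCleaner.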

  1. TimeToLiveLogCleaner: a log file may be deleted once its last modification time is older than the configured TTL {hbase.master.logcleaner.ttl, default 600 seconds}.
  public static final String TTL_CONF_KEY = "hbase.master.logcleaner.ttl";
  2. ReplicationLogCleaner: when cross-cluster replication is in use, this cleaner ensures that log files still being replicated are not deleted.
import org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner;
public static void decorateMasterConfiguration(Configuration conf) {
    String plugins = conf.get(HBASE_MASTER_LOGCLEANER_PLUGINS);
    String cleanerClass = ReplicationLogCleaner.class.getCanonicalName();
    if (!plugins.contains(cleanerClass)) {
      conf.set(HBASE_MASTER_LOGCLEANER_PLUGINS, plugins + "," + cleanerClass);
    }
    if (ReplicationUtils.isReplicationForBulkLoadDataEnabled(conf)) {
      plugins = conf.get(HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS);
      cleanerClass = ReplicationHFileCleaner.class.getCanonicalName();
      if (!plugins.contains(cleanerClass)) {
        conf.set(HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS, plugins + "," + cleanerClass);
      }
    }
  }
  3. SnapshotLogCleaner: WAL files that are used by a table snapshot are not deleted.
    See: Snapshot implementation.
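
The plugin-based filtering described in this section can be sketched as a chain of votes in which a file is deleted only if every cleaner agrees. The sketch below is a simplified model, not the HBase API: LogCleanerSketch, OldWal, CleanerDelegate and the two factory methods are hypothetical, and the replication check is reduced to a set lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: a cleaner chain where every plugin must approve a deletion. */
class LogCleanerSketch {
    /** Hypothetical minimal view of a file in the oldWALs directory. */
    static final class OldWal {
        final String name;
        final long modificationTimeMs;

        OldWal(String name, long modificationTimeMs) {
            this.name = name;
            this.modificationTimeMs = modificationTimeMs;
        }
    }

    /** Hypothetical plugin interface: each cleaner votes on a single file. */
    interface CleanerDelegate {
        boolean isDeletable(OldWal file, long nowMs);
    }

    /** TTL rule: deletable once the file has been unmodified for longer than ttlMs. */
    static CleanerDelegate ttlCleaner(long ttlMs) {
        return (file, nowMs) -> nowMs - file.modificationTimeMs > ttlMs;
    }

    /** Replication rule (simplified): keep files that a peer still has to read. */
    static CleanerDelegate replicationCleaner(Set<String> stillReplicating) {
        return (file, nowMs) -> !stillReplicating.contains(file.name);
    }

    /** Returns the names of files that every cleaner in the chain allows to be deleted. */
    static List<String> deletable(List<OldWal> files, List<CleanerDelegate> chain, long nowMs) {
        List<String> result = new ArrayList<>();
        for (OldWal file : files) {
            boolean allAgree = true;
            for (CleanerDelegate cleaner : chain) {
                if (!cleaner.isDeletable(file, nowMs)) {
                    allAgree = false;
                    break;
                }
            }
            if (allAgree) {
                result.add(file.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        List<OldWal> files = Arrays.asList(
            new OldWal("wal-1", 0L),        // old and fully replicated
            new OldWal("wal-2", 0L),        // old but still being replicated
            new OldWal("wal-3", 900_000L)); // too recent for the TTL rule
        Set<String> replicating = new HashSet<>(Arrays.asList("wal-2"));
        List<CleanerDelegate> chain =
            Arrays.asList(ttlCleaner(600_000L), replicationCleaner(replicating));
        System.out.println(deletable(files, chain, now));
    }
}
```

Treating each plugin as a veto keeps the cleaners independent: adding a new retention rule only requires appending another delegate to the chain.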