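As the database keeps running, memtables are gradually converted into sstables.
The number of sstables keeps growing, so they have to be merged. This merging is compaction.
The code calls MaybeScheduleCompaction() from several places, for example on key lookups and on writes to the DB,
to check whether a compaction is needed.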
void DBImpl::MaybeScheduleCompaction() {
  mutex_.AssertHeld();
  if (bg_compaction_scheduled_) {
    // Already scheduled
  } else if (shutting_down_.Acquire_Load()) {
    // DB is being deleted; no more background compactions
  } else if (imm_ == NULL &&
             manual_compaction_ == NULL &&
             !versions_->NeedsCompaction()) {
    // No work to be done
  } else {
    bg_compaction_scheduled_ = true;
    env_->Schedule(&DBImpl::BGWork, this);
  }
}

void DBImpl::BGWork(void* db) {
  reinterpret_cast<DBImpl*>(db)->BackgroundCall();
}

void DBImpl::BackgroundCall() {
  MutexLock l(&mutex_);
  assert(bg_compaction_scheduled_);
  if (!shutting_down_.Acquire_Load()) {
    BackgroundCompaction();
  }
  bg_compaction_scheduled_ = false;

  // Previous compaction may have produced too many files in a level,
  // so reschedule another compaction if needed.
  MaybeScheduleCompaction();
  bg_cv_.SignalAll();
}
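The scheduling logic above boils down to "keep at most one background job queued, and let each job reschedule itself if work remains". Below is a minimal single-threaded sketch of that pattern. SchedulerSketch, pending_work, jobs_run and RunAll are hypothetical names invented for this illustration; the real DBImpl holds mutex_ and hands the job to env_->Schedule() instead of a local queue.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for DBImpl: only the scheduling flags are modeled.
struct SchedulerSketch {
  bool bg_scheduled = false;   // plays the role of bg_compaction_scheduled_
  bool shutting_down = false;  // plays the role of shutting_down_
  int pending_work = 0;        // stands in for imm_, manual_compaction_, NeedsCompaction()
  int jobs_run = 0;            // number of background jobs that did work
  std::deque<std::function<void()>> queue;  // stands in for env_->Schedule()

  void MaybeSchedule() {
    if (bg_scheduled) return;       // already scheduled
    if (shutting_down) return;      // no more background work
    if (pending_work == 0) return;  // nothing to do
    bg_scheduled = true;
    queue.push_back([this] { BackgroundCall(); });
  }

  void BackgroundCall() {
    if (!shutting_down) {
      --pending_work;  // pretend one compaction was performed
      ++jobs_run;
    }
    bg_scheduled = false;
    MaybeSchedule();  // the finished job may have left more work behind
  }

  // Drains the queue on the calling thread (assumption: no real worker thread).
  void RunAll() {
    while (!queue.empty()) {
      std::function<void()> job = std::move(queue.front());
      queue.pop_front();
      job();
    }
  }
};
```

Calling MaybeSchedule() twice in a row queues only one job, because the second call sees bg_scheduled already set. With pending_work set to 3, RunAll() ends up running three jobs, each one queued by the previous.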
The function that actually performs the compaction is void DBImpl::BackgroundCompaction().
1 For a manually triggered compaction, a member of class DBImpl is filled in: ManualCompaction* manual_compaction_
struct ManualCompaction {
  int level;
  bool done;
  const InternalKey* begin;  // NULL means beginning of key range
  const InternalKey* end;    // NULL means end of key range
  InternalKey tmp_storage;   // Used to keep track of compaction progress
};
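If the caller specifies a very large key range, a single compaction could pull in too many sstable files. To avoid this, a manual compaction may not finish in one pass. The done flag records whether the whole range has been completed, and tmp_storage holds the end key reached by the previous pass, which becomes the start key of the next pass.
The specified begin and end keys are handed to versions_ so that the compaction can be set up: versions_->CompactRange(m->level, m->begin, m->end);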
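The done / tmp_storage bookkeeping can be sketched as below. This is a simplified model, not the LevelDB code: ManualRequestSketch, CompactOneRound and the per-round file-count cap are assumptions made for this illustration, and plain strings stand in for InternalKey.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified manual-compaction request.
struct ManualRequestSketch {
  bool done = false;        // true once the whole range has been processed
  std::size_t next = 0;     // index of the first file not yet compacted
  std::string tmp_storage;  // largest key reached so far; the next round resumes after it
};

// Processes at most max_files_per_round files from the sorted list of
// per-file largest keys and records progress in *m.
// Returns the number of files handled in this round.
std::size_t CompactOneRound(const std::vector<std::string>& file_largest_keys,
                            std::size_t max_files_per_round,
                            ManualRequestSketch* m) {
  const std::size_t end =
      std::min(m->next + max_files_per_round, file_largest_keys.size());
  const std::size_t processed = end - m->next;
  if (processed > 0) {
    m->tmp_storage = file_largest_keys[end - 1];  // remember where we stopped
  }
  m->next = end;
  m->done = (end == file_largest_keys.size());
  return processed;
}
```

With five files and a cap of two files per round, the request takes three rounds: done stays false after the first two and tmp_storage advances each time.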
2 When no manual compaction is pending, versions_->PickCompaction() selects the level and key range to compact
Compaction* VersionSet::PickCompaction() {
  Compaction* c;
  int level;

  // We prefer compactions triggered by too much data in a level over
  // the compactions triggered by seeks.
  const bool size_compaction = (current_->compaction_score_ >= 1);
  const bool seek_compaction = (current_->file_to_compact_ != NULL);
  if (size_compaction) {
    level = current_->compaction_level_;
    assert(level >= 0);
    assert(level+1 < config::kNumLevels);
    c = new Compaction(level);

    // Pick the first file that comes after compact_pointer_[level]
    for (size_t i = 0; i < current_->files_[level].size(); i++) {
      FileMetaData* f = current_->files_[level][i];
      if (compact_pointer_[level].empty() ||
          icmp_.Compare(f->largest.Encode(), compact_pointer_[level]) > 0) {
        c->inputs_[0].push_back(f);
        break;
      }
    }
    if (c->inputs_[0].empty()) {
      // Wrap-around to the beginning of the key space
      c->inputs_[0].push_back(current_->files_[level][0]);
    }
  } else if (seek_compaction) {
    level = current_->file_to_compact_level_;
    c = new Compaction(level);
    c->inputs_[0].push_back(current_->file_to_compact_);
  } else {
    return NULL;
  }

  c->input_version_ = current_;
  c->input_version_->Ref();

  // Files in level 0 may overlap each other, so pick up all overlapping ones
  if (level == 0) {
    InternalKey smallest, largest;
    GetRange(c->inputs_[0], &smallest, &largest);
    // Note that the next call will discard the file we placed in
    // c->inputs_[0] earlier and replace it with an overlapping set
    // which will include the picked file.
    current_->GetOverlappingInputs(0, &smallest, &largest, &c->inputs_[0]);
    assert(!c->inputs_[0].empty());
  }

  SetupOtherInputs(c);

  return c;
}
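The level-0 branch at the end of PickCompaction exists because level-0 files may overlap each other, so picking one file means every file that overlaps its key range has to come along. A rough single-pass sketch of that overlap test follows. FileRangeSketch and OverlappingFiles are hypothetical names, keys are plain strings compared bytewise, and the real GetOverlappingInputs works on FileMetaData and InternalKey and may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for FileMetaData: just the key range of one file.
struct FileRangeSketch {
  std::string smallest;
  std::string largest;
};

// Returns the indices of all files whose [smallest, largest] range
// intersects the inclusive range [begin, end].
std::vector<std::size_t> OverlappingFiles(
    const std::vector<FileRangeSketch>& files,
    const std::string& begin,
    const std::string& end) {
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < files.size(); ++i) {
    if (files[i].largest < begin) continue;  // file lies entirely before the range
    if (files[i].smallest > end) continue;   // file lies entirely after the range
    result.push_back(i);
  }
  return result;
}
```

For files covering a..e, d..h and m..q, querying the range of the first file returns the first two files, since d..h overlaps a..e, while m..q stays out.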
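PickCompaction decides which files to compact based on two triggers: too much data in a level (size), and a file that has been seeked too many times. Size-triggered compactions take priority.
It uses c = new Compaction(level) to record the level to compact and the pointers to the input files.
TODO: the actual compaction work, CompactMemTable() and DoCompactionWork()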
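The two decisions in PickCompaction, which trigger wins and which file to start from, can be isolated in a small sketch. Trigger, ChooseTrigger and PickFileAfterPointer are hypothetical helpers written for this note, with strings standing in for encoded internal keys and bytewise comparison assumed instead of icmp_.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Trigger { kNone, kSize, kSeek };

// Size-triggered compaction is preferred over seek-triggered compaction.
Trigger ChooseTrigger(double compaction_score, bool has_file_to_compact) {
  if (compaction_score >= 1) return Trigger::kSize;
  if (has_file_to_compact) return Trigger::kSeek;
  return Trigger::kNone;
}

// Returns the index of the first file whose largest key lies after
// compact_pointer, wrapping around to index 0 when no such file exists.
// Precondition: largest_keys is non-empty and sorted.
std::size_t PickFileAfterPointer(const std::vector<std::string>& largest_keys,
                                 const std::string& compact_pointer) {
  for (std::size_t i = 0; i < largest_keys.size(); ++i) {
    if (compact_pointer.empty() || largest_keys[i] > compact_pointer) {
      return i;
    }
  }
  return 0;  // wrap-around to the beginning of the key space
}
```

Remembering the pointer per level makes successive size compactions rotate through the key space instead of repeatedly picking the same file.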
References
《leveldb实现解析》, 那岩 (Taobao)