redis中的哈希表和渐近式rehash（redis6.0.15）

2022-10-14 13:27:04

首先，redis中的哈希表的数据结构是这样的。

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

table成员是一个dictEntry类型的二级指针，为什么是二级指针呢？dictht又是什么类型呢？

size成员是目前哈希表的总大小

sizemask是什么？这个后面会讲到。

used是目前这个哈希表中以及容纳了多少数据。

dictht的类型如下：

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

从dictEntry.next元素中就可以看出，这是一个链表，并且，key和val分别对应哈希表的键和值。因此，可以很清晰地推断除redis是采用开链法来解决哈希冲突的。

回到刚才的问题来，dictht.table成员是个什么东西呢？我这边画了一个图。

这样的话，dictht的结构就很清楚了。

rehash

因为我们知道，redis中哈希表的元素可能是非常多的，如果采用常规的rehash的方式，可能会导致进程阻塞，这样就没办法很快的响应用户的请求了。为了解决这个问题，redis采用了渐进式rehash。redis在dictht上又做了一层封装，可以看下面的代码：

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;

ht[2]表示有两个dictht，这也是渐进式rehash的由来。src/dict.c文件中存在一个私有的函数，这个函数就是用来判断是否需要进行rehash。重点在14行。

/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
     * table (global setting) or we should avoid it but the ratio between
     * elements/buckets is over the "safe" threshold, we resize doubling
     * the number of buckets. */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);	// 二倍扩容，
    }
    return DICT_OK;
}

这个函数的逻辑是这样的，如果如果used >= size，说明负载因子大于1了，并且dict_can_resize标志不为0或者是used超过size的5倍，那么这个时候就进行rehash。这个时候可能还是有点不太清楚，下面再看一段代码和注释就很清楚了。

/* Using dictEnableResize() / dictDisableResize() we make possible to
 * enable/disable resizing of the hash table as needed. This is very important
 * for Redis, as we use copy-on-write and don't want to move too much memory
 * around when there is a child performing saving operations.
 *
 * Note that even when dict_can_resize is set to 0, not all resizes are
 * prevented: a hash table is still allowed to grow if the ratio between
 * the number of elements and the buckets > dict_force_resize_ratio. */
static int dict_can_resize = 1;
static unsigned int dict_force_resize_ratio = 5;

**前面一段表达的意思是使用dictEnableResize 或者 dictDisableResize函数可以对全局变量dict_can_resize进行设置。另外还有一点就是，如果redis正在进行持久化的时候，不希望进行resize，这个时候会调用dictDisableResize函数对rehash操作进行抑制。但也不是禁止rehash，如果超过dict_force_resize_ratio设定的阈值，那么这个时候就要强制进行rehash了。因此dict_force_resize_ratio这个变量是一个容忍阈值。**也可以看一下这段英文注释，还是很幽默的哈哈哈哈。

expand函数主要是创建一个新的hashht，并且分配初始值。

/* Expand or create the hash table */
int dictExpand(dict *d, unsigned long size)
{
    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size);	// 这个函数就是对4不断乘2，直到超过size后返回，当然需要考虑溢出

    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

_dictNextPower函数返回的是一个realsize,它的值等于2的整数幂，16行中，把realsize的值赋给了size，然后将realsize-1的值赋给了sizemask，到这里，sizemask的用处就比较清楚了。举个例子，比如有一个二进制的值 100000，减一就得到011111，那么这个时候用一个k & sizemask就得到哈希槽的索引了。当然，sizemask还有其他的用处，比较复杂。

设置好dictht结构体中成员的初始值之后，就将其赋值给了d->ht[1]了，作为新的hash表，同时rehashidx设置为0，表示这个时候有正在进行渐进式rehash的处理。在这个过程中，ht[0]不会被释放，也不会进行insert插入数据，所有的插入数据会插入到ht[1]中去。redis会在特定的时候对ht[0]中的元素搬运到ht[1]中去，知道ht[0]的used变量为0，这个时候才会进行释放。

以下是这个过程的代码：

/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0 is returned.
 *
 * Note that a rehashing step consists in moving a bucket (that may have more
 * than one key as we use chaining) from the old to the new hash table, however
 * since part of the hash table may be composed of empty spaces, it is not
 * guaranteed that this function will rehash even a single bucket, since it
 * will visit at max N*10 empty buckets in total, otherwise the amount of
 * work it does would be unbound and the function may block for a long time. */
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash HT */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

从注释来看，n指的是桶的个数，并且不管桶里面是否有元素，都只遍历n * 10个桶就结束（因为如果很多桶都为空的话，那么可能要遍历很久，会对性能产生影响）。43行的代码就是检查gt[0]是否为空，如果为空，那么就将原来的空间释放掉，将ht[1]赋值为ht[0]，rehashidx设置为-1，表示rehash过程结束。

另外还有一点，rehashidx并不是只有0 和 -1这两个值的， rehashidx还用作桶的索引，仔细阅读14行开始的代码就可以发现。

码农公寓

rehash

相关文章