Contents

dict

Functions

Adding

dictAdd
dictAddRaw

Lookup

_dictKeyIndex
dictAddOrFind
dictFind
dictFetchValue

Deleting
Updating
Releasing

About the author

dict

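We have finally reached the last article on hash tables, and we are finally back to source code. Once you understand the principles from the previous two articles, reading the source is really just watching those principles turn into code. You could also call it a check on the principles we described. Enough talk, let's get to it.

Functions

Every data structure revolves around add, delete, update, and find, provided that the structure has been created first. In C we also have to release the memory the structure occupies once it is no longer used.

Adding

A cup is for drinking water, but what if it is empty? Fill it first. So we start with the source code for adding.

dictAdd

Taking the high-level view, start from the topmost function. It calls dictAddRaw to add a key-value pair and, if that succeeds, sets the pair's value to the value passed in. So dictAddRaw creates the entry and sets its key, but leaves the value uninitialized; dictAdd fills the value in afterwards.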

int dictAdd(dict *d, void *key, void *val)
{
    dictEntry *entry = dictAddRaw(d,key,NULL);

    if (!entry) return DICT_ERR;
    dictSetVal(d, entry, val);
    return DICT_OK;
}

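dictAddRaw

This is the function that actually performs the insertion. Let's see whether it matches the principles San'er explained earlier.

Before anything else, check whether a rehash is in progress; if so, perform one step of rehash work.
Get the array index the key maps to after hashing; -1 means the key already exists, in which case return.
Pick the hash table to operate on depending on whether a rehash is in progress: table 0 if not, table 1 otherwise.
Insert the newly created entry at the head of the chain in the corresponding bucket.
Update the used member, which is later used to decide whether a rehash is needed.
Set the entry's key.
Set the entry's value (this step happens in dictAdd).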

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    dictht *ht;

    if (dictIsRehashing(d)) _dictRehashStep(d);

    /* Get the index of the new element, or -1 if
     * the element already exists. */
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

    /* Allocate the memory and store the new entry.
     * Insert the element in top, with the assumption that in a database
     * system it is more likely that recently added entries are accessed
     * more frequently. */
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    /* Set the hash entry fields. */
    dictSetKey(d, entry, key);
    return entry;
}
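The head-insertion step can be tried in isolation. The following is only a minimal sketch under simplifying assumptions: toy_entry, toy_table and toy_add_head are hypothetical names, the table has a fixed size of 4, and there is no duplicate check or rehash, unlike the real dictAddRaw.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types, not the real Redis dictEntry/dictht. */
typedef struct toy_entry {
    int key;
    struct toy_entry *next;
} toy_entry;

typedef struct toy_table {
    toy_entry *bucket[4];   /* fixed size, a power of two */
    unsigned long used;     /* number of stored entries */
} toy_table;

/* Insert key at the head of its bucket's chain. Returns the new entry,
 * or NULL if allocation fails. No duplicate check is done here. */
toy_entry *toy_add_head(toy_table *t, int key) {
    assert(t != NULL);
    unsigned long idx = (unsigned long)key & 3UL;  /* size - 1 as the mask */
    toy_entry *e = malloc(sizeof(*e));
    if (e == NULL) return NULL;
    e->key = key;
    e->next = t->bucket[idx];   /* new entry points at the old head */
    t->bucket[idx] = e;         /* bucket now starts with the new entry */
    t->used++;
    return e;
}
```

Adding keys 1 and then 5 puts both in bucket 1, with 5 in front of 1: the most recently added entry is the first one a later lookup meets, and the insert itself never walks the chain.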

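Lookup

Many hash table operations depend on lookup, so let's look at the lookup-related functions next.

_dictKeyIndex

dictAddRaw calls _dictKeyIndex, so it comes first here, to keep one operation connected from end to end.

Expand the hash table first if needed; if that fails, return -1.
Use the hash value and the mask to get the array index to map to.
Walk the corresponding chain; if the key already exists, record it and return, otherwise go on to the next step.
If a rehash is in progress the key may be in table 1, so repeat the same steps on table 1.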

static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
{
    unsigned long idx, table;
    dictEntry *he;
    if (existing) *existing = NULL;

    /* Expand the hash table if needed */
    if (_dictExpandIfNeeded(d) == DICT_ERR)
        return -1;
    for (table = 0; table <= 1; table++) {
        idx = hash & d->ht[table].sizemask;
        /* Search if this slot does not already contain the given key */
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                if (existing) *existing = he;
                return -1;
            }
            he = he->next;
        }
        if (!dictIsRehashing(d)) break;
    }
    return idx;
}
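The expression hash & sizemask is worth a closer look. Here is a small sketch, assuming the table size is a power of two and sizemask is size - 1, as the code above implies; toy_index is a hypothetical helper, not a Redis function.

```c
#include <assert.h>
#include <stdint.h>

/* Map a hash to a bucket index with a bit mask.
 * Assumes size is a power of two, so size - 1 has all low bits set
 * and the mask gives the same result as hash % size. */
unsigned long toy_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

With hash 13 the index is 5 in a table of size 8 but 13 in a table of size 16. The two tables have different sizes during a rehash, which is why the loop recomputes idx for each table instead of reusing it.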

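Reading _dictKeyIndex, San'er has a nagging feeling that yesterday's article has a bug: the order in which the two tables are searched during a rehash was written backwards. Please keep an eye on me. San'er does read each article three or four times before publishing, but is still careless.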

dictAddOrFind

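This function shows that dictAddRaw can do more than add: it can also perform a lookup, and lookups are everywhere in the dict algorithms. In other words, Redis does not call _dictKeyIndex directly every time it needs an existence check; it goes through dictAddRaw instead. One reason is that this design lets dictAddRaw reuse the _dictKeyIndex code. Another is that dictAddRaw hands back an entry either way (the new one as its return value, or the one already present through the existing parameter), so the caller can tell whether the entry was newly added or already existed. That is more convenient for other operations; exactly how will show up in the section on updating.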

dictEntry *dictAddOrFind(dict *d, void *key) {
    dictEntry *entry, *existing;
    entry = dictAddRaw(d,key,&existing);
    return entry ? entry : existing;
}
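The out-parameter pattern can be reproduced on a plain linked list. This is a sketch only: toy_node, toy_add_raw and toy_add_or_find are hypothetical names that mirror the shape of dictAddRaw and dictAddOrFind without any hashing or rehashing.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_node {
    int key;
    struct toy_node *next;
} toy_node;

/* Returns the new node if key was absent. If key was present, returns
 * NULL and stores the match in *existing (when existing is non-NULL).
 * Also returns NULL on allocation failure, with *existing left NULL. */
toy_node *toy_add_raw(toy_node **head, int key, toy_node **existing) {
    assert(head != NULL);
    if (existing) *existing = NULL;
    for (toy_node *n = *head; n != NULL; n = n->next) {
        if (n->key == key) {
            if (existing) *existing = n;
            return NULL;
        }
    }
    toy_node *n = malloc(sizeof(*n));
    if (n == NULL) return NULL;
    n->key = key;
    n->next = *head;   /* head insertion, as in the add path */
    *head = n;
    return n;
}

/* Same shape as dictAddOrFind: whichever of the two pointers is set. */
toy_node *toy_add_or_find(toy_node **head, int key) {
    toy_node *existing;
    toy_node *n = toy_add_raw(head, key, &existing);
    return n ? n : existing;
}
```

Calling toy_add_or_find twice with the same key returns the same node both times: the first call creates it, the second finds it.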

dictFind

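dictAddOrFind is quite powerful, but if we already know we only want a lookup, is that function the only option? Of course not. Read on.

If the dict is empty, there is nothing to do but report that the key does not exist.
If a rehash is in progress, do a step of it first; efficiency matters, after all.
Compute the hash, then search the chain in the matching bucket of each hash table.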

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    if (d->ht[0].used + d->ht[1].used == 0) return NULL; /* dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}
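To see the two-table loop at work, here is a stripped-down sketch. The types toy_entry, toy_ht and toy_dict and the function toy_find are hypothetical; keys are plain ints used directly as their own hash, and the rehashing flag is set by hand rather than by a real rehash.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_entry { int key; struct toy_entry *next; } toy_entry;
typedef struct toy_ht { toy_entry **table; unsigned long size; } toy_ht;
typedef struct toy_dict { toy_ht ht[2]; int rehashing; } toy_dict;

/* Search table 0, and table 1 as well only while a rehash is in progress. */
toy_entry *toy_find(toy_dict *d, int key) {
    assert(d != NULL);
    for (int t = 0; t <= 1; t++) {
        toy_ht *ht = &d->ht[t];
        if (ht->size != 0) {
            /* each table has its own size, so the index is recomputed */
            unsigned long idx = (unsigned long)key & (ht->size - 1);
            for (toy_entry *e = ht->table[idx]; e != NULL; e = e->next)
                if (e->key == key) return e;
        }
        if (!d->rehashing) break;   /* table 1 is unused outside a rehash */
    }
    return NULL;
}
```

One loop body serves both the normal case and the rehash case; the early exit after table 0 is the only place where the two differ.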

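If nothing else: had San'er written dictFind, there would have been an if right after the call to dictHashKey, rather than a loop over both tables with an early exit. Is what Redis does here hard? Not at all, it just would not have occurred to me. San'er thinks it is not about being smart or not, only about whether you have seen it before. Experts excepted, of course.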

dictFetchValue

The point of showing this function is to raise a question.

void *dictFetchValue(dict *d, const void *key) {
    dictEntry *he;

    he = dictFind(d,key);
    return he ? dictGetVal(he) : NULL;
}

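dictFetchValue also starts by checking whether the entry exists. Could it use dictAddRaw here instead? Both hand back an entry, so at first glance nothing rules it out. So why does lookup sometimes go through dictAddRaw and sometimes through dictFind? Judging from the functions analyzed here, dictAddRaw does the existence check in functions related to adding, while dictFind does it in the others. One concrete difference is that dictAddRaw inserts a new entry when the key is absent, which is what the add-related callers want and what a pure lookup does not. Beyond that, San'er thinks it is a matter of readability; code is written for people to read, after all. If you understand it differently, please share: leave a message on the account backend or join QQ group 521625004.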

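Deleting

Deleting means looking the key up and, if it exists, removing it from the chain.

If the dict is empty, return right away without wasting time.
For efficiency, if a rehash is in progress, do a step of it first.
Find the matching entry and unlink it from the chain.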

int dictDelete(dict *ht, const void *key) {
    return dictGenericDelete(ht,key,0) ? DICT_OK : DICT_ERR;
}

static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {
    uint64_t h, idx;
    dictEntry *he, *prevHe;
    int table;

    if (d->ht[0].used == 0 && d->ht[1].used == 0) return NULL;

    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);

    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        prevHe = NULL;
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                /* Unlink the element from the list */
                if (prevHe)
                    prevHe->next = he->next;
                else
                    d->ht[table].table[idx] = he->next;
                if (!nofree) {
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                    zfree(he);
                }
                d->ht[table].used--;
                return he;
            }
            prevHe = he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) break;
    }
    return NULL; /* not found */
}
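The unlink step with a trailing prev pointer can be isolated as follows. This is a sketch with hypothetical names (toy_node, toy_unlink); it always returns the node without freeing it, which roughly corresponds to calling dictGenericDelete with a non-zero nofree.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_node { int key; struct toy_node *next; } toy_node;

/* Unlink the first node with the given key and return it (not freed),
 * or return NULL if no node has that key. */
toy_node *toy_unlink(toy_node **head, int key) {
    assert(head != NULL);
    toy_node *prev = NULL;
    for (toy_node *n = *head; n != NULL; prev = n, n = n->next) {
        if (n->key != key) continue;
        if (prev)
            prev->next = n->next;   /* middle or tail: bypass the node */
        else
            *head = n->next;        /* head: the bucket gets a new first node */
        n->next = NULL;
        return n;
    }
    return NULL;
}
```

The two branches match the prevHe test in the Redis code: removing the first node of a chain has to update the bucket slot itself, while removing any other node only touches its predecessor.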

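dictGenericDelete follows basically the same flow as dictFind; only the core operation differs. That raises a question: when the dict is not empty, why not check whether the key exists right after the rehash step? San'er guesses it is for efficiency: with a separate lookup first, a key that does exist would be searched for twice, which is slower. This is only a guess, of course. If that is the goal, San'er thinks there is another way to get it.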

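type node struct {
    key  int
    next *node
}

// opt is the core operation to run on the matching node.
type opt func(*node)

// traverse walks the list and applies fn to the first node whose key matches.
func traverse(n *node, key int, fn opt) {
    for ; n != nil; n = n.next {
        if n.key == key {
            fn(n)
            return
        }
    }
}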

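Iterate over the chain with a function that takes a function pointer and a key, and abstract the core operation into that function pointer. The wording here is loose: the concepts are C concepts, but the sketch above is written in golang, where the proper term is a function type, since functions are first-class citizens.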
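Since the concept is a C one, the same idea can also be written with a real C function pointer. This is only a sketch of the suggestion, not how Redis does it: toy_node, toy_op, toy_apply and toy_set_val are hypothetical names, and the extra ctx argument is an assumption added so the operation can receive data.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_node { int key; int val; struct toy_node *next; } toy_node;

/* The core operation, abstracted as a function pointer. */
typedef void (*toy_op)(toy_node *n, void *ctx);

/* Walk the chain once and apply op to the first node whose key matches.
 * Returns 1 if a node was found, 0 otherwise. */
int toy_apply(toy_node *head, int key, toy_op op, void *ctx) {
    assert(op != NULL);
    for (toy_node *n = head; n != NULL; n = n->next) {
        if (n->key == key) {
            op(n, ctx);
            return 1;
        }
    }
    return 0;
}

/* Example operation: overwrite the value with the int that ctx points to. */
void toy_set_val(toy_node *n, void *ctx) {
    n->val = *(int *)ctx;
}
```

The traversal is written once and the chain is walked once per call. An operation such as unlinking would also need the previous node, so a fuller version would have to pass that along as well.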

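Updating

The dictAddOrFind section promised that the other use of dictAddRaw would be covered under updating. That other use is the existence check. The question raised earlier also noted that dictAddRaw's existence check is mostly used in add-related operations. Redis does not separate updating from adding: if the key is missing when updating, it is added first. So the existence check here uses dictAddRaw, which further supports San'er's earlier guess.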

int dictReplace(dict *d, void *key, void *val)
{
    dictEntry *entry, *existing, auxentry;

    /* Try to add the element. If the key
     * does not exists dictAdd will succeed. */
    entry = dictAddRaw(d,key,&existing);
    if (entry) {
        dictSetVal(d, entry, val);
        return 1;
    }

    /* Set the new value and free the old one. Note that it is important
     * to do that in this order, as the value may just be exactly the same
     * as the previous one. In this context, think to reference counting,
     * you want to increment (set), and then decrement (free), and not the
     * reverse. */
    auxentry = *existing;
    dictSetVal(d, existing, val);
    dictFreeVal(d, &auxentry);
    return 0;
}
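The comment about ordering is easier to see with a toy reference count. The sketch below is an illustration under assumptions: toy_obj, toy_incr, toy_decr and toy_replace are hypothetical, and setting a freed flag stands in for actually releasing memory.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_obj { int refcount; int freed; } toy_obj;

void toy_incr(toy_obj *o) {
    assert(o != NULL);
    o->refcount++;
}

void toy_decr(toy_obj *o) {
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) o->freed = 1;  /* stands in for a real free */
}

/* Replace *slot with val: take the new reference first, then drop the old. */
void toy_replace(toy_obj **slot, toy_obj *val) {
    toy_obj *old = *slot;
    toy_incr(val);            /* set: the slot now holds a reference to val */
    *slot = val;
    if (old) toy_decr(old);   /* free: give up the slot's previous reference */
}
```

When the new value is the same object as the old one, the count goes from 1 to 2 and back to 1, so the object survives. Decrementing first would take it to 0 and release it before it is stored again.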

Releasing

The steps are simple: free every node, then finally free the structure itself.

void dictRelease(dict *d)
{
    _dictClear(d,&d->ht[0],NULL);
    _dictClear(d,&d->ht[1],NULL);
    zfree(d);
}

int _dictClear(dict *d, dictht *ht, void(callback)(void *)) {
    unsigned long i;

    /* Free all the elements */
    for (i = 0; i < ht->size && ht->used > 0; i++) {
        dictEntry *he, *nextHe;

        if (callback && (i & 65535) == 0) callback(d->privdata);

        if ((he = ht->table[i]) == NULL) continue;
        while(he) {
            nextHe = he->next;
            dictFreeKey(d, he);
            dictFreeVal(d, he);
            zfree(he);
            ht->used--;
            he = nextHe;
        }
    }
    /* Free the table and the allocated cache structure */
    zfree(ht->table);
    /* Re-initialize the table */
    _dictReset(ht);
    return DICT_OK; /* never fails */
}
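The inner while loop reads he->next before freeing he, because the node's memory must not be touched after it is released. A minimal sketch of the same pattern on a single chain, with the hypothetical names toy_node and toy_clear:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_node { int key; struct toy_node *next; } toy_node;

/* Free every node in the chain and reset the head.
 * Returns how many nodes were freed. */
unsigned long toy_clear(toy_node **head) {
    assert(head != NULL);
    unsigned long freed = 0;
    toy_node *n = *head;
    while (n != NULL) {
        toy_node *next = n->next;  /* save the link before freeing the node */
        free(n);
        freed++;
        n = next;
    }
    *head = NULL;                  /* leave the chain in a reusable empty state */
    return freed;
}
```

Resetting the head at the end plays the same role as _dictReset in the Redis code: clearing an already empty chain is then a harmless no-op.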

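About the author

A senior-year undergraduate who writes about data structures, interview questions, golang, C, and related topics. QQ group: 521625004. WeChat official account: 后台技术栈.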

redis-dict (final part)