Learning Java: Mastering HashMap

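Over the past few days I have been studying the underlying implementation of HashMap. There are plenty of blog posts on the subject, but after comparing several I did not find one that summarizes it comprehensively, so I wrote this post in several sittings while reading the source code. If I have misunderstood anything, corrections are welcome!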
[Figure: the Map structure]
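In everyday programming, HashMap is one of the most frequently used container classes. It permits null as both key and value. Apart from being unsynchronized and permitting nulls, it is roughly equivalent to Hashtable. Unlike TreeMap, it makes no guarantee about element order: the container may rehash its elements when needed, which reshuffles them, so iterating over the same HashMap at different times may yield different orders.
At its core, HashMap is still a hash array, and inserting an element may cause a hash collision.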
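
As a small illustration of the null-handling difference between HashMap and Hashtable described above, here is a minimal sketch. The class name `NullHandlingDemo` and the helper `acceptsNullKey` are made up for this example; only `HashMap`, `Hashtable`, and `Map` come from the JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullHandlingDemo {
    /** Hypothetical helper: true if the map accepts a null key, false if put throws NullPointerException. */
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "null key");   // a null key is allowed
        hashMap.put("k", null);          // a null value is allowed too
        System.out.println(hashMap.get(null));          // null key
        System.out.println(hashMap.containsKey("k"));   // true

        // Hashtable rejects null keys (and null values) with NullPointerException.
        System.out.println(acceptsNullKey(new Hashtable<>()));  // false
    }
}
```

Because `get` returns null both for a missing key and for a key mapped to null, `containsKey` is the way to tell the two cases apart.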

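In JDK 1.7, HashMap consists mainly of an array plus linked lists: when a hash collision occurs, the colliding element is placed in a linked list.
Since JDK 1.8, HashMap is implemented with an array, linked lists, and red-black trees, which adds the red-black tree compared with JDK 1.7. When a linked list grows beyond 8 nodes (and the array length is at least 64), the list is converted into a red-black tree.
Before opening the HashMap source code, let's review some fundamentals.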

Part 1: Fundamentals

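HashMap's data structure is built from both an array and linked lists.
Array structure
An array stores its elements contiguously in memory. Its advantage: because the addresses are contiguous, an element can be located directly from its index, so lookups by index are fast. Its drawback is just as clear: inserting or deleting an element requires shifting many other elements.
Linked list structure
A linked list allocates memory dynamically. Unlike an array, it does not need its size reserved in advance; nodes are allocated when needed and released when removed, so adding, deleting, and inserting data is more flexible than with an array. The nodes can also sit anywhere in memory and are connected through references. The drawback is that, because the addresses are not contiguous, a lookup has to walk the list and is therefore slow.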

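So, is there a data structure that offers both fast lookups and fast insertions and deletions? Yes, there is.
Hash table: the fast indexing of an array combined with the dynamic growth of a linked list.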
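
To make the "array plus linked list" idea concrete, here is a toy hash table using separate chaining. It is a teaching sketch under simplifying assumptions (fixed capacity, non-null `String` keys, `int` values, no resizing), not how `java.util.HashMap` is actually written; the class `ToyHashTable` and its members are invented for this example.

```java
import java.util.LinkedList;

/** Toy fixed-size hash table with separate chaining (hypothetical teaching class). */
class ToyHashTable {
    private static final class Entry {
        final String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    // The array gives fast indexing; each slot holds a linked list of colliding entries.
    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ToyHashTable(int capacity) {
        buckets = new LinkedList[capacity];
    }

    // Map a key to a slot; floorMod keeps the index non-negative for negative hash codes.
    private int indexFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    void put(String key, int value) {
        int i = indexFor(key);
        if (buckets[i] == null) buckets[i] = new LinkedList<>();
        for (Entry e : buckets[i]) {
            if (e.key.equals(key)) { e.value = value; return; }  // same key: overwrite
        }
        buckets[i].add(new Entry(key, value));                   // new key: append to the chain
    }

    Integer get(String key) {
        LinkedList<Entry> bucket = buckets[indexFor(key)];
        if (bucket == null) return null;
        for (Entry e : bucket) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        ToyHashTable table = new ToyHashTable(4);
        table.put("one", 1);
        table.put("two", 2);
        table.put("one", 11);
        System.out.println(table.get("one"));     // 11
        System.out.println(table.get("three"));   // null
    }
}
```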

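Next, let's look at hashing.
The basic idea of hashing is to turn an input of arbitrary length into an output of fixed length by means of a hash algorithm. The mapping rule is the hash algorithm, and the binary string that the original data maps to is the hash value.
A good hash algorithm has the following properties:
1. The original data cannot be derived back from the hash value.
2. A tiny change in the input produces a completely different hash value, while identical input always produces the same value.
3. The algorithm is efficient: even the hash of a long text can be computed quickly.
4. The probability of a collision is low.
Because hashing maps values from the input space into the hash space, and the hash space is far smaller than the input space, the pigeonhole principle guarantees that some different inputs are mapped to the same output.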
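
A quick way to see the pigeonhole argument in practice: the strings "Aa" and "BB" are different, yet `String.hashCode()` gives them the same value, because the documented formula yields 65*31 + 97 = 2112 and 66*31 + 66 = 2112. The class name `CollisionDemo` and the helper `sameHash` are illustrative only.

```java
class CollisionDemo {
    /** Hypothetical helper: true if two different strings share a hash code. */
    static boolean sameHash(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());       // 2112
        System.out.println("BB".hashCode());       // 2112
        System.out.println(sameHash("Aa", "BB"));  // true
    }
}
```

This is why a hash-based container must always compare the keys themselves (with `equals`) after the hash values match.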

Part 2: How HashMap Works

Analysis of the Node data structure in HashMap:

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;     // hash value derived from the key's hashCode()
        final K key;        // the key
        V value;            // the value
        Node<K,V> next;     // next node in the same bucket; colliding entries are chained through it

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

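HashMap converts a linked list into a red-black tree only when the list holds more than 8 nodes and the array length is at least 64. If the list is long enough but the array is still shorter than 64, the table is resized instead.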
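
The two-part condition above can be sketched as a small decision function. This is a simplified model, not JDK code: the class `TreeifyDecision`, the enum `Action`, and the method `afterAppend` are invented here, and the two constants simply repeat the values used by JDK 8's HashMap.

```java
class TreeifyDecision {
    static final int TREEIFY_THRESHOLD = 8;      // same value as in JDK 8's HashMap
    static final int MIN_TREEIFY_CAPACITY = 64;  // same value as in JDK 8's HashMap

    enum Action { NONE, RESIZE, TREEIFY }

    /**
     * Hypothetical model of what happens after a node is appended to a list bucket.
     * binLength is the number of nodes in the bucket including the new one;
     * tableLength is the current length of the bucket array.
     */
    static Action afterAppend(int binLength, int tableLength) {
        if (binLength <= TREEIFY_THRESHOLD) {
            return Action.NONE;                  // list is still short enough
        }
        // The list is too long: a small table is resized, a large one treeifies the bucket.
        return tableLength < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(8, 64));  // NONE
        System.out.println(afterAppend(9, 16));  // RESIZE
        System.out.println(afterAppend(9, 64));  // TREEIFY
    }
}
```

Resizing a small table first is usually the cheaper fix, since doubling the array tends to spread a crowded bucket across two buckets.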

Part 3: Reading the Source Code

1. Core attributes:

    /**
     * The default initial capacity - MUST be a power of two.
     * Default table size.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     * Maximum table length.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     * Default load factor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     * Treeify threshold.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     * Threshold at which a tree degrades back into a linked list.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     * The second treeify parameter: bins may be treeified only when the
     * table capacity (array length, not the element count) is at least 64.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
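
To show how the capacity and the load factor interact, the sketch below computes the resize threshold (capacity times load factor) for a few table sizes. `ThresholdDemo` and its `threshold` method are names made up for this illustration; the arithmetic mirrors the `(int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY)` expression that appears in `resize()`.

```java
class ThresholdDemo {
    /** Hypothetical helper: number of entries a table of this capacity holds before it is resized. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;  // DEFAULT_LOAD_FACTOR
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println(capacity + " -> " + threshold(capacity, loadFactor));
        }
        // Prints:
        // 16 -> 12
        // 32 -> 24
        // 64 -> 48
        // 128 -> 96
    }
}
```

With the defaults, the 13th insertion into a fresh map pushes `size` past the threshold of 12 and triggers the first doubling from 16 to 32 buckets.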

2. Methods in HashMap: analysis of the put method:

    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        // tab: reference to the current hash table
        // p:   the node currently found in the bucket
        // n:   length of the hash table array
        // i:   result of the index (bucket) computation
        Node<K,V>[] tab; Node<K,V> p; int n, i;

        // Lazy initialization: the table, the most memory-hungry part of a
        // HashMap, is only allocated on the first call to putVal
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

        // Case 1: the bucket is null, so wrap key/value in a Node and store it directly
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);

        else {
            // e: if non-null, an existing node whose key equals the key being inserted
            // k: a temporary key
            Node<K,V> e; K k;

            // Case 2: the first node in the bucket has exactly the same key
            // as the one being inserted; its value is replaced further down
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;

            // Case 3: the bucket has already been converted to a red-black tree,
            // so the insertion is delegated to the tree
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);

            // Case 4: the bucket is a linked list and its head key differs
            // from the key being inserted, so walk the list
            else {
                for (int binCount = 0; ; ++binCount) {
                    // Reached the last node without finding a matching key:
                    // append the new node to the tail of the list
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);

                        // The list has reached the treeify threshold
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // treeify (or resize, if the table is still small)
                            treeifyBin(tab, hash);
                        break;
                    }

                    // Found a node with the same hash and an equal key;
                    // its value is replaced further down
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }

            // e != null means a node with an equal key already exists,
            // so only its value needs replacing
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // Number of structural modifications of the table;
        // replacing the value of an existing node does not count
        ++modCount;
        // A new node was inserted, so size is incremented; if the new size
        // exceeds the resize threshold, the table is resized
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
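
Two details of `putVal` are easy to try out in isolation: the bucket index `(n - 1) & hash`, and the return value of `put`. The sketch below assumes the hash passed to `putVal` is produced by the spreading step `h ^ (h >>> 16)` used by JDK 8's `HashMap.hash()`, which the listing above does not show. `PutDemo`, `spread`, and `indexFor` are names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

class PutDemo {
    /** Same spreading step as JDK 8's HashMap.hash(): XOR the high 16 bits into the low 16 bits. */
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    /** Bucket index for a table whose length n is a power of two. */
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        // "Aa".hashCode() is 2112, a multiple of 16, so it lands in bucket 0 of a 16-slot table.
        System.out.println(indexFor(spread("Aa"), 16));  // 0
        // A null key always hashes to 0 and therefore goes to bucket 0.
        System.out.println(indexFor(spread(null), 16));  // 0

        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));          // null (no previous mapping)
        System.out.println(map.put("a", 2));          // 1 (old value is returned and replaced)
        System.out.println(map.putIfAbsent("a", 3));  // 2 (existing value kept)
        System.out.println(map.get("a"));             // 2
    }
}
```

Because the table length is always a power of two, `(n - 1) & hash` gives the same result as `hash % n` for non-negative hashes while avoiding a division.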

Another important method in the source code is the resizing method, resize():

    /**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     * The resizing method; returns the new, larger table.
     */
    final Node<K,V>[] resize() {
        // oldTab: the hash table before resizing
        // oldCap: the table length before resizing
        // oldThr: the resize threshold before resizing, i.e. the one that triggered this resize
        // newCap: the table size to reach after resizing
        // newThr: the threshold that will trigger the next resize
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;

        // If true, the table has already been initialized and this is a normal resize
        if (oldCap > 0) {
            // The table has already reached the maximum capacity: do not grow,
            // and set the threshold to Integer.MAX_VALUE
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }

            // newCap = oldCap << 1 doubles the capacity. If newCap is below the
            // maximum and the old capacity is at least 16, the next threshold
            // is simply the current threshold doubled
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }

        // oldCap == 0: the table is still null, but a threshold was set by
        // 1. new HashMap(initCap, loadFactor);
        // 2. new HashMap(initCap);
        // 3. new HashMap(map), where map contains data
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;

        // oldCap == 0 and oldThr == 0:
        // new HashMap()
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // newThr is still 0: compute it from newCap and loadFactor
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;

        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    // A single node: recompute its index in the new table
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    // A tree bucket: let the tree split itself
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    // A list bucket: split into a "lo" list that stays at j
                    // and a "hi" list that moves to j + oldCap
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
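
The list-splitting part of `resize()` relies on one bit of the hash: if `(hash & oldCap) == 0` the node stays at index `j`, otherwise it moves to `j + oldCap`. The following sketch checks that rule against a direct recomputation of the index; `ResizeSplitDemo` and `newIndex` are illustrative names, not JDK members.

```java
class ResizeSplitDemo {
    /** New bucket index after a table of length oldCap (a power of two) is doubled. */
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        // The bit tested here is the one extra bit the doubled table starts to use.
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share bucket 5 in a 16-slot table.
        int[] hashes = {5, 21, 37, 53};
        for (int hash : hashes) {
            int direct = hash & (2 * oldCap - 1);  // index computed from scratch
            System.out.println(hash + ": " + (hash & (oldCap - 1))
                    + " -> " + newIndex(hash, oldCap) + " (direct: " + direct + ")");
        }
        // Prints:
        // 5: 5 -> 5 (direct: 5)
        // 21: 5 -> 21 (direct: 21)
        // 37: 5 -> 5 (direct: 5)
        // 53: 5 -> 21 (direct: 21)
    }
}
```

This is why no hash has to be recomputed during a resize: each node either keeps its index or shifts by exactly `oldCap`, and the relative order inside each of the two lists is preserved.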

Summary of the overall flow: [Figure: summary flowchart]

