在日常工作中,HashMap是我们经常用到的一个类,同时在面试过程中面试官也很喜欢问的一个对象!
因为面试的过程中被问的多了,所以就抽时间看了看源码^-^
在面试过程中,经常被问到的就是数据结构方面的问题,所以问到HashMap的时候的,难免就会问HashMap的底层结构是什么:
最简单的回答就是: HashMap底层采用的数据和链表的实现方式(现在加入了红黑树)
以下从日常使用HashMap的角度来看看HashMap的源码:
Map<Object, Object> map = new HashMap<>(); //这是个人最常用的一种方法
HashMap的空参构造方法如下:
/** * Constructs an empty <tt>HashMap</tt> with the default initial capacity * (16) and the default load factor (0.75). */ public HashMap() { this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted }
/** * The load factor for the hash table. * * @serial */ final float loadFactor;
可以看到,空参的构造函数其实仅仅设置一个装载因子,默认是0.75f, 前面提到HashMap底层是一个数据加链表结构,此时这个0.75指的就是数据的长度的0.75(也就是HashMap的容量),HashMap的默认容量是16
Constructs an empty <tt>HashMap</tt> with the default initial capacity(16)
先提前做一下准备工作:
因为以下都要涉及到Node<K,V>这个类,所以下面先解释一下这个类
/** * Basic hash bin node, used for most entries. (See below for * TreeNode subclass, and in LinkedHashMap for its Entry subclass.) */ static class Node<K,V> implements Map.Entry<K,V> { final int hash; final K key; V value; Node<K,V> next; Node(int hash, K key, V value, Node<K,V> next) { this.hash = hash; this.key = key; this.value = value; this.next = next; } public final K getKey() { return key; } public final V getValue() { return value; } public final String toString() { return key + "=" + value; } public final int hashCode() { return Objects.hashCode(key) ^ Objects.hashCode(value); } public final V setValue(V newValue) { V oldValue = value; value = newValue; return oldValue; } public final boolean equals(Object o) { if (o == this) return true; if (o instanceof Map.Entry) { Map.Entry<?,?> e = (Map.Entry<?,?>)o; if (Objects.equals(key, e.getKey()) && Objects.equals(value, e.getValue())) return true; } return false; } }
多次提到HashMap是一个数据加链表的结构,而这个Node<K,V>则是关键
HashMap实际上就是创建一个Node<K,V>的数组,再看Node<K,V>会发现,这个类是一个链表结构的类,简单介绍下其4个属性
hash : 保存的是要存储的key的哈希值
key: 传递进来key值
value: 传递进来要保存的value值
next: 下一个Node<K,V>对象 (链表结构的实现)
/** * Associates the specified value with the specified key in this map. * If the map previously contained a mapping for the key, the old * value is replaced. * * @param key key with which the specified value is to be associated * @param value value to be associated with the specified key * @return the previous value associated with <tt>key</tt>, or * <tt>null</tt> if there was no mapping for <tt>key</tt>. * (A <tt>null</tt> return can also indicate that the map * previously associated <tt>null</tt> with <tt>key</tt>.) */ public V put(K key, V value) { return putVal(hash(key), key, value, false, true); }
/** * Implements Map.put and related methods * * @param hash hash for key * @param key the key * @param value the value to put * @param onlyIfAbsent if true, don't change existing value * @param evict if false, the table is in creation mode. * @return previous value, or null if none */ final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); else { Node<K,V> e; K k; if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))) e = p; else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); else { for (int binCount = 0; ; ++binCount) { if ((e = p.next) == null) { p.next = newNode(hash, key, value, null); if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st treeifyBin(tab, hash); break; } if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) break; p = e; } } if (e != null) { // existing mapping for key V oldValue = e.value; if (!onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; } } ++modCount; if (++size > threshold) resize(); afterNodeInsertion(evict); return null; }
/** * The table, initialized on first use, and resized as * necessary. When allocated, length is always a power of two. * (We also tolerate length zero in some operations to allow * bootstrapping mechanics that are currently not needed.) */ transient Node<K,V>[] table;
从上面的代码中可以看到,在new HashMap()的时候,table其实是没有赋值的,也就是说第一次put的时候,table==null,所以会调用
n = (tab = resize()).length;
看resize()方法:
final Node<K,V>[] resize() { Node<K,V>[] oldTab = table; int oldCap = (oldTab == null) ? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0; if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; // double threshold } else if (oldThr > 0) // initial capacity was placed in threshold newCap = oldThr; else { // zero initial threshold signifies using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } if (newThr == 0) { float ft = (float)newCap * loadFactor; newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); } threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab; if (oldTab != null) { for (int j = 0; j < oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) != null) { oldTab[j] = null; if (e.next == null) newTab[e.hash & (newCap - 1)] = e; else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); else { // preserve order Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) != null); if (loTail != null) { loTail.next = null; newTab[j] = loHead; } if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } return newTab; }
当我们第一次put的时候:执行如下代码(剥离一下)
Node<K,V>[] oldTab = table; int oldCap = (oldTab == null) ? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0;
else { // zero initial threshold signifies using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } if (newThr == 0) { float ft = (float)newCap * loadFactor; newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); } threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab;
return newTab;
所以说其实HashMap的中数组的初始化,实际上是在我们第一次进行put操作的时候进行的!
当我们第二次进行put操作的时候(简单说一下过程,具体的代码还请读者自己看一下):
首先根据用户传入的key值计算得到key的hash值,然后再根据
(n - 1) & hash
可以获取到一个值: n为HashMap的容量(数组的长度),hash为key的哈希值,通过位运与算符(&)可以获得一个[0-n)的值,也就是数组的下标,
然后再根据数组的下标去获取值,如果获取的值为null,这直接创建一个Node<K,V>,然后加入到数据即可,如果不为空,在进行其他逻辑处理,具体细看源码!(因为对红黑树不了解,所以..,等了解了红黑树之后再唠唠)!
再之后就是获取数据: get
/** * Returns the value to which the specified key is mapped, * or {@code null} if this map contains no mapping for the key. * * <p>More formally, if this map contains a mapping from a key * {@code k} to a value {@code v} such that {@code (key==null ? k==null : * key.equals(k))}, then this method returns {@code v}; otherwise * it returns {@code null}. (There can be at most one such mapping.) * * <p>A return value of {@code null} does not <i>necessarily</i> * indicate that the map contains no mapping for the key; it's also * possible that the map explicitly maps the key to {@code null}. * The {@link #containsKey containsKey} operation may be used to * distinguish these two cases. * * @see #put(Object, Object) */ public V get(Object key) { Node<K,V> e; return (e = getNode(hash(key), key)) == null ? null : e.value; }
/** * Implements Map.get and related methods * * @param hash hash for key * @param key the key * @return the node, or null if none */ final Node<K,V> getNode(int hash, Object key) { Node<K,V>[] tab; Node<K,V> first, e; int n; K k; if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) { if (first.hash == hash && // always check first node ((k = first.key) == key || (key != null && key.equals(k)))) return first; if ((e = first.next) != null) { if (first instanceof TreeNode) return ((TreeNode<K,V>)first).getTreeNode(hash, key); do { if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) return e; } while ((e = e.next) != null); } } return null; }
首先通过传入的key获取到所属数组的下标,如果有值则返回值,没有值则返回null
在这里需要提一个方法就是
getNode(hash(key), key)
中的hash()这个方法
static final int hash(Object key) { int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); }
可以看到获取hash值的时候用到了key的对象的hashCode的方法,所以如果你重写了hashCode方法,那就会出现问题
如下例子:
public class DemoController { public static void main(String[] args) { Map<Student, Object> map = new HashMap<>(); Student s = new Student(); s.setId(1L); s.setUsername("张三"); map.put(s, 1); System.out.println(map.get(s)); s.setId(2L); s.setUsername("李四"); System.out.println(map.get(s)); } } class Student { private Long id; private String username; public Long getId() { return id; } public void setId(Long id) { this.id = id; } public String getUsername() { return username; } public void setUsername(String username) { this.username = username; } @Override public boolean equals(Object o) { if (this == o) return true; if (o == null || getClass() != o.getClass()) return false; Student student = (Student) o; if (id != null ? !id.equals(student.id) : student.id != null) return false; return username != null ? username.equals(student.username) : student.username == null; } @Override public int hashCode() { int result = id != null ? id.hashCode() : 0; result = 31 * result + (username != null ? username.hashCode() : 0); return result; } }
得到输出结果是:
1
null
所以这里需要注意一下!
随便提一下,在HashMap中是允许key 和 value值为null,但是在ConcurrentHashMap中是不允许(key 和 value 都不能为null)的:
final V putVal(K key, V value, boolean onlyIfAbsent) { if (key == null || value == null) throw new NullPointerException(); return null; }
(小白日志)