问题来源美团CRM后端团队:https://www.iteye.com/topic/1134016
刚学java萌新一看:只装载100个元素,本着厉行节约的原则,直接给100不就行了
一年java菜鸟一看:有坑……好像扩容因子是0.75,应该是100/0.75
两年java初级一看:想坑我~ tableSizeFor,所以最好应该是128~
三年java码农一看:还是太年轻~
其实问的就是HashMap的初始化策略,拷问的是对HashMap底层实现的掌握程度
上源码:
第一步:
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and the default load factor (0.75).
*
* @param initialCapacity the initial capacity.
* @throws IllegalArgumentException if the initial capacity is negative.
*/
public HashMap(int initialCapacity) {
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
看最后那步的构造函数
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
this.threshold = tableSizeFor(initialCapacity);
}
重点关注tableSizeFor(initialCapacity)
,这才是真正的容量
/**
* Returns a power of two size for the given target capacity.
*/
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
这步的结果是返回一个不小于传入参数并且最近的一个2的n次方的值,简而言之,如果你传1000,给你返回1024,你传1024它还是1024,你传1025,它就返回2028
详情参考:读HashMap源码之tableSizeFor
读到这大概明白了,初始化的大小并不一定是你传入值的大小,而是tableSizeFor后的值,为了迎合底层实现,就设置成128
不就ok了
128确实是个好数字,满足了tableSizeFor的要求,但是,在看一下putVal实现,你会发现
/**
* Implements Map.put and related methods
*
* @param hash hash for key
* @param key the key
* @param value the value to put
* @param onlyIfAbsent if true, don't change existing value
* @param evict if false, the table is in creation mode.
* @return previous value, or null if none
*/
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length; // 划重点
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
……
if (++size > threshold)
resize();// 划重点
afterNodeInsertion(evict);
return null;
上面是putVal的实现节选,两个地方划了重点,resize();
导演,切换镜头,让源码转到resize()
final Node<K,V>[] resize() {
Node<K,V>[] oldTab = table;
int oldCap = (oldTab == null) ? 0 : oldTab.length;
int oldThr = threshold;
int newCap, newThr = 0;
if (oldCap > 0) {
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
threshold = newThr;
……
当put第一个值进去时,会发生什么,源码读一下
Node<K,V>[] oldTab = table; // = null
int oldCap = (oldTab == null) ? 0 : oldTab.length;// = 0
int oldThr = threshold; // 初始化时this.threshold = tableSizeFor(initialCapacity); = 128
int newCap, newThr = 0; // 都等于0
继续
if (oldCap > 0) {
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;// 上面得知=128
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {// 执行这里
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);// newThr = 128*0.75=96
}
threshold = newThr;// threshold = 96
所以通过以上代码,知道一个重要信息就是,当初始化容量为128时,它的扩容阈值threshold = 96
言外之意就是当元素put到第96个时,就会再次进行resize()方法,而resize()方法是非常耗时间的
结合我们实际情况进行一次resize()方法只为了多方4个元素,这样的损耗显然是不适合的
所以,综上所述,个人觉得初始化最好的应该是256
鉴于硬件成本已经足够低廉了,用空间换时间,值~
类似问题:面试官:”准备用HashMap存1w条数据,构造时传10000会触发扩容吗?“
读HashMap源码之tableSizeFor
HashMap面试常问的那些常量、数值