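How Java HashMap Works

HashMap in JDK 1.6

It is well known that HashMap is internally an array plus linked lists. In normal use, though, a chain rarely shows up in a bucket, because keys usually hash to different values. This post simulates the case where the hashes are equal.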

Reference: http://www.importnew.com/28263.html
Reference: https://blog.csdn.net/v123411739/article/details/78996181
Reference on HashMap's private writeObject: http://www.a-site.cn/article/140346.html

Reference on the infinite loop under concurrent use: https://blog.csdn.net/bigtree_3721/article/details/77123701

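Questions:

1. Why does HashMap make its writeObject and readObject methods private?

See ObjectOutputStream: its writeObject method shows that when the class being serialized declares its own writeObject method, the stream calls that method instead of the default field-by-field logic. The call path is writeObject0 -> writeOrdinaryObject -> writeSerialData(), which finally invokes the class's own writeObject. The lookup is reflective and only matches a method declared as private void writeObject(ObjectOutputStream), so the method has to be private to be picked up at all; being private also keeps it out of the class's public API. HashMap uses this hook because its table field is transient: it writes the key-value pairs one by one and rebuilds the buckets in readObject.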
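
To see the private hook being called, here is a minimal sketch. The class PrivateHookDemo, its fields and its roundTrip helper are hypothetical names made up for this illustration; only the two private method signatures come from the java.io.Serializable contract. Like HashMap, it marks its backing array transient and writes only the occupied slots by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, for illustration only.
class PrivateHookDemo implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped by default serialization, written by hand below
    private transient int[] slots = new int[16];
    private int used;
    // counts hook invocations in this JVM (static fields are never serialized)
    static int writeCalls = 0;

    void add(int v) { slots[used++] = v; } // sketch only: no bounds check beyond 16
    int used() { return used; }
    int get(int i) { return slots[i]; }

    // Private, yet ObjectOutputStream finds it by reflection and calls it.
    private void writeObject(ObjectOutputStream out) throws IOException {
        writeCalls++;
        out.defaultWriteObject();          // writes the non-transient field 'used'
        for (int i = 0; i < used; i++) {   // write only the occupied slots
            out.writeInt(slots[i]);
        }
    }

    // Counterpart hook: rebuilds the transient array from the stream.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();            // restores 'used'
        slots = new int[Math.max(16, used)];
        for (int i = 0; i < used; i++) {
            slots[i] = in.readInt();
        }
    }

    // Serializes to memory and reads the object back.
    static PrivateHookDemo roundTrip(PrivateHookDemo src) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(src);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PrivateHookDemo) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrivateHookDemo d = new PrivateHookDemo();
        d.add(7);
        d.add(9);
        PrivateHookDemo copy = roundTrip(d);
        System.out.println(copy.used() + " " + copy.get(0) + " " + copy.get(1)
                + " hookCalls=" + writeCalls);
    }
}
```

The counter only shows that the stream reached the private method; nothing in the sketch calls writeObject(ObjectOutputStream) directly.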

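2. How can a bucket end up holding a linked list?

A chain forms only when two keys are not equal but have the same hash value (a hash collision). put decides between overwriting and inserting by comparing the hash first and then the key itself.
Normally, if the key already exists in the table, the old value is overwritten; if it does not, a new key-value pair is added.
If the keys are not equal but their hashes are (the table index is computed from the hash, so equal hashes give the same index), put first reads the entry already stored at that index and then records it in Entry.next of the new entry. At that point the chain exists.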

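3. In what order are the elements of a chain stored?

JDK 1.6 and JDK 1.7: the new entry is placed in the table slot and the previous head is recorded in its next field (head insertion), so the chain reads newest -> older -> oldest.
JDK 1.8: the opposite of JDK 1.6. The new node is appended at the end (tail insertion), so the chain reads oldest -> newer -> newest.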
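
The two orders can be compared with a tiny stand-alone chain. This is only a sketch: Node, insertAtHead, insertAtTail and describe are hypothetical names, not the JDK's own code, and the chain holds bare keys instead of full entries.

```java
// Minimal model of one bucket's chain; all names here are hypothetical.
class ChainOrderDemo {
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // JDK 1.6/1.7 style: the new node becomes the head and points at the old head.
    static Node insertAtHead(Node head, String key) {
        return new Node(key, head);
    }

    // JDK 1.8 style: walk to the last node and append the new one there.
    static Node insertAtTail(Node head, String key) {
        Node added = new Node(key, null);
        if (head == null) {
            return added;
        }
        Node p = head;
        while (p.next != null) {
            p = p.next;
        }
        p.next = added;
        return head;
    }

    // Renders the chain from head to tail, e.g. "a -> b -> c".
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node headStyle = null;
        Node tailStyle = null;
        for (String k : new String[] {"a", "b", "c"}) {
            headStyle = insertAtHead(headStyle, k);
            tailStyle = insertAtTail(tailStyle, k);
        }
        System.out.println(describe(headStyle)); // c -> b -> a
        System.out.println(describe(tailStyle)); // a -> b -> c
    }
}
```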

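4. A common attack based on hash collisions

An attacker sends a site's backend a large number of keys with the same hash in a short time, so that one chain grows very long. A linked list is cheap to insert into but slow to search, so every lookup in that bucket has to walk the chain, and the backend slows down or even goes down.

Mitigations: reduce the number of keys with equal hashes, and use generics strictly to pin down the Map's key type. To speed up lookups, JDK 1.8 changed the underlying structure from array + linked list to array + linked list + red-black tree: when a chain grows beyond 8 nodes, it is converted into a red-black tree (provided the table has at least 64 buckets, MIN_TREEIFY_CAPACITY; a smaller table is resized instead).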
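
To show how easy such keys are to produce, the sketch below relies on the fact that the strings "Aa" and "BB" have the same hashCode (2112), so any concatenation of these two blocks collides as well. The class CollidingStrings and its build method are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that generates distinct String keys sharing one hashCode.
class CollidingStrings {
    // Builds 2^blocks distinct strings; each is a sequence of "Aa"/"BB" blocks.
    static List<String> build(int blocks) {
        List<String> keys = new ArrayList<>();
        keys.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(keys.size() * 2);
            for (String prefix : keys) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            keys = next;
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> keys = build(10); // 2^10 = 1024 keys
        Map<String, Integer> map = new HashMap<>();
        for (String k : keys) {
            map.put(k, k.hashCode());
        }
        long distinctHashes = keys.stream().map(String::hashCode).distinct().count();
        // All keys are distinct, so the map keeps every one of them,
        // but they all share a single hashCode and therefore a single bucket.
        System.out.println(map.size() + " keys, " + distinctHashes + " distinct hashCode");
    }
}
```

On JDK 1.6 and 1.7 all of these keys sit in one chain; on JDK 1.8 the bucket is converted to a tree once the chain is long enough.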

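Simulating a hash collision

Three keys are used here: the String "1", the char '1' and the int 49. All three hash to 49. The hash of null is always 0; like any other key, null can appear only once as a key, although any number of keys can map to a null value. When get() returns null, either the HashMap has no such key or the key is mapped to null. So get() cannot tell you whether a key exists in a HashMap; use containsKey() for that.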
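
A short check of both claims, as a sketch: SameHashKeys and build are hypothetical names, while the hashCode values are the ones defined by String, Character and Integer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class for the three colliding keys and the null key.
class SameHashKeys {
    static Map<Object, String> build() {
        Map<Object, String> map = new HashMap<>();
        map.put("1", "string");
        map.put('1', "char");  // autoboxed to Character
        map.put(49, "int");    // autoboxed to Integer
        map.put(null, null);   // the single null key, mapped to a null value
        return map;
    }

    public static void main(String[] args) {
        Map<Object, String> map = build();

        // Three unequal keys, one hashCode.
        System.out.println("1".hashCode());                     // 49
        System.out.println(Character.valueOf('1').hashCode());  // 49
        System.out.println(Integer.valueOf(49).hashCode());     // 49
        System.out.println(map.size());                         // 4

        // get() returns null in both cases, containsKey() tells them apart.
        System.out.println(map.get(null));                // null (key present, value null)
        System.out.println(map.containsKey(null));        // true
        System.out.println(map.get("missing"));           // null (key absent)
        System.out.println(map.containsKey("missing"));   // false
    }
}
```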

/**
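	 * Computes the hash of a map key (this is the JDK 1.8 form of the method).
	 * @param key the key, may be null
	 * @return the spread hash, 0 for a null key
	 */
	 static final int hash(Object key) {
	        int h;
	        // a null key always hashes to 0
	        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);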
	 }

How HashMap links a new key-value entry into a bucket, based on JDK 1.6

void createEntry(int hash, K key, V value, int bucketIndex) {
	        Entry<K,V> e = table[bucketIndex];                     // current head of the bucket, may be null
	        table[bucketIndex] = new Entry<>(hash, key, value, e); // new entry becomes the head, old head is its next
	        size++;
	    }

A test for the collision. In the debugger the map holds 8 entries, but the table field of the HashMap has only 5 occupied slots; expanding an individual slot shows the chain linked through next.

package test.map;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;


/**
 * HashMap experiments, based on JDK 1.6.
 * 
 * Reference: http://www.importnew.com/28263.html
 * Reference: https://blog.csdn.net/v123411739/article/details/78996181
 * Reference on HashMap's private writeObject: http://www.a-site.cn/article/140346.html
 * 
 * Covers the questions discussed in the text:
 * 1. Why HashMap makes writeObject and readObject private
 *    (ObjectOutputStream: writeObject0 -> writeOrdinaryObject -> writeSerialData() -> the class's own writeObject).
 * 2. How a bucket ends up holding a chain: two unequal keys with the same hash.
 * 3. The order of a chain: head insertion in JDK 1.6/1.7, tail insertion in JDK 1.8.
 * 
 * @author 12198
 *
 */ 
public class TestHashMap extends HashMap{
	
	
	public static void main(String[] args) throws FileNotFoundException, IOException, ClassNotFoundException {
		String path = "D://hashMap.data";
		HashMap map = (HashMap) getMap();
		int hash = hash("3");
//		System.out.println(hash);// the hash is 51
		int indexFor = indexFor(hash, 8);
//		System.out.println(indexFor);// the computed bucket index
//		map.put("3", "6");
		System.out.println(map.size());
//		System.out.println(hash("1"));// gives 49
		hashBreak(map);
	}
	
	
	
	/**
	 * Simulates a hash collision.
	 * A chain forms in a bucket only when two keys are not equal but have the same hash.
	 * 
	 * '1' = 49
	 * Both keys hash to 49: hash('1') == hash(49)
	 * 
	 */
	@SuppressWarnings("unchecked")
	public static void hashBreak(@SuppressWarnings("rawtypes") HashMap hashMap) {
		hashMap.put('1', "234");
		hashMap.put(49, "234");
		System.out.println(hashMap.size());
		System.out.println(hashMap.toString());
	}




	private static Map getMap() {
		HashMap<String,String> map = new HashMap<String, String>();
		map.put("1", "2");
		map.put(null, "8");
		map.put("2", "8");
		map.put("3", "8");
		map.put("4", "8");
		map.put("5", "8");
//		String put = map.put("5", "9");
//		System.out.println(put);
//		System.out.println(map.get("5"));
		return map;
	}
	
	
	
	
	// Question 1: test serialization and deserialization of a HashMap =======================================
	public static  void writeObject(Map map,String path) throws FileNotFoundException, IOException{
		ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path));
		try {
			out.writeObject(map);
		} finally {
			out.close(); // always release the file handle
		}
	}
	
	public static Map readObject(String path) throws FileNotFoundException, IOException, ClassNotFoundException{
		ObjectInputStream input = new ObjectInputStream(new FileInputStream(path));
		try {
			return (Map) input.readObject();
		} finally {
			input.close(); // always release the file handle
		}
	}
	
	
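	// Test HashMap's hash function
	/**
	 * Computes the hash of a map key (JDK 1.8 form of the method).
	 * @param key the key, may be null
	 * @return the spread hash, 0 for a null key
	 */
	 static final int hash(Object key) {
	        int h;
	        // a null key always hashes to 0
	        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
	 }

	// Maps a hash to a bucket index for a power-of-two table length.
	// Defined locally because HashMap's own index computation is not accessible from this package.
	 static int indexFor(int h, int length) {
	        return h & (length - 1);
	 }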
	 
	 
	 
	

}

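HashMap in JDK 1.8

JDK 1.8 adds a red-black tree structure. TreeNode extends LinkedHashMap.Entry<K,V>, which extends HashMap.Node<K,V>, so a tree node is still a Node underneath. When a chain grows beyond 8 nodes (see the comment on HashMap.TREEIFY_THRESHOLD), its nodes are converted to TreeNode, and the underlying structure becomes array + linked list + red-black tree. If the table has fewer than 64 buckets (MIN_TREEIFY_CAPACITY), treeifyBin resizes the table instead of converting the chain.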

/**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;// first put: allocate the table through resize()
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);// the bucket is empty, so simply add a new Node
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode) // the bucket head is a TreeNode, so insert into the red-black tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
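                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st // per the TREEIFY_THRESHOLD comment: the chain already held at least 8 nodes before this insert
                            treeifyBin(tab, hash);// convert the chain's Nodes into red-black TreeNodes via replacementTreeNode (or resize if the table has fewer than 64 buckets)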
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
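
The bucket index expression (n - 1) & hash used in putVal can be tried in isolation. The sketch below assumes a power-of-two table length, as HashMap always uses; BucketIndexDemo and its method names are hypothetical, and spread repeats the hash(Object) formula shown earlier.

```java
// Hypothetical demo of how a key is mapped to a bucket index.
class BucketIndexDemo {
    // Same formula as the JDK 1.8 hash(Object) method shown earlier.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int hash = spread("1");                    // 49
        System.out.println(indexFor(hash, 16));    // 1
        System.out.println(indexFor(hash, 32));    // 17
        System.out.println(indexFor(hash, 64));    // 49

        // For a power-of-two length the mask equals a non-negative modulo,
        // which also holds for negative hashes.
        System.out.println(indexFor(-7, 16) == Math.floorMod(-7, 16)); // true
    }
}
```

Since only the low bits of the hash pick the bucket, XOR-ing in the upper 16 bits in spread lets those bits influence the index even for small tables.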

Inheritance of TreeNode: TreeNode extends LinkedHashMap.Entry<K,V>, which extends HashMap.Node<K,V>

The before and after fields come from LinkedHashMap.Entry (a doubly linked list that can be traversed from either end)

  /**
     * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
     * extends Node) so can be used as extension of either regular or
     * linked node.
     */
    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }

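Finding values with the same hash within the long range: pass in a bound close to the maximum long, long l = 9223372036854775806L, and search the Long range for values whose hash is 49. More than 40 to 50 can be found; the exact count is unknown because the program was not left to run to the end.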

private static void findHash(long num) {
		long i = 0;
		while (i < num) {
			// Long.hashCode() returns an int: (int)(value ^ (value >>> 32))
			int hash = Long.valueOf(i).hashCode();
			if (hash == 49) {
				System.out.println("value=" + i + ",hash = " + hash);
			}
			i++;
		}
	}
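
Scanning every long takes far too long, but colliding values can also be built directly, because Long.hashCode() is defined as (int)(value ^ (value >>> 32)). The sketch below picks the upper 32 bits freely and derives the lower 32 bits from them; LongHashCollisions and withHash are hypothetical names.

```java
// Hypothetical helper that constructs long values with a chosen Long.hashCode().
class LongHashCollisions {
    // Returns the long whose upper 32 bits are 'high' and whose hashCode is 'targetHash'.
    static long withHash(int high, int targetHash) {
        int low = targetHash ^ high; // so that low ^ high == targetHash
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        for (int high = 0; high < 5; high++) {
            long value = withHash(high, 49);
            System.out.println("value=" + value + ",hash = " + Long.valueOf(value).hashCode());
        }
    }
}
```

The first two values are 49 and 4294967344, which shows why the sequential scan needs more than four billion iterations to reach its second hit.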

Analysis of the get(key) method

public V get(Object key) {
	        Node<K,V> e;
	        return (e = getNode(hash(key), key)) == null ? null : e.value;
	    }

	
	 final Node<K,V> getNode(int hash, Object key) {
	        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
	        if ((tab = table) != null && (n = tab.length) > 0 &&
	            (first = tab[(n - 1) & hash]) != null) {// compute the bucket index from the hash and fetch the head node of that bucket
	            if (first.hash == hash && // always check first node
	                ((k = first.key) == key || (key != null && key.equals(k))))
	                return first;
	            if ((e = first.next) != null) {// the head did not match; continue only if the bucket holds more nodes
	                if (first instanceof TreeNode)// the bucket is a red-black tree if its head node is a TreeNode
	                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);// find the root of the tree and search from there
	                do {// plain chain: follow next until the key is found or the chain ends
	                    if (e.hash == hash &&
	                        ((k = e.key) == key || (key != null && key.equals(k))))
	                        return e;
	                } while ((e = e.next) != null);
	            }
	        }
	        return null;
	    }