HashMap源码分析

源码阅读
多线程下形成死循环的原因
key为什么一般用字符串比较多，能用其他对象，或者自定义的对象吗？为什么

源码阅读

基础

1.扩容基础系数： loadFactor=0.75f
2.存储的基本数据结构：transient Entry<K,V>[] table
3.默认初始化table大小：initialCapacity =1 << 4
4.当前table元素个数：size
5.threshold=loadFactor*table大小

构造函数

public HashMap(int initialCapacity, float loadFactor) {
     //各种判断，省略...
    //这里只是做了容量和扩容系数的初始化
    this.loadFactor = loadFactor;
    threshold = initialCapacity;
    init();
}

put操作

public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);//如果table为空就初始化table
    }
    if (key == null)
        return putForNullKey(value);//对key==null的情况做个特殊处理,允许key=null
    int hash = hash(key);
    int i = indexFor(hash, table.length);//根据key的hash值和当前table的长度计算该Entry位于数组中的位置
   //循环判断该Entry是否已经存在于数组第i个位置上的链表里
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);//插入操作
    return null;
}

void addEntry(int hash, K key, V value, int bucketIndex) {
    //如果当前容量>=table.length*0.75f &&table的第i个位置!=null就扩容
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);//扩容和元素重排
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];//创建一个新的Entry数组
    transfer(newTable, initHashSeedAsNeeded(newCapacity));//重排
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    //遍历old table中的所有元素
    for (Entry<K,V> e : table) {
        //每个元素遍历自身的链表
        while(null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            //计算在新table中的位置，并插入
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

get操作

get操作很简单，就是根据key的hash值定位到该key对应table中的位置，如果该key对应的hash有重复（即在table[i]上有链表结构），就挨个比较该链表上的每个节点（Entry）中key的值

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    Entry<K,V> entry = getEntry(key);
    return null == entry ? null : entry.getValue();
}

final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }
    int hash = (key == null) ? 0 : hash(key);
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

图解

内部结构图

初始化和put操作

  HashMap map = new HashMap(2)
  map.put("k1","v1");
  map.put("k2","v2");

初始化

put操作(1)

put操作(2)

正常resize操作
```
  map.put("k3","v3");
```

初始状态

resize(1)

resize(2)

resize(3)

多线程线resize造成死循环

  HashMap map = new HashMap(2);
  map.put("k1","v1");
  map.put("k2","v2");
  
  
  map.put("k3","v3");//A线程操作
  map.put("k4","v4");//B线程操作

假设A线程和B线程同时进入resize方法的transfer方法的
Entry<K,V> next = e.next; 这一行
这时B线程被挂起，A线程执行完之后，B线程再执行，我们看看这时会发生什么？

初始状态

B线程被刮起前状态

A线程执行完毕

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while(null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
}

B-resize(1)

B-resize(2)

多线程下形成死循环的原因

多线程情况下，当多个线程同时对同一个hashmap进行resize操作时，就有可能造成链表的循环调用，具体原因如图解所示。

key一般用字符串比较多，能用其他对象，或者自定义的对象吗？为什么

 public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable

HashMap支持传入泛型，所以肯定支持对象，也肯定支持自定义对象（不支持基本数据类型哦），但是在用自定义对象的时候一定要注意要重写hashCode和equals方法，因为Object的equals默认是比较两个对象的地址是否相等，所以即使两个对象的属相一摸一样，在HashMap中依旧被当作是两个不同的key。

HashMap源码分析