关于HashMap的一些理解

首先说说hashmap的数据结构 hashmap的数据结构是数组+链表的存储结构结构如下

1564537569(1).jpg

大致可以理解为是一个entry()的数组entry中维护key value hash 和next属性，entry是按列表排列 next指向下一个，当hashmap进行存储值得时候，首先对key进行hash，hash值相同得会放在一个链表中，然后再链表中用key.equls方法进行匹配具体得源码如下

public V get(Object key) {
     if (key == null)
         //返回table[0] 的value值
         return getForNullKey();
     Entry<K,V> entry = getEntry(key);
     return null == entry ? null : entry.getValue();
}
final Entry<K,V> getEntry(Object key) {
     if (size == 0) {
         return null;
     }
     int hash = (key == null) ? 0 : hash(key);
     for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
         Object k;
         if (e.hash == hash &&
             ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
      }
     return null;
}

基础的put方法
在该方法中，添加键值对时，首先进行table是否初始化的判断，如果没有进行初始化（分配空间，Entry[]数组的长度）。然后进行key是否为null的判断，如果key==null ,放置在Entry[]的0号位置。计算在Entry[]数组的存储位置，判断该位置上是否已有元素，如果已经有元素存在，则遍历该Entry[]数组位置上的单链表。判断key是否存在，如果key已经存在，则用新的value值，替换点旧的value值，并将旧的value值返回。如果key不存在于HashMap中，程序继续向下执行。将key-vlaue, 生成Entry实体，添加到HashMap中的Entry[]数组中。

public V put(K key, V value) {
        if (table == EMPTY_TABLE) { //是否初始化
            inflateTable(threshold);
        }
        if (key == null) //放置在0号位置
            return putForNullKey(value);
        int hash = hash(key); //计算hash值
        int i = indexFor(hash, table.length);  //计算在Entry[]中的存储位置
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i); //添加到Map中
        return null;
}

这里涉及到一个问题首先数组得大小是固定的，也就是说hashmap在初始化的时候会给有个初始的数组长度


static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;      // HashMap初始容量大小(16)

由此可见长度为16，
由于这个问题就涉及到了hashmap扩容，扩容问题的产生
还有一个很重要的知识点就是hashMap的hash算法问题。。。

int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);

static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
 
 static int indexFor(int h, int length) {
        return h & (length-1);

可以看出这里的算法是对key进行2次hash然后用值根据table的长度经行取模。
所以当hashmap里的key很多的时候由此就会产生大量的hash碰撞（相同的hash值），
这使得在在取数的时候是o(n)的操作会降低map的性能。
扩容由以下值决定：


static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;      // HashMap初始容量大小(16) 
static final int MAXIMUM_CAPACITY = 1 << 30;               // HashMap最大容量
transient int size;                                       // The number of key-value mappings contained in this map
 
static final float DEFAULT_LOAD_FACTOR = 0.75f;          // 负载因子
 
HashMap的容量size乘以负载因子[默认0.75] = threshold;  // threshold即为开始扩容的临界值
 
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;    // HashMap的基本构成Entry

当HashMap中的元素个数超过数组大小(数组总大小length,不是数组中个数size)loadFactor时，就会进行数组扩容，loadFactor的默认值为0.75，这是一个折中的取值。也就是说，默认情况下，数组大小为16，那么当HashMap中元素个数超过160.75=12（这个值就是代码中的threshold值，也叫做临界值）的时候，就把数组的大小扩展为 2*16=32，即扩大一倍，然后重新计算每个元素在数组中的位置（这一步比较影响性能）。

0.75这个值成为负载因子，那么为什么负载因子为0.75呢？这是通过大量实验统计得出来的，如果过小，比如0.5，那么当存放的元素超过一半时就进行扩容，会造成资源的浪费；如果过大，比如1，那么当元素满的时候才进行扩容，会使get,put操作的碰撞几率增加。
扩容代码如下：


void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    //如果当前的数组长度已经达到最大值，则不在进行调整
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    //根据传入参数的长度定义新的数组
    Entry[] newTable = new Entry[newCapacity];
    //按照新的规则，将旧数组中的元素转移到新数组中
    transfer(newTable);
    table = newTable;
    //更新临界值
    threshold = (int)(newCapacity * loadFactor);
}
//旧数组中元素往新数组中迁移
void transfer(Entry[] newTable) {
    //旧数组
    Entry[] src = table;
    //新数组长度
    int newCapacity = newTable.length;
    //遍历旧数组
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);//放在新数组中的index位置
                e.next = newTable[i];//实现链表结构，新加入的放在链头，之前的的数据放在链尾
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }

由此可见map的最大值是MAXIMUM_CAPACITY，无法无限扩容。

所以值得注意的是当hashmap在初始化的时候最好在了解业务的情况下对hashmap进行初始化长度，这样比面频繁的扩容影响性能，或者浪费资源

关于HashMap的一些理解

推荐阅读更多精彩内容