集合

1.Collection

集合的根接口，所有的实现类都提供两个构造方法：

构造空Collection
现有的复制（浅复制）
所有的list和set都可以存放null值，map中线程安全的不允许存放null值（key和value都不行），线程不安全的可以
所有不包含重复对象的集合，用来存放自定义对象都需要重写hashCode()以及equals()方法

集合族谱

2.List

有序序列，允许重复（具体原因见下面具体实现类的底层结构）

2.1.ArrayList

数据结构
底层是数组结构，查询快，增删慢，线程不安全，ArrayList 是 List 接口的可变数组的实现。

/**
* The array buffer into which the elements of the ArrayList are stored.
* The capacity of the ArrayList is the length of this array buffer.
*/
private transient Object[] elementData;

如何实现动态数组
每个 ArrayList 实例都有一个容量，该容量是指用来存储列表元素的数组的大小。它总是至少等于列表的大小。初始化为10。构造方法如下：

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this(10);
    }
    /**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        super();
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        this.elementData = new Object[initialCapacity];
    }

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @param c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        size = elementData.length;
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }

在 ArrayList 的存储方法，其核心本质是在数组的某个位置将元素添加进入。但其中又会涉及到关于数组容量不够而增长等因素。add(E e）源码如下：

    /**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }
    private void ensureCapacityInternal(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

容量不够时，通过grow（）方法进行扩容，数组进行扩容时，会将老数组中的元素重新拷贝一份到新的数组中，通过Arrays.copyOf(elementData, newCapacity)方法扩充原数组，每次数组容量的增长大约是其原容量的 1.5 倍（从int newCapacity = oldCapacity + (oldCapacity >> 1)这行代码得出）。这种操作的代价是很高的，因此在实际使用时，我们应该尽量避免数组容量的扩张。当我们可预知要保存的元素的多少时，要在构造 ArrayList 实例时，就指定其容量，以避免数组扩容的发生。或者根据实际需求，通过调用ensureCapacity 方法来手动增加 ArrayList 实例的容量。

2.2.LinkedList

数据结构
LinkedList底层是双向链表结构，所以在类中包含了 first 和 last 两个指针(Node)。Node 中包含了上一个节点和下一个节点的引用，这样就构成了双向的链表。每个 Node 只能知道自己的前一个节点和后一个节点，节点类如下：

    transient int size = 0;
    transient Node<E> first; //链表的头指针
    transient Node<E> last; //尾指针
    //存储对象的结构 Node, LinkedList的内部类
    private static class Node<E> {
        E item;
        Node<E> next; // 指向下一个节点
        Node<E> prev; //指向上一个节点

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

特别的方法
descendingIterator()：返回降序迭代器（底层的两个双向指针）

2.3.Vector

底层是数组结构，线程安全（每个操作元素的方法上都有Synchronized修饰符）
get代码如下：

    /**
     * Returns the element at the specified position in this Vector.
     *
     * @param index index of the element to return
     * @return object at the specified index
     * @throws ArrayIndexOutOfBoundsException if the index is out of range
     *            ({@code index < 0 || index >= size()})
     * @since 1.2
     */
    public synchronized E get(int index) {
        if (index >= elementCount)
            throw new ArrayIndexOutOfBoundsException(index);

        return elementData(index);
    }

和ArrayList和Vector一样，同样的类似关系的类还有HashMap和HashTable，StringBuilder和StringBuffer，后者是前者线程安全版本的实现。

3.Set

无索引的概念，因此方法中没有和索引有关的方法，不允许重复元素

3.1.HashSet

对于 HashSet 而言，它是基于 HashMap 实现的，HashSet 底层使用 HashMap 来保存所有元素，因此 HashSet 的实现比较简单，相关 HashSet 的操作，基本上都是直接调用底层 HashMap 的相关方法来完成，我们应该为保存到 HashSet 中的对象覆盖 hashCode() 和 equals()

/**
 * 默认的无参构造器，构造一个空的HashSet。
 *
 * 实际底层会初始化一个空的HashMap，并使用默认初始容量为16和加载因子0.75。
 */
public HashSet() {
    map = new HashMap<E,Object>();
}

/**
 * 构造一个包含指定collection中的元素的新set。
 *
 * 实际底层使用默认的加载因子0.75和足以包含指定collection中所有元素的初始容量来创建一个HashMap。
 * @param c 其中的元素将存放在此set中的collection。
 */
public HashSet(Collection<? extends E> c) {
    map = new HashMap<E,Object>(Math.max((int) (c.size()/.75f) + 1, 16));
    addAll(c);
}

/**
 * 以指定的initialCapacity和loadFactor构造一个空的HashSet。
 *
 * 实际底层以相应的参数构造一个空的HashMap。
 * @param initialCapacity 初始容量。
 * @param loadFactor 加载因子。
 */
public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<E,Object>(initialCapacity, loadFactor);
}

/**
 * 以指定的initialCapacity构造一个空的HashSet。
 *
 * 实际底层以相应的参数及加载因子loadFactor为0.75构造一个空的HashMap。
 * @param initialCapacity 初始容量。
 */
public HashSet(int initialCapacity) {
    map = new HashMap<E,Object>(initialCapacity);
}

/**
 * 以指定的initialCapacity和loadFactor构造一个新的空链接哈希集合。此构造函数为包访问权限，不对外公开，
 * 实际只是是对LinkedHashSet的支持。
 *
 * 实际底层会以指定的参数构造一个空LinkedHashMap实例来实现。
 * @param initialCapacity 初始容量。
 * @param loadFactor 加载因子。
 * @param dummy 标记。
 */
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<E,Object>(initialCapacity, loadFactor);
}

由于 HashMap 的 put() 方法添加 key-value 对时，当新放入 HashMap 的 Entry 中 key 与集合中原有 Entry 的 key 相同（hashCode()返回值相等，通过 equals 比较也返回 true），新添加的 Entry 的 value 会将覆盖原来 Entry 的 value（HashSet 中的 value 都是PRESENT），但 key 不会有任何改变，因此如果向 HashSet 中添加一个已经存在的元素时，新添加的集合元素将不会被放入 HashMap中，原来的元素也不会有任何改变，这也就满足了 Set 中元素不重复的特性。

3.2.LinkedHashSet

内部构造了一个LinkedHashMap，基本都是LinkedHashMap的方法，见下文LinkedHashMap详解

3.3.TreeSet

线程安全的Set集合

4.Collections

操作集合的包装类

addAll(Collection<T> c,T... elements)
sort(List<T> list) T必须实现排序接口comparable，并重写方法
sort(List<T> list，Comparator<? super T> c) ，根据第一个参数小于、等于或大于第二个参数分别返回负整数、零或正整数。
comparable和comparator区别：comparable需要实体类自己实现，comparator无法修改实体类时，直接在调用方创建

5.Map

键值对集合

5.1.HashMap

HashMap实际上是一个“链表散列”的数据结构，即数组和链表的结合体。处理冲突的方法为拉链法，依据对象的哈希值求余存入数组中，哈希值不同的直接存入数组，如果哈希值相同则调用equals方法判断元素是否相同，不相同则追加在链表上。允许存放null元素，HashMap在 put 的时候是根据 key 的 hashcode 进行 hash 然后放入对应的地方。
构造方法创建一个entry数组，entry类中有个next引用，用来实现链表。新加入的放在链头，最先加入的放在链尾。
存放方法：数组中存放Map.Entry对象引用，Map.Entrynew出来存放再堆中，Map.Entry之间通过引用实现链表。

5.2.LinkedHashMap

底层为链表散列加链表结构，多了一条链表（记录元素的存储顺序），保证元素有序。

源码解析

LinkedHashMap 是 HashMap 的一个子类，它保留插入的顺序，如果需要输出的顺序和输入时的相同，那么就选用 LinkedHashMap。通过构造方法来控制访问顺序。允许存放<null,null>；

构造方法如下：

private final boolean accessOrder;
/**
 * Constructs an empty <tt>LinkedHashMap</tt> instance with the
 * specified initial capacity, load factor and ordering mode.
 *
 * @param  initialCapacity 初始容量
 * @param  loadFactor      装填因子（影响底层哈希数组长度，进而影响冲突的数量）
 * @param  accessOrder     如果为true，则按照访问顺序；如果为false，则按照插入顺序
 */
public LinkedHashMap(int initialCapacity,
                     float loadFactor,
                     boolean accessOrder) {
    super(initialCapacity, loadFactor);
    this.accessOrder = accessOrder;
}

实现控制访问顺序的原理：
定义了entry对象前后指针，LinkedHashMap 重写了父类 HashMap 的 get 方法，实际在调用父类 getNode() 方法取得查找的元素后，再判断当排序模式 accessOrder 为 true 时，记录访问顺序。由于的链表的增加、删除操作是常量级的，故并不会带来性能的损失。源码如下：

/**
* LinkedHashMap的Entry元素。
* 继承HashMap的Entry元素，又保存了其上一个元素before和下一个元素after的引用。
 */
private static class Entry<K,V> extends HashMap.Entry<K,V> {
    Entry<K,V> before, after;
    ……
}
/**
* 获取key对应的value
*/
public V get(Object key) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) == null)
        return null;
    if (accessOrder)
        //将元素移动到数组末尾
        afterNodeAccess(e);
    return e.value;
}

// move node to last
void afterNodeAccess(Node<K,V> e) { 
    LinkedHashMap.Entry<K,V> last;
    if (accessOrder && (last = tail) != e) {
        LinkedHashMap.Entry<K,V> p =
            (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        p.after = null;
        if (b == null)
            head = a;
        else
            b.after = a;
        if (a != null)
            a.before = b;
        else
            last = b;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
        tail = p;
        ++modCount;
    }
}

5.3.ConcurrentHashMap

源码解析

在JDK1.7中ConcurrentHashMap采用了数组+Segment+分段锁的方式实现。

Segment(分段锁)
ConcurrentHashMap中的分段锁称为Segment，它即类似于HashMap的结构，即内部拥有一个Entry数组，数组中的每个元素又是一个链表,同时又是一个ReentrantLock（Segment继承了ReentrantLock）。
内部结构
ConcurrentHashMap使用分段锁技术，将数据分成一段一段的存储，然后给每一段数据配一把锁，当一个线程占用锁访问其中一个段数据的时候，其他段的数据也能被其他线程访问，能够实现真正的并发访问。如下图是ConcurrentHashMap的内部结构图：

ConcurrentHashMap结构图

从上面的结构我们可以了解到，ConcurrentHashMap定位一个元素的过程需要进行两次Hash操作。
第一次Hash定位到Segment，第二次Hash定位到元素所在的链表的头部。
该结构的优劣势
坏处：这一种结构的带来的副作用是Hash的过程要比普通的HashMap要长
好处：写操作的时候可以只对元素所在的Segment进行加锁即可，不会影响到其他的Segment，这样，在最理想的情况下，ConcurrentHashMap可以最高同时支持Segment数量大小的写操作（刚好这些写操作都非常平均地分布在所有的Segment上）。所以，通过这一种结构，ConcurrentHashMap的并发能力可以大大的提高。

JDK1.8

JDK1.7：ReentrantLock+Segment+HashEntry
JDK1.8：Synchronized+CAS+Node+红黑树

JDK1.8屏蔽了JDK1.7中的Segment概念呢，而是直接使用「Node数组+链表+红黑树」的数据结构来实现，并发控制采用「Synchronized + CAS机制」来确保安全性
并发下的处理：由于每一个Node的首节点都会被synchronized修饰，从而将一个元素的新增转化为一个原子操作，同时Node的value和next都是由volatile关键字进行修饰，从而可以保证可见性。
JDK1.8锁的粒度就是HashEntry（首节点），也就是1.8中加锁力度更低了，在粗粒度加锁中 ReentrantLock 可能通过 Condition 来控制各个低粒度的边界，更加的灵活，而在低粒度中，Condition的优势就没有了，所以使用内置的 Synchronize 并不比ReentrantLock效果差。

红黑树转换结构图

5.4.HashTable

底层也是哈希表，线程安全的，不能存储null值

结构图

5.5.Map.Entry内部类

每一个键值对都对应一个Map.Entry对象

5.6.遍历Map集合

Set<K> keySet()
Set<Map.Entry<K,V>> entrySet()

集合总结