1.ArrayList简介
ArrayList继承关系如下图:
ArrayList继承了AbstractList类,并实现了List<E>, RandomAccess, Cloneable, java.io.Serializable等接口。其官方描述为 Resizable-array implementation of the List interface。阅读源码doc,如下:
* Resizable-array implementation of the <tt>List</tt> interface. Implements
* all optional list operations, and permits all elements, including
* <tt>null</tt>. In addition to implementing the <tt>List</tt> interface,
* this class provides methods to manipulate the size of the array that is
* used internally to store the list. (This class is roughly equivalent to
* <tt>Vector</tt>, except that it is unsynchronized.)
*
即用可变数组对List接口的一种实现。可以允许包括null在内的所有元素,提供了一系列通过操作内部数组大小来实现改变list的方法。等价于不加synchronized的Vector。(即Vector是加锁的ArrayList)。
* <p>The <tt>size</tt>, <tt>isEmpty</tt>, <tt>get</tt>, <tt>set</tt>,
* <tt>iterator</tt>, and <tt>listIterator</tt> operations run in constant
* time. The <tt>add</tt> operation runs in <i>amortized constant time</i>,
* that is, adding n elements requires O(n) time. All of the other operations
* run in linear time (roughly speaking). The constant factor is low compared
* to that for the <tt>LinkedList</tt> implementation.
*
size、isEmpty、get、set、iterator、listIterator这些方法都是在恒定的时间内运行,大致来说,add方法时间复杂度O(n),其他的操作时间都在线性范围内,但是稳定性不如LinkedList的实现。
* <p>Each <tt>ArrayList</tt> instance has a <i>capacity</i>. The capacity is
* the size of the array used to store the elements in the list. It is always
* at least as large as the list size. As elements are added to an ArrayList,
* its capacity grows automatically. The details of the growth policy are not
* specified beyond the fact that adding an element has constant amortized
* time cost.
<p>An application can increase the capacity of an <tt>ArrayList</tt> instance
* before adding a large number of elements using the <tt>ensureCapacity</tt>
* operation. This may reduce the amount of incremental reallocation.
*
每个ArrayList都有一个容量capacity,表示存储列表中各元素的数组的大小,这个容量至少和list的size一样大,当有新的元素被添加到ArrayList时,容量会增长扩容,扩容会带来额外的时间开销。
因此在增加大量元素之前,可以指定合适的容量以减少扩容的次数从而降低额外的时间开销。
* <p><strong>Note that this implementation is not synchronized.</strong>
* If multiple threads access an <tt>ArrayList</tt> instance concurrently,
* and at least one of the threads modifies the list structurally, it
* <i>must</i> be synchronized externally. (A structural modification is
* any operation that adds or deletes one or more elements, or explicitly
* resizes the backing array; merely setting the value of an element is not
* a structural modification.) This is typically accomplished by
* synchronizing on some object that naturally encapsulates the list.
* If no such object exists, the list should be "wrapped" using the
* {@link Collections#synchronizedList Collections.synchronizedList}
* method. This is best done at creation time, to prevent accidental
* unsynchronized access to the list:<pre>
* List list = Collections.synchronizedList(new ArrayList(...));</pre>
ArrayList不是同步的,在多线程的情况下无法保证线程安全,因此如果要在多线程下操作,必须在外部加synchronized锁。 如果没有这样的对象,请用 Collections.synchronizedList方法在创建时加锁。
* <p><a name="fail-fast">
* The iterators returned by this class's {@link #iterator() iterator} and
* {@link #listIterator(int) listIterator} methods are <em>fail-fast</em>:</a>
* if the list is structurally modified at any time after the iterator is
* created, in any way except through the iterator's own
* {@link ListIterator#remove() remove} or
* {@link ListIterator#add(Object) add} methods, the iterator will throw a
* {@link ConcurrentModificationException}. Thus, in the face of
* concurrent modification, the iterator fails quickly and cleanly, rather
* than risking arbitrary, non-deterministic behavior at an undetermined
* time in the future.
*
* <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
* as it is, generally speaking, impossible to make any hard guarantees in the
* presence of unsynchronized concurrent modification. Fail-fast iterators
* throw {@code ConcurrentModificationException} on a best-effort basis.
* Therefore, it would be wrong to write a program that depended on this
* exception for its correctness: <i>the fail-fast behavior of iterators
* should be used only to detect bugs.</i>
*
* <p>This class is a member of the
* <a href="{@docRoot}/../technotes/guides/collections/index.html">
* Java Collections Framework</a>.
当一个线程在进行iterators操作时,如果有其他的线程对ArrayList进行了修改,则会触发fail-fast机制,抛出ConcurrentModificationException异常。通常在 ListIterator#remove()或 ListIterator#add(Object) 这两种情况。但是再非同步的环境下,fail-fast并不确保每次都会触发,因此依赖此异常的代码都是错误的,这个异常仅仅只是为了发现bug。
2.实例化过程
/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;
/**
* Shared empty array instance used for empty instances.
*/
private static final Object[] EMPTY_ELEMENTDATA = {};
/**
* Shared empty array instance used for default sized empty instances. We
* distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
* first element is added.
*/
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
/**
* The array buffer into which the elements of the ArrayList are stored.
* The capacity of the ArrayList is the length of this array buffer. Any
* empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
* will be expanded to DEFAULT_CAPACITY when the first element is added.
*/
transient Object[] elementData; // non-private to simplify nested class access
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
可以看到,在ArrayList内部,定义了一个elementData变量,为Object数组,用于存储ArrayList内部元素。其默认的构造方法则初始化了一个空数组DEFAULTCAPACITY_EMPTY_ELEMENTDATA。
/**
* Constructs an empty list with the specified initial capacity.
*
* @param initialCapacity the initial capacity of the list
* @throws IllegalArgumentException if the specified initial capacity
* is negative
*/
public ArrayList(int initialCapacity) {
if (initialCapacity > 0) {
this.elementData = new Object[initialCapacity];
} else if (initialCapacity == 0) {
this.elementData = EMPTY_ELEMENTDATA;
} else {
throw new IllegalArgumentException("Illegal Capacity: "+
initialCapacity);
}
}
当然也可以根据数据量大小指定数组容量。
3.add过程
当ArrayList增加元素时,其内部的数组会进行扩容。对其add方法进行分析:
/**
* Appends the specified element to the end of this list.
*
* @param e element to be appended to this list
* @return <tt>true</tt> (as specified by {@link Collection#add})
*/
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
可以发现,只有在调用add方法向List内添加元素的时候,会调用一个ensureCapacityInternal方法,创建一个新的内部数组:
/**
* Increases the capacity of this <tt>ArrayList</tt> instance, if
* necessary, to ensure that it can hold at least the number of elements
* specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
public void ensureCapacity(int minCapacity) {
int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
// any size if not default element table
? 0
// larger than default for default empty table. It's already
// supposed to be at default size.
: DEFAULT_CAPACITY;
if (minCapacity > minExpand) {
ensureExplicitCapacity(minCapacity);
}
}
private void ensureCapacityInternal(int minCapacity) {
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}
ensureExplicitCapacity(minCapacity);
}
private void ensureExplicitCapacity(int minCapacity) {
modCount++;
// overflow-conscious code
if (minCapacity - elementData.length > 0)
grow(minCapacity);
}
重点在grow方法上,实例化的时候是一个空数组,那么当add元素的时候,ensureCapacity必须确保容量为minCapacity,初始的容量是 DEFAULT_CAPACITY,这个数值是10:
/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;
增加第一个元素的时候minCapacity为10,调用grow方法将内部数组扩容到10
/**
* Increases the capacity to ensure that it can hold at least the
* number of elements specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
// overflow-conscious code
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
elementData = Arrays.copyOf(elementData, newCapacity);
}
对于grow函数,需要注意的是,扩容的逻辑 newCapacity = oldCapacity + (oldCapacity >> 1),采用右移及将扩容的增量为扩容之前容量除2。即其内部数组的大小增长为: 10、15、22、33、49。。。。
另外扩容的方法采用数组Arrays.copyOf方法,其底层为:
public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
@SuppressWarnings("unchecked")
T[] copy = ((Object)newType == (Object)Object[].class)
? (T[]) new Object[newLength]
: (T[]) Array.newInstance(newType.getComponentType(), newLength);
System.arraycopy(original, 0, copy, 0,
Math.min(original.length, newLength));
return copy;
}
public static native void arraycopy(Object src, int srcPos,
Object dest, int destPos,
int length);
可以发现arraycopy实际上是个native方法,动态库c实现。这也是数组拷贝的常用方法,经常出现在面试过程中。
4.remove过程
/**
* Removes the element at the specified position in this list.
* Shifts any subsequent elements to the left (subtracts one from their
* indices).
*
* @param index the index of the element to be removed
* @return the element that was removed from the list
* @throws IndexOutOfBoundsException {@inheritDoc}
*/
public E remove(int index) {
rangeCheck(index);
modCount++;
E oldValue = elementData(index);
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // clear to let GC do its work
return oldValue;
}
调用remove函数,将指定的元素进行移除,同样也是采用的system.arraycodpy方法,将数组内部进行移动,之后将最后末尾的引用设置为null,让GC操作可以回收释放的对象。但是需要注意的是该数组并没有变小,如果需要将数组多余的空间回收,需要调用trimToSize方法:
/**
* Trims the capacity of this <tt>ArrayList</tt> instance to be the
* list's current size. An application can use this operation to minimize
* the storage of an <tt>ArrayList</tt> instance.
*/
public void trimToSize() {
modCount++;
if (size < elementData.length) {
elementData = (size == 0)
? EMPTY_ELEMENTDATA
: Arrays.copyOf(elementData, size);
}
}
也就是说,如果内存比较紧张的情况下,在对一个大的ArrayList进行了remove操作之后,最好是调用trimToSize回收部分空间。底层还是用System.arraycopy方法,这个方法也是以后对于数组的面试问题需要关注的重点。
如果再需要更深层次的实现,那么需要看jdk源码,对此方法进行分析。
5.fail-fast机制
fail-fast 机制是java集合(Collection)中的一种错误机制。它只能被用来检测错误,因为JDK并不保证fail-fast机制一定会发生。当多个线程对同一个集合的内容进行操作时,就可能会产生fail-fast事件。例如:当某一个线程A通过iterator去遍历某集合的过程中,若该集合的内容被其他线程所改变了;那么线程A访问集合时,就会抛出ConcurrentModificationException异常,产生fail-fast事件。这种“ 及时失败” 的迭代器井不是一种完备的处理机制,而只是“ 善意地” 捕获并发错误,因此只能作为并发问题的预警指示器。
其实现如下:
/**
* The number of times this list has been <i>structurally modified</i>.
* Structural modifications are those that change the size of the
* list, or otherwise perturb it in such a fashion that iterations in
* progress may yield incorrect results.
*
* <p>This field is used by the iterator and list iterator implementation
* returned by the {@code iterator} and {@code listIterator} methods.
* If the value of this field changes unexpectedly, the iterator (or list
* iterator) will throw a {@code ConcurrentModificationException} in
* response to the {@code next}, {@code remove}, {@code previous},
* {@code set} or {@code add} operations. This provides
* <i>fail-fast</i> behavior, rather than non-deterministic behavior in
* the face of concurrent modification during iteration.
*
* <p><b>Use of this field by subclasses is optional.</b> If a subclass
* wishes to provide fail-fast iterators (and list iterators), then it
* merely has to increment this field in its {@code add(int, E)} and
* {@code remove(int)} methods (and any other methods that it overrides
* that result in structural modifications to the list). A single call to
* {@code add(int, E)} or {@code remove(int)} must add no more than
* one to this field, or the iterators (and list iterators) will throw
* bogus {@code ConcurrentModificationExceptions}. If an implementation
* does not wish to provide fail-fast iterators, this field may be
* ignored.
*/
protected transient int modCount = 0;
用modCount 表示list被修改的次数,当使用iterator的时候
public Iterator<E> iterator() {
return listIterator();
}
public ListIterator<E> listIterator(final int index) {
checkForComodification();
rangeCheckForAdd(index);
final int offset = this.offset;
return new ListIterator<E>() {
int cursor = index;
int lastRet = -1;
int expectedModCount = ArrayList.this.modCount;
public boolean hasNext() {
return cursor != SubList.this.size;
}
@SuppressWarnings("unchecked")
public E next() {
checkForComodification();
int i = cursor;
if (i >= SubList.this.size)
throw new NoSuchElementException();
Object[] elementData = ArrayList.this.elementData;
if (offset + i >= elementData.length)
throw new ConcurrentModificationException();
cursor = i + 1;
return (E) elementData[offset + (lastRet = i)];
}
public boolean hasPrevious() {
return cursor != 0;
}
@SuppressWarnings("unchecked")
public E previous() {
checkForComodification();
int i = cursor - 1;
if (i < 0)
throw new NoSuchElementException();
Object[] elementData = ArrayList.this.elementData;
if (offset + i >= elementData.length)
throw new ConcurrentModificationException();
cursor = i;
return (E) elementData[offset + (lastRet = i)];
}
@SuppressWarnings("unchecked")
public void forEachRemaining(Consumer<? super E> consumer) {
Objects.requireNonNull(consumer);
final int size = SubList.this.size;
int i = cursor;
if (i >= size) {
return;
}
final Object[] elementData = ArrayList.this.elementData;
if (offset + i >= elementData.length) {
throw new ConcurrentModificationException();
}
while (i != size && modCount == expectedModCount) {
consumer.accept((E) elementData[offset + (i++)]);
}
// update once at end of iteration to reduce heap write traffic
lastRet = cursor = i;
checkForComodification();
}
public int nextIndex() {
return cursor;
}
public int previousIndex() {
return cursor - 1;
}
public void remove() {
if (lastRet < 0)
throw new IllegalStateException();
checkForComodification();
try {
SubList.this.remove(lastRet);
cursor = lastRet;
lastRet = -1;
expectedModCount = ArrayList.this.modCount;
} catch (IndexOutOfBoundsException ex) {
throw new ConcurrentModificationException();
}
}
public void set(E e) {
if (lastRet < 0)
throw new IllegalStateException();
checkForComodification();
try {
ArrayList.this.set(offset + lastRet, e);
} catch (IndexOutOfBoundsException ex) {
throw new ConcurrentModificationException();
}
}
public void add(E e) {
checkForComodification();
try {
int i = cursor;
SubList.this.add(i, e);
cursor = i + 1;
lastRet = -1;
expectedModCount = ArrayList.this.modCount;
} catch (IndexOutOfBoundsException ex) {
throw new ConcurrentModificationException();
}
}
final void checkForComodification() {
if (expectedModCount != ArrayList.this.modCount)
throw new ConcurrentModificationException();
}
};
}
hasNext方法判断是否还有下一个元素,再next方法中,
public E next() {
checkForComodification();
int i = cursor;
if (i >= SubList.this.size)
throw new NoSuchElementException();
Object[] elementData = ArrayList.this.elementData;
if (offset + i >= elementData.length)
throw new ConcurrentModificationException();
cursor = i + 1;
return (E) elementData[offset + (lastRet = i)];
}
重点是通过 checkForComodification();先做了一次check操作。
final void checkForComodification() {
if (expectedModCount != ArrayList.this.modCount)
throw new ConcurrentModificationException();
}
而在开始初始化的时候已经将modCount进行了保存
int expectedModCount = ArrayList.this.modCount;
此时只需要通过 if (expectedModCount != ArrayList.this.modCount) 进行比较,如果发现不一致,则会抛出ConcurrentModificationException异常。
在实际过程中,对于此异常的解决办法主要有如下两个方案:
方案一:在遍历过程中所有涉及到改变modCount值得地方全部加上synchronized或者直接使用Collections.synchronizedList,这样就可以解决。但是不推荐,因为增删造成的同步锁可能会阻塞遍历操作。
方案二:使用CopyOnWriteArrayList来替换ArrayList。推荐使用该方案。CopyOnWriteArrayList 是ArrayList 的一个线程安全的变体,其中所有可变操作(add、set 等等)都是通过对底层数组进行一次新的复制来实现的。 该类产生的开销比较大,但是在两种情况下,它非常适合使用。1:在不能或不想进行同步遍历,但又需要从并发线程中排除冲突时。2:当遍历操作的数量大大超过可变操作的数量时。遇到这两种情况使用CopyOnWriteArrayList来替代ArrayList再适合不过了。
后续将对CopyOnWriteArrayList源码进行分析。