Preface
This G1 series is mainly based on the book 《G1源码分析和调优》 and the openjdk-8u40 source code.
The earlier post 【Java对象的创建过程】 also touched on object allocation, but not in much detail; this post walks through the complete allocation path again, this time from G1's point of view.
Object allocation diagram // to be added
Fast allocation
The purpose of the TLAB is fast allocation. The heap is shared by all threads, so allocating from it directly has to deal with multi-threaded contention. To ease this, the JVM hands each thread its own buffer, carved out of the heap, so that most allocations need no locking: this private, per-thread buffer is the TLAB (Thread Local Allocation Buffer).
An Eden region may hold several TLABs, but a TLAB never spans more than one Eden region. Allocating an object from a TLAB works as follows:
HeapWord* CollectedHeap::allocate_from_tlab(KlassHandle klass, Thread* thread, size_t size) {
HeapWord* obj = thread->tlab().allocate(size);
if (obj != NULL) {
return obj;
}
// Otherwise...
return allocate_from_tlab_slow(klass, thread, size);
}
tlab().allocate() allocates the object inside the TLAB using bump-the-pointer allocation: the TLAB keeps a top pointer marking the current allocation position, and if the remaining space is at least the requested size, the object is placed there and top is simply advanced (top = top + objSize).
The code is as follows:
inline HeapWord* ThreadLocalAllocBuffer::allocate(size_t size) {
invariants();
HeapWord* obj = top();
if (pointer_delta(end(), obj) >= size) {
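// (In the full source the two lines below sit under #ifdef ASSERT: only debug
//  builds pre-fill the new object's body with badHeapWordVal.)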
size_t hdr_size = oopDesc::header_size();
Copy::fill_to_words(obj + hdr_size, size - hdr_size, badHeapWordVal);
set_top(obj + size);
invariants();
return obj;
}
return NULL;
}
This raises a question: how do we decide that a TLAB is full? We obviously cannot declare it full the first time an object fails to fit. The TLAB therefore keeps a refill_waste threshold (controlled by the JVM flag TLABRefillWasteFraction, default 64), which allows up to 1/64 of the TLAB to be wasted: with a 1 MB TLAB, the TLAB is treated as full once the remaining free space is <= 16 KB. In addition, the flag TLABWasteIncrement raises this threshold a little each time an allocation has to fall back to the shared Eden space. These thresholds are adjusted continuously at run time so that allocation stays efficient.
You can disable automatic TLAB resizing with -XX:-ResizeTLAB and set the size manually with -XX:TLABSize, but this is generally not recommended: the JVM usually does a better job on its own.
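To make the threshold arithmetic concrete, here is a minimal C++ sketch (not HotSpot code; Tlab and should_retire are invented names) of the retire decision driven by TLABRefillWasteFraction and TLABWasteIncrement:

#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for a TLAB, sized in heap words (8 bytes each on 64-bit).
struct Tlab {
  size_t size_words;          // total TLAB capacity
  size_t top_words;           // words already handed out
  size_t refill_waste_limit;  // free space we are still willing to throw away

  size_t free_words() const { return size_words - top_words; }

  // Mirrors the decision in allocate_from_tlab_slow: keep the TLAB (and allocate
  // this one object from shared Eden) while too much space would be wasted,
  // otherwise retire it and ask for a fresh TLAB.
  bool should_retire(size_t waste_increment) {
    if (free_words() > refill_waste_limit) {
      refill_waste_limit += waste_increment;  // corresponds to TLABWasteIncrement
      return false;
    }
    return true;
  }
};

int main() {
  // A 1 MB TLAB is 131072 words; 1/64 of that (TLABRefillWasteFraction = 64)
  // is 2048 words, i.e. 16 KB.
  Tlab t{131072, 130500, 131072 / 64};
  std::printf("free=%zu words, limit=%zu words, retire=%d\n",
              t.free_words(), t.refill_waste_limit, t.should_retire(4));
  return 0;
}

Here the remaining 572 words are below the 2048-word limit, so the TLAB would be retired and replaced.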
If the fast-path allocation fails, allocate_from_tlab_slow takes over:
HeapWord* CollectedHeap::allocate_from_tlab_slow(KlassHandle klass, Thread* thread, size_t size) {
// Retain the TLAB if its free space is still larger than the refill_waste threshold (controlled by TLABRefillWasteFraction): record the slow allocation and return NULL, and the object will be allocated from the shared Eden space instead
if (thread->tlab().free() > thread->tlab().refill_waste_limit()) {
//record the slow allocation and raise the refill_waste threshold by TLABWasteIncrement
thread->tlab().record_slow_allocation(size);
return NULL;
}
// Discard the current TLAB and size up a new one
size_t new_tlab_size = thread->tlab().compute_size(size);
//Retire the old TLAB (fill its unused tail with a dummy object)
thread->tlab().clear_before_allocation();
if (new_tlab_size == 0) {
return NULL;
}
// Allocate a new TLAB...
HeapWord* obj = Universe::heap()->allocate_new_tlab(new_tlab_size);
if (obj == NULL) {
return NULL;
}
//... (snipped)
return obj;
}
The "clearing" done by clear_before_allocation does not reclaim anything: it fills the not-yet-allocated tail of the TLAB with a dummy object (typically an int[]). The source comments say this keeps the heap parsable, which matters for garbage collection: when the GC scans the heap it walks from object to object, skipping each one by its size, and a stretch of raw memory with no object in it would have to be scanned word by word, which is slow. The dummy filler object gives the GC something to skip over in a single step.
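A rough illustration of why the filler helps, as a standalone sketch (not HotSpot code; the ObjHeader layout is invented): a heap walker hops from header to header by object size, and the dummy int[] gives the retired TLAB's tail a header too, so the gap is skipped in one step:

#include <cstddef>
#include <cstdio>
#include <vector>

// Invented layout: every object (and filler) starts with its size in words.
struct ObjHeader {
  size_t size_in_words;
  bool   is_filler;
};

int main() {
  // Simulated heap: a live object, then the dummy int[] that clear_before_allocation
  // wrote over the retired TLAB's unused tail, then an object from the next TLAB.
  std::vector<ObjHeader> heap = {
      {8,  false},
      {16, true},
      {4,  false},
  };
  size_t addr = 0;
  for (const ObjHeader& h : heap) {
    std::printf("word %3zu: %-6s %2zu words -> next at %zu\n",
                addr, h.is_filler ? "filler" : "object",
                h.size_in_words, addr + h.size_in_words);
    addr += h.size_in_words;  // skip by size, no word-by-word scan of the gap
  }
  return 0;
}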
Next, allocate_new_tlab, which allocates the new TLAB:
HeapWord* G1CollectedHeap::allocate_new_tlab(size_t word_size) {
unsigned int dummy_gc_count_before;
int dummy_gclocker_retry_count = 0;
return attempt_allocation(word_size, &dummy_gc_count_before, &dummy_gclocker_retry_count);
}
inline HeapWord* G1CollectedHeap::attempt_allocation(size_t word_size,
unsigned int* gc_count_before_ret,
int* gclocker_retry_count_ret) {
AllocationContext_t context = AllocationContext::current();
//First try a fast, lock-free (CAS) allocation in the current mutator alloc region
HeapWord* result = _allocator->mutator_alloc_region(context)->attempt_allocation(word_size, false /* bot_updates */);
//If that fails, fall back to the slow path
if (result == NULL) {
result = attempt_allocation_slow(word_size,
context,
gc_count_before_ret,
gclocker_retry_count_ret);
}
if (result != NULL) {
dirty_young_block(result, word_size);
}
return result;
}
Next, the CAS-based parallel allocation: attempt_allocation -> par_allocate -> par_allocate_no_bot_updates -> par_allocate_impl
// This version is lock-free.
inline HeapWord* G1OffsetTableContigSpace::par_allocate_impl(size_t size,
HeapWord* const end_value) {
do {
HeapWord* obj = top();
if (pointer_delta(end_value, obj) >= size) {
HeapWord* new_top = obj + size;
HeapWord* result = (HeapWord*)Atomic::cmpxchg_ptr(new_top, top_addr(), obj);
// result can be one of two:
// the old top value: the exchange succeeded
// otherwise: the new value of the top is returned.
if (result == obj) {
assert(is_aligned(obj) && is_aligned(new_top), "checking alignment");
return obj;
}
} else {
return NULL;
}
} while (true);
}
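The same lock-free bump-the-pointer pattern, sketched with std::atomic outside HotSpot (Region and par_allocate here are invented names; HotSpot itself uses Atomic::cmpxchg_ptr on the region's top pointer):

#include <atomic>
#include <cstddef>
#include <cstdio>

struct Region {
  char*              base;
  std::atomic<char*> top;
  char*              end;

  // Lock-free allocation: retry the CAS until we either claim [obj, obj+bytes)
  // or discover the region is full.
  char* par_allocate(size_t bytes) {
    char* obj = top.load(std::memory_order_relaxed);
    do {
      if (static_cast<size_t>(end - obj) < bytes) {
        return nullptr;  // not enough room left in this region
      }
      // On failure compare_exchange_weak refreshes obj with the current top,
      // so the loop simply retries against the new value.
    } while (!top.compare_exchange_weak(obj, obj + bytes));
    return obj;
  }
};

int main() {
  static char storage[1024];
  Region r;
  r.base = storage;
  r.top.store(storage);
  r.end  = storage + sizeof(storage);

  char* a = r.par_allocate(128);
  char* b = r.par_allocate(128);
  std::printf("a=%p b=%p\n", static_cast<void*>(a), static_cast<void*>(b));
  return 0;
}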
If the CAS allocation fails, the slow path attempt_allocation_slow runs:
HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size,
AllocationContext_t context,
unsigned int *gc_count_before_ret,
int* gclocker_retry_count_ret) {
HeapWord* result = NULL;
for (int try_count = 1; /* we'll return */; try_count += 1) {
bool should_try_gc;
unsigned int gc_count_before;
{
//Allocate while holding the Heap_lock
MutexLockerEx x(Heap_lock);
result = _allocator->mutator_alloc_region(context)->attempt_allocation_locked(word_size,
false /* bot_updates */);
if (result != NULL) {
return result;
}
// If we reach here, attempt_allocation_locked() above failed to
// allocate a new region. So the mutator alloc region should be NULL.
//Allocation failed; if a GC is pending behind the GC_locker, try to expand the young list and force-allocate a new region
if (GC_locker::is_active_and_needs_gc()) {
if (g1_policy()->can_expand_young_list()) {
// No need for an ergo verbose message here,
// can_expand_young_list() does this when it returns true.
result = _allocator->mutator_alloc_region(context)->attempt_allocation_force(word_size,
false /* bot_updates */);
if (result != NULL) {
return result;
}
}
should_try_gc = false;
} else {
if (GC_locker::needs_gc()) {
should_try_gc = false;
} else {
// Read the GC count while still holding the Heap_lock.
gc_count_before = total_collections();
should_try_gc = true;
}
}
}
//Still no luck; try a GC
if (should_try_gc) {
bool succeeded;
//No thread is inside a GC_locker critical section, so a collection pause can be scheduled
result = do_collection_pause(word_size, gc_count_before, &succeeded,
GCCause::_g1_inc_collection_pause);
if (result != NULL) {
return result;
}
if (succeeded) {
//The GC pause ran but allocation still failed; record the GC count and return NULL so the caller can decide what to do next
MutexLockerEx x(Heap_lock);
*gc_count_before_ret = total_collections();
return NULL;
}
} else {
//Give up once the GC_locker retry count exceeds the threshold
if (*gclocker_retry_count_ret > GCLockerRetryAllocationCount) {
MutexLockerEx x(Heap_lock);
*gc_count_before_ret = total_collections();
return NULL;
}
GC_locker::stall_until_clear();
(*gclocker_retry_count_ret) += 1;
}
// Another thread may have allocated a region, or the GC_locker stall may have freed space, so try one more lock-free allocation before taking the lock again
result = _allocator->mutator_alloc_region(context)->attempt_allocation(word_size,
false /* bot_updates */);
if (result != NULL) {
return result;
}
if ((QueuedAllocationWarningCount > 0) &&
(try_count % QueuedAllocationWarningCount == 0)) {
warning("G1CollectedHeap::attempt_allocation_slow() "
"retries %d times", try_count);
}
}
ShouldNotReachHere();
return NULL;
}
Note: the GC_locker is tied to JNI. While native code is inside a JNI critical region (for example between GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical), garbage collection must not run; it is deferred until the critical region is exited.
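For reference, this is what such a critical region looks like on the JNI side (a hedged sketch; the class name Example and method sumBytes are made up). Between GetPrimitiveArrayCritical and ReleasePrimitiveArrayCritical the VM must not move or free the array, so a GC that becomes necessary in that window is deferred via the GC_locker:

#include <jni.h>

// Hypothetical native method for a Java class `Example` declaring
// `static native long sumBytes(byte[] arr);`
extern "C" JNIEXPORT jlong JNICALL
Java_Example_sumBytes(JNIEnv* env, jclass, jbyteArray arr) {
  jlong sum = 0;
  jsize len = env->GetArrayLength(arr);

  // Enter the JNI critical region: GC is held off until the Release call.
  void* data = env->GetPrimitiveArrayCritical(arr, nullptr);
  if (data == nullptr) {
    return 0;  // pinning/copy failed
  }
  const jbyte* bytes = static_cast<const jbyte*>(data);
  for (jsize i = 0; i < len; i++) {
    sum += bytes[i];
  }
  // Leave the critical region; a GC that was requested in the meantime
  // (GC_locker::needs_gc() in the code above) can now run.
  env->ReleasePrimitiveArrayCritical(arr, data, JNI_ABORT);
  return sum;
}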
The logic above is still all TLAB-related. If the object cannot be satisfied that way, G1 also tries to allocate it directly from the heap:
HeapWord*
G1CollectedHeap::mem_allocate(size_t word_size,
bool* gc_overhead_limit_was_exceeded) {
assert_heap_not_locked_and_not_at_safepoint();
// Loop until the allocation is satisfied, or unsatisfied after GC.
for (int try_count = 1, gclocker_retry_count = 0; /* we'll return */; try_count += 1) {
unsigned int gc_count_before;
HeapWord* result = NULL;
if (!isHumongous(word_size)) {
result = attempt_allocation(word_size, &gc_count_before, &gclocker_retry_count);
} else { // humongous objects are allocated in humongous regions, which G1 treats as part of the old generation
result = attempt_allocation_humongous(word_size, &gc_count_before, &gclocker_retry_count);
}
if (result != NULL) {
return result;
}
// Allocation failed: have the VM thread run a collection pause (which can escalate to a Full GC if it still cannot allocate) and retry
VM_G1CollectForAllocation op(gc_count_before, word_size);
op.set_allocation_context(AllocationContext::current());
// ...and get the VM thread to execute it.
VMThread::execute(&op);
if (op.prologue_succeeded() && op.pause_succeeded()) {
HeapWord* result = op.result();
if (result != NULL && !isHumongous(word_size)) {
dirty_young_block(result, word_size);
}
return result;
} else {
//the GC_locker retry count exceeded the threshold; give up
if (gclocker_retry_count > GCLockerRetryAllocationCount) {
return NULL;
}
assert(op.result() == NULL,
"the result should be NULL if the VM op did not succeed");
}
}
ShouldNotReachHere();
return NULL;
}
Now for the humongous (large object) allocation:
//Much the same flow as the slow TLAB path; the main difference is the object size
HeapWord* G1CollectedHeap::attempt_allocation_humongous(size_t word_size,
unsigned int * gc_count_before_ret,
int* gclocker_retry_count_ret) {
//A humongous allocation can push the heap towards the marking threshold, so start a concurrent marking cycle first if needed
if (g1_policy()->need_to_start_conc_mark("concurrent humongous allocation",
word_size)) {
collect(GCCause::_g1_humongous_allocation);
}
HeapWord* result = NULL;
for (int try_count = 1; /* we'll return */; try_count += 1) {
bool should_try_gc;
unsigned int gc_count_before;
{
MutexLockerEx x(Heap_lock);
// Attempt the humongous allocation while holding the Heap_lock; an object too large for a single region gets a set of contiguous regions
result = humongous_obj_allocate(word_size, AllocationContext::current());
if (result != NULL) {
return result;
}
if (GC_locker::is_active_and_needs_gc()) {
should_try_gc = false;
} else {
if (GC_locker::needs_gc()) {
should_try_gc = false;
} else {
// Read the GC count while still holding the Heap_lock.
gc_count_before = total_collections();
should_try_gc = true;
}
}
}
//On failure, try a GC pause
if (should_try_gc) {
bool succeeded;
result = do_collection_pause(word_size, gc_count_before, &succeeded,
GCCause::_g1_humongous_allocation);
if (result != NULL) {
assert(succeeded, "only way to get back a non-NULL result");
return result;
}
if (succeeded) {
MutexLockerEx x(Heap_lock);
*gc_count_before_ret = total_collections();
return NULL;
}
} else {
//the GC_locker retry count exceeded the threshold; give up
if (*gclocker_retry_count_ret > GCLockerRetryAllocationCount) {
MutexLockerEx x(Heap_lock);
*gc_count_before_ret = total_collections();
return NULL;
}
GC_locker::stall_until_clear();
(*gclocker_retry_count_ret) += 1;
}
}
ShouldNotReachHere();
return NULL;
}
Finally, there is one more allocation path (I never hit this branch while debugging):
HeapWord*
G1CollectedHeap::satisfy_failed_allocation(size_t word_size,
AllocationContext_t context,
bool* succeeded) {
*succeeded = true;
// Let's attempt the allocation first, once more before doing any GC.
HeapWord* result =
attempt_allocation_at_safepoint(word_size,
context,
false /* expect_null_mutator_alloc_region */);
if (result != NULL) {
assert(*succeeded, "sanity");
return result;
}
//Try to expand the heap and allocate from the newly added regions
result = expand_and_allocate(word_size, context);
if (result != NULL) {
return result;
}
// Expansion didn't work, we'll try to do a Full GC.
bool gc_succeeded = do_collection(false, /* explicit_gc */
false, /* clear_all_soft_refs: do not clear soft references */
word_size);
if (!gc_succeeded) {
*succeeded = false;
return NULL;
}
// Retry the allocation
result = attempt_allocation_at_safepoint(word_size,
context,
true /* expect_null_mutator_alloc_region */);
if (result != NULL) {
assert(*succeeded, "sanity");
return result;
}
// Then, try a Full GC that will collect all soft references.
gc_succeeded = do_collection(false, /* explicit_gc */
true, /* clear_all_soft_refs */
word_size);
if (!gc_succeeded) {
*succeeded = false;
return NULL;
}
// Retry the allocation once more
result = attempt_allocation_at_safepoint(word_size,
context,
true /* expect_null_mutator_alloc_region */);
if (result != NULL) {
assert(*succeeded, "sanity");
return result;
}
return NULL;
}