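This article first introduces how the Hotspot virtual machine abstracts threads, then outlines the thread startup flow, and finally analyzes in detail each function involved in starting a thread.
Thread abstractions in Hotspot
From the lowest level to the highest, Hotspot has the following kinds of threads:
- Platform-specific threads
- OSThread
- JavaThread
- java.lang.Thread instances
Platform-specific threads
A platform-specific thread is the concrete thread implementation of a particular operating system. Hotspot divides platforms into BSD, Linux, Solaris, Windows and others, which is reflected in the directory layout of the Hotspot source tree.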
OSThread
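OSThread is a data structure in the JVM that sits between the platform-specific thread and JavaThread; it abstracts and isolates the threads of different operating systems. The class is defined in hotspot/src/share/vm/runtime/osThread.hpp, and its important member variables are:
- _interrupted holds the interrupt status;
- _pthread_id is the ID that the pthread library's pthread_create function returns when it creates the thread;
- _thread_id is the kernel thread ID (available with NPTL), i.e. the LWP ID, which is visible under /proc.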
JavaThread
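JavaThread is another data structure in the JVM. It represents a Java thread and is defined in hotspot/src/share/vm/runtime/thread.hpp. Its important member variables are:
- _next is a JavaThread* pointer to the next JavaThread in the linked list;
- _threadObj is an oop that refers to the java.lang.Thread object on which start was called;
- _entry_point is the entry function, i.e. the code the thread will execute;
- _jni_environment is the JNI interface.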
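To make the layering concrete, here is a minimal sketch of how the three VM-side layers could point at each other. All type and function names (OSThreadModel, JavaThreadModel, JavaObjectModel, kernel_thread_id_of) are hypothetical stand-ins for illustration, not the real Hotspot declarations, and the eetop link is modelled as a plain pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the real classes have many more members.
struct OSThreadModel {
    std::uint64_t pthread_id = 0;     // models _pthread_id (value from pthread_create)
    int           thread_id = 0;      // models _thread_id (kernel thread / LWP id)
    bool          interrupted = false; // models _interrupted
};

struct JavaObjectModel;  // models a java.lang.Thread instance

struct JavaThreadModel {
    JavaThreadModel* next = nullptr;        // models _next
    JavaObjectModel* thread_obj = nullptr;  // models _threadObj
    OSThreadModel*   osthread = nullptr;    // link down to the OS-level layer
};

struct JavaObjectModel {
    JavaThreadModel* eetop = nullptr;  // assumption: modelled as a pointer for simplicity
};

// Walks from the Java-level object down to the kernel thread id.
// Returns -1 when the object has no native thread attached yet.
int kernel_thread_id_of(const JavaObjectModel& obj) {
    if (obj.eetop == nullptr || obj.eetop->osthread == nullptr) return -1;
    assert(obj.eetop->thread_obj == &obj);  // the link is expected to be two-way
    return obj.eetop->osthread->thread_id;
}
```

The point of the sketch is only the direction of the links: the Java object reaches the JavaThread, the JavaThread reaches the OSThread, and the OSThread carries the platform identifiers.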
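Starting a thread
The previous article mentioned that the Thread class has many JNI methods; this article analyzes start0. From the RegisterNatives method we know that when the JRE calls start0, the JVM ends up in JVM_StartThread. JVM_StartThread is defined in hotspot/src/share/vm/prims/jvm.cpp, and the relevant code after preprocessing is shown below. Its first parameter, env, is the JNI interface pointer; its second parameter, jthread, is the java.lang.Thread instance on which the Java code called start.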
extern "C" {
void JNICALL JVM_StartThread(JNIEnv* env, jobject jthread) {
JavaThread* thread=JavaThread::thread_from_jni_environment(env);
ThreadInVMfromNative __tiv(thread); HandleMarkCleaner __hm(thread);
Thread* __the_thread__ = thread;
os::verify_stack_alignment();
JavaThread *native_thread = __null;
bool throw_illegal_thread_state = false;
{
MutexLocker mu(Threads_lock);
if (java_lang_Thread::thread(JNIHandles::resolve_non_null(jthread)) != __null) {
throw_illegal_thread_state = true;
} else {
jlong size = java_lang_Thread::stackSize(JNIHandles::resolve_non_null(jthread));
size_t sz = size > 0 ? (size_t) size : 0;
native_thread = new JavaThread(&thread_entry, sz);
if (native_thread->osthread() != __null) {
native_thread->prepare(jthread);
}
}
}
if (throw_illegal_thread_state) {
{
Exceptions::_throw_msg(__the_thread__, vmSymbols::java_lang_IllegalThreadStateException(), __null);
return;
};
}
if (native_thread->osthread() == __null) {
delete native_thread;
if (JvmtiExport::should_post_resource_exhausted()) {
JvmtiExport::post_resource_exhausted(
JVMTI_RESOURCE_EXHAUSTED_OOM_ERROR | JVMTI_RESOURCE_EXHAUSTED_THREADS, "unable to create new native thread");
}
{
Exceptions::_throw_msg(__the_thread__, vmSymbols::java_lang_OutOfMemoryError(), "unable to create new native thread");
return;
}
}
Thread::start(native_thread);
}
}
static void thread_entry(JavaThread* thread, Thread* __the_thread__) {
HandleMark hm(__the_thread__);
Handle obj(__the_thread__, thread->threadObj());
JavaValue result(T_VOID);
JavaCalls::call_virtual(&result,
obj,
KlassHandle(__the_thread__, SystemDictionary::Thread_klass()),
vmSymbols::run_method_name(),
vmSymbols::void_method_signature(),
__the_thread__);
}
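The overall flow of starting a thread in Java is as follows:
- If the java.lang.Thread instance denoted by jthread (the thread about to be started) is already associated with a JavaThread, throw IllegalThreadStateException;
- Read the stackSize field of that java.lang.Thread instance, then call the JavaThread constructor with this value and the entry function thread_entry to create a new object;
- Add the JavaThread to the global thread list;
- Call Thread::start to start the JavaThread. If no OSThread could be created, an OutOfMemoryError with the message "unable to create new native thread" is thrown instead.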
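The flow above can be condensed into a small model. This is a sketch under stated assumptions: ThreadObjectModel, NativeModel, StartResult and start_thread_model are hypothetical names, exceptions are replaced by return codes, and the can_create_os_thread flag merely simulates whether native thread creation would succeed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

enum class StartResult { Started, IllegalThreadState, OutOfMemory };

struct NativeModel {            // hypothetical stand-in for JavaThread
    std::size_t stack_size;
    bool has_osthread;
    bool started;
};

struct ThreadObjectModel {      // hypothetical stand-in for java.lang.Thread
    long stack_size = 0;                  // models the stackSize field
    std::unique_ptr<NativeModel> native;  // set once a native thread is attached
};

// can_create_os_thread simulates whether native thread creation succeeds.
StartResult start_thread_model(ThreadObjectModel& obj, bool can_create_os_thread) {
    // Step 1: a second start on the same object is rejected.
    if (obj.native != nullptr) return StartResult::IllegalThreadState;

    // Step 2: negative or zero sizes mean "use the default" (encoded as 0).
    std::size_t sz = obj.stack_size > 0 ? static_cast<std::size_t>(obj.stack_size) : 0;
    std::unique_ptr<NativeModel> native(new NativeModel{sz, can_create_os_thread, false});
    if (!native->has_osthread) return StartResult::OutOfMemory;  // model is freed here

    // Step 3: link the Java object and the native thread (list insertion omitted).
    obj.native = std::move(native);

    // Step 4: release the thread so that it can run.
    assert(obj.native->has_osthread);
    obj.native->started = true;
    return StartResult::Started;
}
```

Calling start_thread_model twice on the same object yields Started and then IllegalThreadState, which mirrors the check at the top of JVM_StartThread.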
The JavaThread constructor
The JavaThread constructor is shown below; the list after the code summarizes what it does:
JavaThread::JavaThread(ThreadFunction entry_point, size_t stack_sz) : Thread()
, _satb_mark_queue(&_satb_mark_queue_set),
_dirty_card_queue(&_dirty_card_queue_set)
{
if (TraceThreadEvents) {
tty->print_cr("creating thread %p", this);
}
initialize();
_jni_attach_state = _not_attaching_via_jni;
set_entry_point(entry_point);
// Create the native thread itself.
// %note runtime_23
os::ThreadType thr_type = os::java_thread;
thr_type = entry_point == &compiler_thread_entry ? os::compiler_thread :
os::java_thread;
os::create_thread(this, thr_type, stack_sz);
}
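- initialize performs initialization, assigning default values such as NULL or 0 to the member variables;
- set_entry_point sets the new thread's entry function to the function that entry_point points to;
- os::create_thread has a different implementation on each platform; it creates the OSThread and the platform-specific thread for this JavaThread.
The os::create_thread function
Taking Linux as an example, os::create_thread is implemented in hotspot/src/os/linux/vm/os_linux.cpp. It first creates an OSThread object and associates the JavaThread with it, then creates a Linux thread with the pthread_create library function, and finally associates that Linux thread with the OSThread created earlier.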
bool os::create_thread(Thread* thread, ThreadType thr_type, size_t stack_size) {
assert(thread->osthread() == NULL, "caller responsible");
// Allocate the OSThread object
OSThread* osthread = new OSThread(NULL, NULL);
if (osthread == NULL) {
return false;
}
// set the correct thread state
osthread->set_thread_type(thr_type);
// Initial state is ALLOCATED but not INITIALIZED
osthread->set_state(ALLOCATED);
thread->set_osthread(osthread);
// init thread attributes
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
// stack size
if (os::Linux::supports_variable_stack_size()) {
// calculate stack size if it's not specified by caller
if (stack_size == 0) {
stack_size = os::Linux::default_stack_size(thr_type);
switch (thr_type) {
case os::java_thread:
// Java threads use ThreadStackSize which default value can be
// changed with the flag -Xss
assert (JavaThread::stack_size_at_create() > 0, "this should be set");
stack_size = JavaThread::stack_size_at_create();
break;
case os::compiler_thread:
if (CompilerThreadStackSize > 0) {
stack_size = (size_t)(CompilerThreadStackSize * K);
break;
} // else fall through:
// use VMThreadStackSize if CompilerThreadStackSize is not defined
case os::vm_thread:
case os::pgc_thread:
case os::cgc_thread:
case os::watcher_thread:
if (VMThreadStackSize > 0) stack_size = (size_t)(VMThreadStackSize * K);
break;
}
}
stack_size = MAX2(stack_size, os::Linux::min_stack_allowed);
pthread_attr_setstacksize(&attr, stack_size);
} else {
// let pthread_create() pick the default value.
}
// glibc guard page
pthread_attr_setguardsize(&attr, os::Linux::default_guard_size(thr_type));
ThreadState state;
{
// Serialize thread creation if we are running with fixed stack LinuxThreads
bool lock = os::Linux::is_LinuxThreads() && !os::Linux::is_floating_stack();
if (lock) {
os::Linux::createThread_lock()->lock_without_safepoint_check();
}
pthread_t tid;
int ret = pthread_create(&tid, &attr, (void* (*)(void*)) java_start, thread);
pthread_attr_destroy(&attr);
if (ret != 0) {
if (PrintMiscellaneous && (Verbose || WizardMode)) {
perror("pthread_create()");
}
// Need to clean up stuff we've allocated so far
thread->set_osthread(NULL);
delete osthread;
if (lock) os::Linux::createThread_lock()->unlock();
return false;
}
// Store pthread info into the OSThread
osthread->set_pthread_id(tid);
// Wait until child thread is either initialized or aborted
{
Monitor* sync_with_child = osthread->startThread_lock();
MutexLockerEx ml(sync_with_child, Mutex::_no_safepoint_check_flag);
while ((state = osthread->get_state()) == ALLOCATED) {
sync_with_child->wait(Mutex::_no_safepoint_check_flag);
}
}
if (lock) {
os::Linux::createThread_lock()->unlock();
}
}
// Aborted due to thread limit being reached
if (state == ZOMBIE) {
thread->set_osthread(NULL);
delete osthread;
return false;
}
// The thread is returned suspended (in state INITIALIZED),
// and is started higher up in the call chain
assert(state == INITIALIZED, "race condition");
return true;
}
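The function mainly does the following:
- Creates the OSThread object and sets its state to the initial value ALLOCATED (not INITIALIZED); the Thread that the thread parameter points to then stores the OSThread pointer through set_osthread, which associates the Thread with the OSThread;
- Calls the pthread_attr family of library functions to set the attributes of the Linux thread about to be created, such as stack size and detach state;
- Calls pthread_create with these attributes to create the Linux thread, which executes the java_start function;
- Stores the Linux thread's ID in the OSThread's _pthread_id field, which associates the Linux thread with the OSThread created in the first step;
- Waits until the Linux thread has either finished initializing or given up. If it initializes normally, the state of its OSThread becomes INITIALIZED; if it gives up, the state becomes ZOMBIE.
If the function succeeds it returns true, and the _osthread member of the Thread object that thread points to is guaranteed to be non-NULL. If it fails, for example when pthread_create fails, the thread limit is reached, or the Linux thread gives up on its own, it returns false and the _osthread member is set to NULL.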
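The stack size selection inside os::create_thread is easy to lose among the pthread calls, so here is a simplified model of it. The names StackConfig, ThreadKind and choose_stack_size are hypothetical, the thread kinds are reduced to three, and all values are plain byte counts instead of VM flags; the fall-through in the original switch is written as an if/else chain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

enum class ThreadKind { Java, Compiler, Vm };

struct StackConfig {               // all values in bytes; hypothetical names
    std::size_t platform_default;  // models os::Linux::default_stack_size
    std::size_t java_stack;        // models JavaThread::stack_size_at_create
    std::size_t compiler_stack;    // 0 means "not set"
    std::size_t vm_stack;          // 0 means "not set"
    std::size_t min_allowed;       // models os::Linux::min_stack_allowed
};

std::size_t choose_stack_size(ThreadKind kind, std::size_t requested,
                              const StackConfig& cfg) {
    std::size_t size = requested;
    if (size == 0) {                      // the caller did not specify a size
        size = cfg.platform_default;
        if (kind == ThreadKind::Java) {
            assert(cfg.java_stack > 0);   // expected to be set at VM startup
            size = cfg.java_stack;
        } else if (kind == ThreadKind::Compiler && cfg.compiler_stack > 0) {
            size = cfg.compiler_stack;
        } else if (cfg.vm_stack > 0) {    // compiler threads fall back to this too
            size = cfg.vm_stack;
        }
    }
    return std::max(size, cfg.min_allowed);  // never go below the minimum
}
```

A request below the minimum is silently raised, which matches the MAX2 call in the original listing.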
The java_start function
The code of java_start is shown below:
// Thread start routine for all newly created threads
static void *java_start(Thread *thread) {
// Try to randomize the cache line index of hot stack frames.
// This helps when threads of the same stack traces evict each other's
// cache lines. The threads can be either from the same JVM instance, or
// from different JVM instances. The benefit is especially true for
// processors with hyperthreading technology.
static int counter = 0;
int pid = os::current_process_id();
alloca(((pid ^ counter++) & 7) * 128);
ThreadLocalStorage::set_thread(thread);
OSThread* osthread = thread->osthread();
Monitor* sync = osthread->startThread_lock();
// non floating stack LinuxThreads needs extra check, see above
if (!_thread_safety_check(thread)) {
// notify parent thread
MutexLockerEx ml(sync, Mutex::_no_safepoint_check_flag);
osthread->set_state(ZOMBIE);
sync->notify_all();
return NULL;
}
// thread_id is kernel thread id (similar to Solaris LWP id)
osthread->set_thread_id(os::Linux::gettid());
if (UseNUMA) {
int lgrp_id = os::numa_get_group_id();
if (lgrp_id != -1) {
thread->set_lgrp_id(lgrp_id);
}
}
// initialize signal mask for this thread
os::Linux::hotspot_sigmask(thread);
// initialize floating point control register
os::Linux::init_thread_fpu_state();
// handshaking with parent thread
{
MutexLockerEx ml(sync, Mutex::_no_safepoint_check_flag);
// notify parent thread
osthread->set_state(INITIALIZED);
sync->notify_all();
// wait until os::start_thread()
while (osthread->get_state() == INITIALIZED) {
sync->wait(Mutex::_no_safepoint_check_flag);
}
}
// call one more level start routine
thread->run();
return 0;
}
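In the new Linux thread, this function mainly does the following:
- Calls the static function ThreadLocalStorage::set_thread to store, in the Linux thread's local storage, a pointer to the JavaThread it belongs to;
- Calls _thread_safety_check to check whether the thread can be started safely. If not, the state of its OSThread is set to ZOMBIE and java_start returns NULL, so the Linux thread gives up on its own;
- Calls os::Linux::gettid to obtain the kernel thread ID of the Linux thread and stores it in the _thread_id field of its OSThread. gettid is implemented with the syscall function and call number 186, which is sys_gettid on x86_64/AMD64;
- Initializes the signal mask and the FPU control state;
- Sets the state of its OSThread to INITIALIZED and notifies the parent thread.
While the state of the OSThread remains INITIALIZED, the Linux thread blocks on the OSThread's start lock, _startThread_lock. Once the state changes to RUNNABLE, it calls the run function of the Thread object that thread points to.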
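The handshake between the creating thread and the new thread can be sketched with the C++ standard library. This is a simplified model, not the Hotspot code: StartGate, child_main and create_and_start are hypothetical names, std::mutex and std::condition_variable stand in for the Monitor, the ZOMBIE path is left out, and the child is joined only so that the sketch terminates cleanly (the real threads are created detached).

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

enum class State { Allocated, Initialized, Runnable };

struct StartGate {               // hypothetical stand-in for OSThread plus its start lock
    std::mutex lock;
    std::condition_variable cv;
    State state = State::Allocated;
};

// Child side, modelled on java_start: announce INITIALIZED, then park until released.
void child_main(StartGate& gate, bool& ran) {
    std::unique_lock<std::mutex> guard(gate.lock);
    gate.state = State::Initialized;
    gate.cv.notify_all();                                     // wake the parent
    gate.cv.wait(guard, [&] { return gate.state != State::Initialized; });
    guard.unlock();
    ran = true;                                               // stands in for thread->run()
}

// Parent side: create the child, wait until it is initialized, then release it.
bool create_and_start() {
    StartGate gate;
    bool ran = false;
    std::thread child(child_main, std::ref(gate), std::ref(ran));
    {
        std::unique_lock<std::mutex> guard(gate.lock);
        gate.cv.wait(guard, [&] { return gate.state != State::Allocated; });
        assert(gate.state == State::Initialized);  // the child is parked at this point
        gate.state = State::Runnable;              // models os::start_thread
    }
    gate.cv.notify_all();
    child.join();
    return ran;
}
```

In Hotspot the two halves of the parent side are separated in time: os::create_thread returns once the child reports INITIALIZED, and the release happens later in os::start_thread.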
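Thread-local storage
Thread-local storage (ThreadLocalStorage, TLS for short) depends on the operating system and processor architecture. The shared class definition and implementation are in hotspot/src/share/vm/runtime/threadLocalStorage.hpp and hotspot/src/share/vm/runtime/threadLocalStorage.cpp. For x86 processors on Linux, the platform-specific definition and implementation are in hotspot/src/os_cpu/linux_x86/vm/threadLS_linux_x86.hpp and hotspot/src/os_cpu/linux_x86/vm/threadLS_linux_x86.cpp.
- The ThreadLocalStorage class has a static variable _thread_index, which every thread uses as the key for looking up the JavaThread pointer that belongs to its Linux thread.
- When the virtual machine starts, the init function initializes _thread_index through the pthread_key_create library function: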
void ThreadLocalStorage::init() {
assert(!is_initialized(),
"More than one attempt to initialize threadLocalStorage");
pd_init();
set_thread_index(os::allocate_thread_local_storage());
generate_code_for_get_thread();
}
int os::allocate_thread_local_storage() {
pthread_key_t key;
int rslt = pthread_key_create(&key, restore_thread_pointer);
assert(rslt == 0, "cannot allocate thread local storage");
return (int)key;
}
- The static function ThreadLocalStorage::set_thread is implemented as follows; it stores the value in TLS through the pthread_setspecific library function:
void ThreadLocalStorage::set_thread(Thread* thread) {
pd_set_thread(thread);
// The following ensure that any optimization tricks we have tried
// did not backfire on us:
guarantee(get_thread() == thread, "must be the same thread, quickly");
guarantee(get_thread_slow() == thread, "must be the same thread, slowly");
}
void ThreadLocalStorage::pd_set_thread(Thread* thread) {
os::thread_local_storage_at_put(ThreadLocalStorage::thread_index(), thread);
}
void os::thread_local_storage_at_put(int index, void* value) {
int rslt = pthread_setspecific((pthread_key_t)index, value);
assert(rslt == 0, "pthread_setspecific failed");
}
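- Thread::current returns the Thread pointer (for a Java thread, the JavaThread) that the current Linux thread belongs to. The related implementation is shown below; this is one of the uses of Linux thread-local storage.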
inline Thread* Thread::current() {
Thread* thread = ThreadLocalStorage::thread();
assert(thread != NULL, "just checking");
return thread;
}
static Thread* thread() {
return (Thread*) os::thread_local_storage_at(thread_index());
}
inline void* os::thread_local_storage_at(int index) {
return pthread_getspecific((pthread_key_t)index);
}
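The three pthread calls used above fit together as in the following sketch. The names ThreadModel, set_current, current and tls_is_per_thread are hypothetical; unlike Hotspot, which creates the key once in ThreadLocalStorage::init, this sketch uses pthread_once so that it is safe to call from any thread, and it registers no destructor for the key.

```cpp
#include <cassert>
#include <pthread.h>

struct ThreadModel { int id; };   // hypothetical stand-in for Thread

static pthread_key_t  g_thread_key;                 // plays the role of _thread_index
static pthread_once_t g_key_once = PTHREAD_ONCE_INIT;

static void make_key() {
    int rc = pthread_key_create(&g_thread_key, nullptr);  // no destructor in this sketch
    assert(rc == 0);
    (void)rc;
}

// Stores the per-thread pointer, like ThreadLocalStorage::set_thread.
void set_current(ThreadModel* t) {
    pthread_once(&g_key_once, make_key);
    int rc = pthread_setspecific(g_thread_key, t);
    assert(rc == 0);
    (void)rc;
}

// Reads the per-thread pointer back, like Thread::current.
ThreadModel* current() {
    pthread_once(&g_key_once, make_key);
    return static_cast<ThreadModel*>(pthread_getspecific(g_thread_key));
}

static void* worker(void* arg) {
    ThreadModel self{2};
    set_current(&self);
    // Each thread sees only the value it stored itself.
    *static_cast<bool*>(arg) = (current() == &self);
    return nullptr;
}

// Returns true when the main thread and a worker each read back their own pointer.
bool tls_is_per_thread() {
    ThreadModel main_model{1};
    set_current(&main_model);
    bool worker_ok = false;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, worker, &worker_ok) != 0) return false;
    pthread_join(tid, nullptr);
    bool ok = worker_ok && current() == &main_model;
    set_current(nullptr);  // do not leave a dangling pointer behind
    return ok;
}
```

The same key yields a different value in each thread, which is what lets Thread::current work without any arguments.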
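The run function
The run function of Hotspot's Thread class is shown below. JavaThread is a subclass of Thread and overrides the base class's run; the override performs additional initialization and finally calls the member function thread_main_inner.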
void Thread::run() {
ShouldNotReachHere();
}
// The first routine called by a new Java thread
void JavaThread::run() {
// initialize thread-local alloc buffer related fields
this->initialize_tlab();
// used to test validitity of stack trace backs
this->record_base_of_stack_pointer();
// Record real stack base and size.
this->record_stack_base_and_size();
// Initialize thread local storage; set before calling MutexLocker
this->initialize_thread_local_storage();
this->create_stack_guard_pages();
this->cache_global_variables();
// Thread is now sufficient initialized to be handled by the safepoint code as being
// in the VM. Change thread state from _thread_new to _thread_in_vm
ThreadStateTransition::transition_and_fence(this, _thread_new, _thread_in_vm);
assert(JavaThread::current() == this, "sanity check");
assert(!Thread::current()->owns_locks(), "sanity check");
DTRACE_THREAD_PROBE(start, this);
// This operation might block. We call that after all safepoint checks for a new thread has
// been completed.
this->set_active_handles(JNIHandleBlock::allocate_block());
if (JvmtiExport::should_post_thread_life()) {
JvmtiExport::post_thread_start(this);
}
EventThreadStart event;
if (event.should_commit()) {
event.set_javalangthread(java_lang_Thread::thread_id(this->threadObj()));
event.commit();
}
// We call another function to do the rest so we are sure that the stack addresses used
// from there will be lower than the stack base just computed
thread_main_inner();
// Note, thread is no longer valid at this point!
}
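In thread_main_inner, JavaThread's entry_point function returns the pointer to the new thread's entry function; as described above, JVM_StartThread passed the entry function pointer thread_entry to the JavaThread constructor.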
void JavaThread::thread_main_inner() {
assert(JavaThread::current() == this, "sanity check");
assert(this->threadObj() != NULL, "just checking");
// Execute thread entry point unless this thread has a pending exception
// or has been stopped before starting.
// Note: Due to JVM_StopThread we can have pending exceptions already!
if (!this->has_pending_exception() &&
!java_lang_Thread::is_stillborn(this->threadObj())) {
{
ResourceMark rm(this);
this->set_native_thread_name(this->get_thread_name());
}
HandleMark hm(this);
this->entry_point()(this, this);
}
DTRACE_THREAD_PROBE(stop, this);
this->exit(false);
delete this;
}
The entry function thread_entry
Back in jvm.cpp, which implements JVM_StartThread: the entry function thread_entry is also defined in this file, and its code is as follows:
static void thread_entry(JavaThread* thread, Thread* __the_thread__) {
HandleMark hm(__the_thread__);
Handle obj(__the_thread__, thread->threadObj());
JavaValue result(T_VOID);
JavaCalls::call_virtual(&result,
obj,
KlassHandle(__the_thread__, SystemDictionary::Thread_klass()),
vmSymbols::run_method_name(),
vmSymbols::void_method_signature(),
__the_thread__);
}
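- thread_entry takes a JavaThread* and a Thread*. In the actual call, this->entry_point()(this, this), both arguments are the JavaThread created in JVM_StartThread.
- The class definition of vmSymbols carries the following comment:
// The class vmSymbols is a name space for fast lookup of
// symbols commonly used in the VM.
//
// Sample usage:
//
// Symbol* obj = vmSymbols::java_lang_Object();
The VM_SYMBOLS_DO macro in vmSymbols.hpp contains the following code:
template(run_method_name, "run")
template(void_method_signature, "()V")
Roughly speaking, vmSymbols::run_method_name() returns the method name run, and vmSymbols::void_method_signature() denotes the method signature "()V". So the JavaCalls::call_virtual line essentially invokes the void run() method on the java.lang.Thread object that the thread->threadObj() handle refers to.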
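As an analogy in plain C++, the combination of a stored entry function pointer and a virtual call on the thread object can be sketched as follows. Every name here (ThreadObject, Worker, NativeThread, thread_entry_model, thread_main_inner_model) is hypothetical, and the functions return a string instead of void purely so that the dispatch is observable.

```cpp
#include <cassert>
#include <string>

struct ThreadObject {                  // hypothetical stand-in for java.lang.Thread
    virtual ~ThreadObject() = default;
    virtual std::string run() { return "Thread.run"; }
};

struct Worker : ThreadObject {         // a user subclass that overrides run()
    std::string run() override { return "Worker.run"; }
};

struct NativeThread;
using ThreadFunction = std::string (*)(NativeThread*);

struct NativeThread {                  // hypothetical stand-in for JavaThread
    ThreadFunction entry_point;        // models _entry_point
    ThreadObject*  thread_obj;         // models _threadObj
};

// Counterpart of thread_entry: dispatch virtually on the stored object.
std::string thread_entry_model(NativeThread* t) {
    assert(t->thread_obj != nullptr);
    return t->thread_obj->run();
}

// Counterpart of the call in thread_main_inner: invoke whatever entry point was stored.
std::string thread_main_inner_model(NativeThread* t) {
    return t->entry_point(t);
}
```

Because the call is virtual, a subclass that overrides run is the one that executes, which is the behavior Java programmers expect from Thread.start.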
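The global thread list
The overview above mentioned that after a new object is created with the JavaThread constructor, the JavaThread is added to the global thread list. This is done by JavaThread's prepare function, whose code is shown below: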
void JavaThread::prepare(jobject jni_thread, ThreadPriority prio) {
assert(Threads_lock->owner() == Thread::current(), "must have threads lock");
// Link Java Thread object <-> C++ Thread
// Get the C++ thread object (an oop) from the JNI handle (a jthread)
// and put it into a new Handle. The Handle "thread_oop" can then
// be used to pass the C++ thread object to other methods.
// Set the Java level thread object (jthread) field of the
// new thread (a JavaThread *) to C++ thread object using the
// "thread_oop" handle.
// Set the thread field (a JavaThread *) of the
// oop representing the java_lang_Thread to the new thread (a JavaThread *).
Handle thread_oop(Thread::current(),
JNIHandles::resolve_non_null(jni_thread));
assert(InstanceKlass::cast(thread_oop->klass())->is_linked(),
"must be initialized");
set_threadObj(thread_oop());
java_lang_Thread::set_thread(thread_oop(), this);
if (prio == NoPriority) {
prio = java_lang_Thread::priority(thread_oop());
assert(prio != NoPriority, "A valid priority should be present");
}
// Push the Java priority down to the native thread; needs Threads_lock
Thread::set_priority(this, prio);
prepare_ext();
// Add the new thread to the Threads list and set it in motion.
// We must have threads lock in order to call Threads::add.
// It is crucial that we do not block before the thread is
// added to the Threads list for if a GC happens, then the java_thread oop
// will not be visited by GC.
Threads::add(this);
}
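The function mainly does the following:
- Wraps the jni_thread parameter (the java.lang.Thread object on which start was called) in a handle;
- set_threadObj stores the oop behind the handle in JavaThread's _threadObj field;
- java_lang_Thread::set_thread stores the JavaThread pointer in the eetop field of the object behind the handle (i.e. the eetop field of java.lang.Thread);
- Threads::add adds the JavaThread at the head of the global thread list, with its _next pointer pointing to the previous head.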
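Head insertion into a singly linked list, as described in the last item, can be sketched like this. Node and ThreadList are hypothetical names; in Hotspot the caller must already hold Threads_lock, whereas this sketch takes its own mutex inside add to stay self-contained.

```cpp
#include <cassert>
#include <mutex>

struct Node {                 // hypothetical stand-in for JavaThread
    int   id;
    Node* next;               // models _next
};

class ThreadList {            // hypothetical stand-in for the Threads class
public:
    // Inserts at the head; the new node's next points at the old head.
    void add(Node* n) {
        assert(n != nullptr);
        std::lock_guard<std::mutex> guard(lock_);  // plays the role of Threads_lock
        n->next = head_;
        head_ = n;
        ++count_;
    }
    Node* first() const { return head_; }
    int size() const { return count_; }

private:
    std::mutex lock_;
    Node* head_ = nullptr;
    int   count_ = 0;
};
```

Inserting at the head takes constant time and needs no traversal, so the most recently started thread is always the first element of the list.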
The Thread::start function
The last step of the overview above is to start the JavaThread with the static method Thread::start, whose code is shown below:
void Thread::start(Thread* thread) {
trace("start", thread);
// Start is different from resume in that its safety is guaranteed by context or
// being called from a Java method synchronized on the Thread object.
if (!DisableStartThread) {
if (thread->is_Java_thread()) {
// Initialize the thread state to RUNNABLE before starting this thread.
// Can not set it after the thread started because we do not know the
// exact thread state at that time. It could be in MONITOR_WAIT or
// in SLEEPING or some other state.
java_lang_Thread::set_thread_status(((JavaThread*)thread)->threadObj(),
java_lang_Thread::RUNNABLE);
}
os::start_thread(thread);
}
}
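- is_Java_thread is a virtual function of the Thread class; JavaThread overrides it to return true. The threadStatus of the java.lang.Thread object on which start was called is updated to RUNNABLE (enum value 5), which is why java.lang.Thread records the thread status as an integer;
- os::start_thread sets the state of the OSThread associated with the JavaThread to RUNNABLE. As mentioned above, after the Linux thread is created it blocks as long as its OSThread is in the INITIALIZED state; only after the state becomes RUNNABLE does it go on to JavaThread::run, which eventually invokes the run method of the java.lang.Thread object.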
void os::start_thread(Thread* thread) {
// guard suspend/resume
MutexLockerEx ml(thread->SR_lock(), Mutex::_no_safepoint_check_flag);
OSThread* osthread = thread->osthread();
osthread->set_state(RUNNABLE);
pd_start_thread(thread);
}
That is the startup process of a Java thread.