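What is thread-local?
Martin Fowler has a classic line in Refactoring: "Any fool can write code that a computer can understand. Good programmers write code that humans can understand."
This shows how much naming matters in a high-level language; the language itself is effectively the comment.
Read literally, thread-local means a variable that is local to a thread.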
package com.sparrow.jdk.threadlocal;
/**
* @author by harry
*/
public class ThreadWithLocal extends Thread{
public ThreadWithLocal(Long t) {
this.t = t;
}
@Override public void run() {
System.out.println(this.t);
}
//thread local: a field that belongs to this thread instance
private Long t;
public static void main(String[] args) {
ThreadWithLocal t=new ThreadWithLocal(System.nanoTime());
t.start();
ThreadWithLocal t2=new ThreadWithLocal(System.nanoTime());
t2.start();
}
}
Each thread here is its own instance. Calling start more than once on the same instance throws Exception in thread "main" java.lang.IllegalThreadStateException
Output
3026460560368
3026460819350
vs
public
class Thread implements Runnable {
... (omitted)
/* ThreadLocal values pertaining to this thread. This map is maintained
* by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;
... (omitted)
}
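Comparing the Thread source with the sample code, the two variables are essentially the same thing. Both can be understood as thread-local variables, that is, member fields of the Thread class (or of a subclass of it).
Why does the JDK implement a separate ThreadLocal class?
From a business point of view, the scenario in the sample code does achieve thread isolation.
There is one situation, though, that is hard to handle this way (in the author's opinion). Martin Fowler once mentioned a concept called the client programmer. This concept matters a great deal, because the author finds it makes programming to an interface easier to understand. Providers of shared frameworks such as the JDK or Tomcat generally do not let users (client programmers) modify their code but do offer extension points, which is the open-closed principle; the JDK's SPI extension points are one example.
The first sample puts the extension in a subclass of Thread. That works in theory, but modifying an already packaged thread class is usually complicated and may break the existing logic. Business code usually runs in a multithreaded context that is transparent to the developer; Tomcat, for instance, is multithreaded. Suppose the following code runs in a multithreaded environment. Spring would normally declare it as a singleton, so its field is a shared variable.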
package com.sparrow.jdk.threadlocal;
/**
* @author by harry
*/
public class MultiThreadShareBusiness {
private Long threadId;
public Long getThreadId() {
return threadId;
}
public void setThreadId(Long threadId) {
this.threadId = threadId;
}
public void business(){
// if the stored value differs from the current thread's id, the code is not thread-safe
if(threadId!=Thread.currentThread().getId()) {
System.out.println(Thread.currentThread().getId() + "-" + threadId);
}
}
}
package com.sparrow.jdk.threadlocal;
/**
* @author by harry
*/
public class ThreadShareObjectTest extends Thread{
private static MultiThreadShareBusiness o=new MultiThreadShareBusiness();
public void run(){
while (true) {
o.setThreadId(Thread.currentThread().getId());
o.business();
}
}
public static void main(String[] args) {
Thread thread=new ThreadShareObjectTest();
thread.start();
Thread thread2=new ThreadShareObjectTest();
thread2.start();
}
}
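Any output means the code is not thread-safe: the shared variable is being modified by both threads at the same time.
There are two ways to make this thread-safe: add a lock, or isolate the variable with ThreadLocal.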
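For comparison with the ThreadLocal approach, here is a minimal sketch of the locking option. The class name `LockedBusiness` and the method `setAndCheck` are hypothetical and not part of the original code. The sketch assumes the write and the check are merged into one `synchronized` method, because locking the setter and `business()` separately would still let another thread slip in between the two calls.

```java
class LockedBusiness {
    private long threadId;   // shared state, guarded by the monitor of "this"
    private int mismatches;  // times a thread saw an id that was not its own

    /** Writes the id and checks it inside one critical section. */
    synchronized void setAndCheck() {
        threadId = Thread.currentThread().getId();
        if (threadId != Thread.currentThread().getId()) {
            mismatches++;
        }
    }

    synchronized int mismatches() {
        return mismatches;
    }

    /** Hammers one shared instance from several threads and returns the mismatch count. */
    static int run(int threads, int iterations) {
        LockedBusiness shared = new LockedBusiness();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    shared.setAndCheck();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.mismatches();
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + run(4, 100_000));
    }
}
```

The lock keeps the field consistent but serializes the threads, whereas the ThreadLocal version that follows gives each thread its own copy and needs no lock.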
package com.sparrow.jdk.threadlocal;
/**
* @author by harry
*/
public class MultiThreadLocalBusiness {
public static void main(String[] args) {
MultiThreadLocalBusiness m=new MultiThreadLocalBusiness();
m.setThreadId(1L);
m.business();
}
private ThreadLocal<Long> threadId = new ThreadLocal<>();
public void setThreadId(Long threadId) {
this.threadId.set(threadId);
}
public void business() {
ThreadLocal<Long> t=this.threadId;
if (t.get() != Thread.currentThread().getId()) {
System.out.println(Thread.currentThread().getId() + "-" + t.get());
}
}
}
package com.sparrow.jdk.threadlocal;
/**
* @author by harry
*/
public class ThreadLocalTest extends Thread{
private static MultiThreadLocalBusiness o=new MultiThreadLocalBusiness();
public void run(){
while (true) {
o.setThreadId(Thread.currentThread().getId());
o.business();
}
}
public static void main(String[] args) {
Thread thread=new ThreadLocalTest();
thread.start();
Thread thread2=new ThreadLocalTest();
thread2.start();
}
}
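After switching to a ThreadLocal variable there is no output, which shows that the isolation works.
A thread-local value is essentially a value in a map held by the Thread class, keyed by the ThreadLocal object, which is what lets code outside Thread extend it.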
ThreadLocal.ThreadLocalMap threadLocals = null
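To make the "value in a per-thread map" idea concrete, here is a simplified model. It is not the JDK implementation: `SketchThread` and `SketchLocal` are hypothetical stand-ins for Thread and ThreadLocal, and the map is a plain HashMap with strong keys rather than the JDK's weak-keyed ThreadLocalMap.

```java
import java.util.HashMap;
import java.util.Map;

class PerThreadMapSketch {
    /** Stand-in for Thread: every thread owns a private map. */
    static class SketchThread extends Thread {
        final Map<SketchLocal<?>, Object> locals = new HashMap<>();

        SketchThread(Runnable task) {
            super(task);
        }
    }

    /** Stand-in for ThreadLocal: the object itself is only the lookup key. */
    static class SketchLocal<T> {
        void set(T value) {
            current().locals.put(this, value);
        }

        @SuppressWarnings("unchecked")
        T get() {
            return (T) current().locals.get(this);
        }

        // Assumption of this sketch: callers always run on a SketchThread.
        private static SketchThread current() {
            return (SketchThread) Thread.currentThread();
        }
    }

    /** Returns what each of two threads reads from the same SketchLocal. */
    static String[] demo() {
        SketchLocal<String> name = new SketchLocal<>();
        String[] seen = new String[2];
        runToCompletion(new SketchThread(() -> {
            name.set("first");
            seen[0] = name.get();
        }));
        runToCompletion(new SketchThread(() -> {
            seen[1] = name.get(); // this thread never called set
        }));
        return seen;
    }

    private static void runToCompletion(Thread thread) {
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0] + " / " + seen[1]);
    }
}
```

Both threads use the same `SketchLocal` object, yet the second thread reads null because the lookup goes through its own map.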
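Preconditions for thread isolation
- The object to isolate must be a shared variable, because variables on the stack (local variables) are already isolated.
- The thread is shared and usually lives as long as the process.
Thread isolation with ThreadLocal is only meaningful when both conditions hold.
Class diagram and source analysis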
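The second precondition is easiest to see with a thread pool. The sketch below, with the hypothetical names `PooledThreadReuse` and `CONTEXT`, runs two unrelated tasks on a single-thread executor. Because the worker thread is reused, the value set by the first task is still visible to the second.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadReuse {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Runs two tasks on the same pooled thread and returns what the second one reads. */
    static String valueSeenBySecondTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstTask = () -> CONTEXT.set("request-1"); // never removed
            Callable<String> secondTask = CONTEXT::get;
            pool.submit(firstTask).get();         // wait until the first task is done
            return pool.submit(secondTask).get(); // same worker thread, same map
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("second task sees: " + valueSeenBySecondTask());
    }
}
```

The second task reads "request-1" even though it never set anything, which is why values bound on long-lived shared threads need explicit cleanup.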
static class ThreadLocalMap {
/**
* The entries in this hash map extend WeakReference, using
* its main ref field as the key (which is always a
* ThreadLocal object). Note that null keys (i.e. entry.get()
* == null) mean that the key is no longer referenced, so the
* entry can be expunged from table. Such entries are referred to
* as "stale entries" in the code that follows.
*/
static class Entry extends WeakReference<ThreadLocal<?>> {
/** The value associated with this ThreadLocal. */
Object value;
Entry(ThreadLocal<?> k, Object v) {
super(k);
value = v;
}
}
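- Law of Demeter
ThreadLocalMap is a nested class of ThreadLocal with no access modifier, so it is visible only inside its package.
The Law of Demeter, also called the Least Knowledge Principle (LKP), says an object should know as little as possible about other objects: don't talk to strangers.
More design principles: https://www.jianshu.com/p/3f7628e2e796
- The entries of the thread local map are weak references.
Image from https://www.jianshu.com/p/a1cd61fa22da
From the thread local map source and the class diagram we can derive the following object reference diagram.
Start with the current thread ref on the stack. Every running thread has a stack, so this reference certainly exists.
Is the threadlocal ref always on the stack as well?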
javap -v com.sparrow.jdk.threadlocal.MultiThreadLocalBusiness
For the load and store instructions, see the Java Virtual Machine Specification
https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-2.html#jvms-2.11.2
public void business() {
// the field is read directly, with no local variable
if (threadId.get() != Thread.currentThread().getId()) {
System.out.println(Thread.currentThread().getId() + "-" + threadId.get());
}
}
public void business();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=4, locals=1, args_size=1
0: aload_0 // push this; note this is the current object, not the thread local reference
1: getfield #4 // Field threadId:Ljava/lang/ThreadLocal; // read the threadId field
4: invokevirtual #11 // Method java/lang/ThreadLocal.get:()Ljava/lang/Object; // invoke get
public void business() {
// after copying the field into a local variable
ThreadLocal<Long> t=this.threadId;
if (t.get() != Thread.currentThread().getId()) {
System.out.println(Thread.currentThread().getId() + "-" + t.get());
}
}
public void business();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=4, locals=2, args_size=1
0: aload_0
1: getfield #4 // Field threadId:Ljava/lang/ThreadLocal;
4: astore_1
5: aload_1 // push the thread local reference from local variable 1
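The disassembly confirms that the thread local ref can live on the stack.
Reference types (the following definitions are taken from Understanding the Java Virtual Machine by Zhou Zhiming)
Strong reference
Strong references are the ordinary references found throughout program code, such as "Object obj=new Object()". As long as a strong reference exists (the object is reachable from a GC root), the garbage collector never reclaims the referenced object.
Soft reference
Soft references describe objects that are still useful but not essential. Before the system would throw an out-of-memory error, softly referenced objects are added to the collection set for a second collection pass. Only if that pass still does not free enough memory is the out-of-memory error thrown. Typical use: caches.
Weak reference
Weak references also describe non-essential objects, but they are weaker than soft references: an object that is only weakly referenced survives only until the next garbage collection. When the collector runs, it reclaims objects reachable only through weak references, whether or not memory is short.
Phantom reference
A phantom reference is the weakest kind. It cannot be used to obtain the object; its only purpose is to deliver a notification when the object is collected.
The source shows that the key of the thread local map is a weak reference to the thread local object. Let us verify the definitions above with code.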
package com.sparrow.jdk.threadlocal;
import com.sparrow.jdk.volatilekey.User;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
/**
* Created by harry on 2018/4/12.
*/
public class TestWeakReference {
static WeakReference<User> user = new WeakReference<User>(new User(100, new byte[10000]));
public static void main(String[] args) {
int i = 0;
while (true) {
//User u=user.get();
if (user.get() != null) {
i++;
System.out.println("Object is alive for " + i + " loops - ");
} else {
System.out.println("Object has been collected.");
break;
}
// by definition, a weakly referenced object is released on GC regardless of available memory
System.gc();
}
}
}
Output
Object is alive for 1 loops -
Object has been collected.
The object was collected, which looks fine. But think about it: if the key inside a thread local were released on every GC, our programs would throw NullPointerException. Why don't they?
package com.sparrow.jdk.threadlocal;
/**
* @author by harry
*/
public class ThreadLocalGc {
// note: this is a ThreadLocal, not a WeakReference
private static ThreadLocal<String> s=new ThreadLocal<>();
public static void main(String[] args) {
s.set("hello");
System.out.println(s.get());
System.gc();
System.out.println(s.get());
}
}
The program runs normally and no NullPointerException is thrown.
Let us change the code a little.
package com.sparrow.jdk.threadlocal;
import com.sparrow.jdk.volatilekey.User;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
/**
* Created by harry on 2018/4/12.
*/
public class TestWeakReference {
static WeakReference<User> user = new WeakReference<User>(new User(100, new byte[10000]));
public static void main(String[] args) {
int i = 0;
while (true) {
// uncommented: hold the weakly referenced object through a strong reference
User u=user.get();
if (user.get() != null) {
i++;
System.out.println("Object is alive for " + i + " loops - ");
} else {
System.out.println("Object has been collected.");
break;
}
System.gc();
}
}
}
Output
Object is alive for 1 loops -
Object is alive for 2 loops -
...
...
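The loop never ends, which shows that the weakly referenced object was not collected. So the definition above needs to be read carefully: only a weakly referenced object with no strong reference to it is collected by GC.
Memory leaks with thread local
Based on the diagram above, first consider the conditions for a leak:
- The thread has not died.
Once the thread dies, the reference to its thread local map is cut and the thread local object is cut off with it, so the objects are certain to be collected and nothing can leak.
- The reference to the thread local has been collected, so the key becomes null.
- No get or set is performed after the key becomes null, because get and set can clear entries whose key is null.
In addition, if the values are neither large nor numerous, the leak is barely noticeable.
Large values, however, can lead to an out-of-memory error (OOM), so call remove explicitly once the value is no longer needed to avoid the leak.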
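One common way to follow the "call remove explicitly" advice is to bind the value for the duration of a block and clear it in `finally`. The following is a minimal sketch of that idiom; `ScopedValueHolder`, `BUFFER`, `withBuffer` and `isBound` are hypothetical names chosen for illustration.

```java
import java.util.function.Supplier;

class ScopedValueHolder {
    private static final ThreadLocal<byte[]> BUFFER = new ThreadLocal<>();

    /** Binds a buffer for the duration of body and always removes it afterwards. */
    static <R> R withBuffer(int size, Supplier<R> body) {
        BUFFER.set(new byte[size]);
        try {
            return body.get();
        } finally {
            BUFFER.remove(); // runs even if body throws
        }
    }

    /** True while the current thread has a buffer bound. */
    static boolean isBound() {
        return BUFFER.get() != null;
    }

    public static void main(String[] args) {
        Boolean inside = withBuffer(1024, ScopedValueHolder::isBound);
        System.out.println("bound inside: " + inside + ", bound after: " + isBound());
    }
}
```

Because the cleanup sits in `finally`, the entry is removed from the current thread's map no matter how the block exits, so the value does not depend on the heuristic cleanup performed by later get or set calls.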
package com.sparrow.jdk.threadlocal;
/**
* Created by harry on 2018/4/12.
*/
public class ThreadLocalGCLeak extends Thread {
public static class MyThreadLocal extends ThreadLocal<Object> {
private byte[] a = new byte[1024 * 1024 * 1];
@Override
public void finalize() {
System.out.println("threadlocal object collected by GC.");
}
}
public static class MyBigObject {// a large object that occupies memory
private byte[] a = new byte[1024 * 1024 * 50];
@Override
public void finalize() {
System.out.println("50 MB object collected by GC.");
}
}
public static void main(String[] args) throws InterruptedException {
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
ThreadLocal tl = new MyThreadLocal();
tl.set(new MyBigObject());
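//tl=null;// cut the ThreadLocal reference; the author has not found another way to get the thread local object collected first
System.out.println("Full GC 1");
System.gc();
// during testing, simulate a thread that keeps running
//while (true){}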
}
});
thread.setDaemon(false);
thread.start();
System.out.println("Full GC 2");
System.gc();
Thread.sleep(1000);
System.out.println("Full GC 3");
System.gc();
Thread.sleep(1000);
System.out.println("Full GC 4");
System.gc();
}
}
/**
* Heuristically scan some cells looking for stale entries.
* This is invoked when either a new element is added, or
* another stale one has been expunged. It performs a
* logarithmic number of scans, as a balance between no
* scanning (fast but retains garbage) and a number of scans
* proportional to number of elements, that would find all
* garbage but would cause some insertions to take O(n) time.
*
* @param i a position known NOT to hold a stale entry. The
* scan starts at the element after i.
*
* @param n scan control: {@code log2(n)} cells are scanned,
* unless a stale entry is found, in which case
* {@code log2(table.length)-1} additional cells are scanned.
* When called from insertions, this parameter is the number
* of elements, but when from replaceStaleEntry, it is the
* table length. (Note: all this could be changed to be either
* more or less aggressive by weighting n instead of just
* using straight log n. But this version is simple, fast, and
* seems to work well.)
*
* @return true if any stale entries have been removed.
*/
private boolean cleanSomeSlots(int i, int n) {
boolean removed = false;
Entry[] tab = table;
int len = tab.length;
do {
i = nextIndex(i, len);
Entry e = tab[i];
if (e != null && e.get() == null) {
n = len;
removed = true;
i = expungeStaleEntry(i);
}
} while ( (n >>>= 1) != 0);
return removed;
}
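Note: the cleanup triggered by set is heuristic, so a leak is still possible. Always call remove explicitly when the value is no longer needed.
Why use weak references
The examples above show that, with a strong key or a weak key alike, the value leaks if remove is never called (as long as the thread is alive).
A weak reference at least guarantees that the key can be collected.
With a strong reference, the threadlocal would have two strong references pointing to it (the one on the stack and the key in the thread local map). Even after the stack reference is cut by setting it to null, the key in the thread local map still points to it strongly, so the thread local is never collected. With a weak reference, once the strong reference on the stack is cut, no strong reference to the threadlocal object remains, and it is collected at the next GC.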
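To get a feel for how small the heuristic scan is, the sketch below reproduces only the loop control of `cleanSomeSlots`, `do { ... } while ((n >>>= 1) != 0)`, and counts the iterations for the case where no stale entry is found. `ScanBudget` and `cellsScanned` are hypothetical names; the real method additionally resets n to the table length whenever it finds a stale entry.

```java
class ScanBudget {
    /** Counts the cells inspected by one pass that finds nothing stale. */
    static int cellsScanned(int n) {
        int scanned = 0;
        do {
            scanned++; // the real loop inspects one table cell here
        } while ((n >>>= 1) != 0);
        return scanned;
    }

    public static void main(String[] args) {
        System.out.println("n=16 -> " + cellsScanned(16) + " cells");
        System.out.println("n=64 -> " + cellsScanned(64) + " cells");
    }
}
```

With n = 16 the loop inspects 5 cells and with n = 64 it inspects 7, so only a logarithmic slice of the table is checked per call. Stale entries outside that slice stay in place until a later operation happens to reach them.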
http://www.cnblogs.com/onlywujun/p/3524675.html
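Optimizing thread local
The thread local map is essentially a hash map, and a hash map lookup costs O(1)+O(m) (m<n, where n is the size of the map and m is the number of extra probes after a collision).
The optimization is therefore to bring the lookup down to O(1) by replacing the hash map with an array; the Dubbo and Netty sources both optimize thread local this way.
Netty source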
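The array idea can be sketched in a few lines. This is a simplified illustration, not the Netty or Dubbo code: `IndexedLocalSketch`, `SlotThread` and `IndexedLocal` are hypothetical names, and the sketch assumes every caller runs on a `SlotThread`, with no fallback for ordinary threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class IndexedLocalSketch {
    private static final Object UNSET = new Object();
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();

    /** A thread that carries a plain array instead of a hash map. */
    static class SlotThread extends Thread {
        private Object[] slots = newSlots(4);

        SlotThread(Runnable task) {
            super(task);
        }
    }

    /** Each instance gets a unique array index at construction time. */
    static class IndexedLocal<V> {
        private final int index = NEXT_INDEX.getAndIncrement();

        void set(V value) {
            SlotThread t = current();
            if (index >= t.slots.length) { // grow and mark the new cells as unset
                int oldLength = t.slots.length;
                t.slots = Arrays.copyOf(t.slots, Math.max(index + 1, oldLength * 2));
                Arrays.fill(t.slots, oldLength, t.slots.length, UNSET);
            }
            t.slots[index] = value;
        }

        @SuppressWarnings("unchecked")
        V get() {
            Object[] slots = current().slots;
            Object v = index < slots.length ? slots[index] : UNSET;
            return v == UNSET ? null : (V) v; // direct index, no hashing or probing
        }
    }

    private static Object[] newSlots(int size) {
        Object[] slots = new Object[size];
        Arrays.fill(slots, UNSET);
        return slots;
    }

    // Assumption of this sketch: throws ClassCastException on any other thread type.
    private static SlotThread current() {
        return (SlotThread) Thread.currentThread();
    }

    /** Returns {sum read back by the writer thread, values visible to another thread}. */
    static int[] demo() {
        List<IndexedLocal<Integer>> locals = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            locals.add(new IndexedLocal<>());
        }
        int[] result = new int[2];
        runToCompletion(new SlotThread(() -> {
            for (int i = 0; i < locals.size(); i++) {
                locals.get(i).set(i);
            }
            for (IndexedLocal<Integer> local : locals) {
                result[0] += local.get();
            }
        }));
        runToCompletion(new SlotThread(() -> {
            for (IndexedLocal<Integer> local : locals) {
                if (local.get() != null) {
                    result[1]++;
                }
            }
        }));
        return result;
    }

    private static void runToCompletion(Thread thread) {
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

The trade-off is that indexes are never reused in this sketch, so the array grows with the total number of `IndexedLocal` instances ever created, in exchange for a lookup that is a single array access.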
io.netty.util.concurrent.FastThreadLocal
private final int index;
private final int cleanerFlagIndex;
public FastThreadLocal() {
index = InternalThreadLocalMap.nextVariableIndex();// initialize index
cleanerFlagIndex = InternalThreadLocalMap.nextVariableIndex();
}
public final V get() {
InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
Object v = threadLocalMap.indexedVariable(index);
if (v != InternalThreadLocalMap.UNSET) {
return (V) v;
}
V value = initialize(threadLocalMap);
registerCleaner(threadLocalMap);
return value;
}
Dubbo source
org.apache.dubbo.common.threadlocal.InternalThreadLocal
public class InternalThreadLocal<V> {
private static final int variablesToRemoveIndex = InternalThreadLocalMap.nextVariableIndex();
private final int index;
public InternalThreadLocal() {
index = InternalThreadLocalMap.nextVariableIndex();
}
/**
* Returns the current value for the current thread
*/
@SuppressWarnings("unchecked")
public final V get() {
InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
Object v = threadLocalMap.indexedVariable(index);
if (v != InternalThreadLocalMap.UNSET) {
return (V) v;
}
return initialize(threadLocalMap);
}
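Full GC risk
When a ThreadLocal is created as a temporary variable, its key is collected while a large number of values cannot be released, which can lead to full GC.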