[Netty Study Notes 9] How FastThreadLocal Works

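This post looks at how FastThreadLocal works. The JDK already ships its own ThreadLocal class, so why does Netty provide FastThreadLocal? As the name suggests, it is meant to be faster than ThreadLocal. How does it achieve that? Start with the Javadoc of FastThreadLocal: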

/**
 * A special variant of {@link ThreadLocal} that yields higher access performance when accessed from a
 * {@link FastThreadLocalThread}.
 * <p>
 * Internally, a {@link FastThreadLocal} uses a constant index in an array, instead of using hash code and hash table,
 * to look for a variable.  Although seemingly very subtle, it yields slight performance advantage over using a hash
 * table, and it is useful when accessed frequently.
 * </p><p>
 * To take advantage of this thread-local variable, your thread must be a {@link FastThreadLocalThread} or its subtype.
 * By default, all threads created by {@link DefaultThreadFactory} are {@link FastThreadLocalThread} due to this reason.
 * </p><p>
 * Note that the fast path is only possible on threads that extend {@link FastThreadLocalThread}, because it requires
 * a special field to store the necessary state.  An access by any other kind of thread falls back to a regular
 * {@link ThreadLocal}.
 * </p>
 */

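The Javadoc is fairly clear. The JDK stores ThreadLocal values in a ThreadLocalMap, a hash table that resolves key collisions with linear probing. FastThreadLocal is backed by a plain array in which each FastThreadLocal owns a fixed index, so access is naturally faster than with ThreadLocal. The difference shows mainly in two situations: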

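  1. With many keys, lookups by hashing plus linear probing slow down as collisions increase;
  2. Under frequent access, an array benefits from contiguous storage and CPU caching: reading slot 1 pulls slot 1 and the next few slots into the cache, so a later read of slot 2 does not have to go to slower main memory.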
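
To make the index-versus-hash contrast concrete, here is a minimal sketch in plain Java. It is not Netty's code: the class name IndexedLocalSketch, the fixed 32-slot array, and the use of a regular ThreadLocal to hold the per-thread array are all assumptions made to keep the example self-contained. It only illustrates the idea that each variable receives a constant index once and then reads or writes its slot directly.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Netty code: each variable owns a fixed slot in a per-thread array.
final class IndexedLocalSketch<T> {
    // Global counter that hands out one constant index per variable.
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();
    // Per-thread slot table; a plain ThreadLocal is used only to keep the sketch self-contained.
    private static final ThreadLocal<Object[]> SLOTS = ThreadLocal.withInitial(() -> new Object[32]);

    private final int index = NEXT_INDEX.getAndIncrement();

    int index() {
        return index;
    }

    void set(T value) {
        // Assumes fewer than 32 variables exist; this sketch never grows the array.
        SLOTS.get()[index] = value;
    }

    @SuppressWarnings("unchecked")
    T get() {
        // No hashing and no probing: a single array read at a known index.
        return (T) SLOTS.get()[index];
    }

    public static void main(String[] args) {
        IndexedLocalSketch<String> a = new IndexedLocalSketch<>();
        IndexedLocalSketch<Integer> b = new IndexedLocalSketch<>();
        a.set("abc");
        b.set(42);
        System.out.println(a.get() + " " + b.get());
    }
}
```

Because the index is assigned once per variable, two variables can never collide, which is what removes the need for collision handling.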
Note that FastThreadLocal only delivers its speed advantage when used from a FastThreadLocalThread. Now for the implementation, starting with an example:
import io.netty.util.concurrent.FastThreadLocal;

public class FastThreadLocalTest {
    private static final FastThreadLocal<String> threadLocal = new FastThreadLocal<>();

    public static void main(String[] args) {
        // The main thread is not a FastThreadLocalThread, so this run takes the fallback path.
        set();
        System.out.println(get());
    }

    private static String get() {
        return threadLocal.get();
    }

    private static void set() {
        threadLocal.set("abc");
    }
}

Start with the set method:

public final void set(V value) {
        if (value != InternalThreadLocalMap.UNSET) {
            InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
            setKnownNotUnset(threadLocalMap, value);
        } else {
            remove();
        }
    }

If value != InternalThreadLocalMap.UNSET, the InternalThreadLocalMap is fetched first:

public static InternalThreadLocalMap get() {
        Thread thread = Thread.currentThread();
        if (thread instanceof FastThreadLocalThread) {
            return fastGet((FastThreadLocalThread) thread);
        } else {
            return slowGet();
        }
    }

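If the current thread is a FastThreadLocalThread, fastGet is used; otherwise slowGet. As its name says, slowGet is the slower path, which matches the Javadoc's point that the speed advantage only exists on a FastThreadLocalThread. Look at slowGet first: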

private static InternalThreadLocalMap slowGet() {
        // slowThreadLocalMap is a plain JDK ThreadLocal<InternalThreadLocalMap>
        ThreadLocal<InternalThreadLocalMap> slowThreadLocalMap = UnpaddedInternalThreadLocalMap.slowThreadLocalMap;
        InternalThreadLocalMap ret = slowThreadLocalMap.get();
        if (ret == null) {
            ret = new InternalThreadLocalMap();
            slowThreadLocalMap.set(ret);
        }
        return ret;
    }

static final ThreadLocal<InternalThreadLocalMap> slowThreadLocalMap = new ThreadLocal<InternalThreadLocalMap>();

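slowGet keeps the InternalThreadLocalMap in a JDK ThreadLocal, and the InternalThreadLocalMap in turn holds the value. The reason it is slower is clear: every access first goes through a regular ThreadLocal lookup to obtain the InternalThreadLocalMap before anything else can happen.
Now look at fastGet: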

private static InternalThreadLocalMap fastGet(FastThreadLocalThread thread) {
        InternalThreadLocalMap threadLocalMap = thread.threadLocalMap();
        if (threadLocalMap == null) {
            thread.setThreadLocalMap(threadLocalMap = new InternalThreadLocalMap());
        }
        return threadLocalMap;
    }
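
The two lookup paths, slowGet and fastGet, can be put side by side in a small stand-alone sketch. Everything here is hypothetical and only mirrors the shape of the Netty code: LocalMap stands in for InternalThreadLocalMap, FastThread for FastThreadLocalThread, and SLOW for slowThreadLocalMap.

```java
// Hypothetical sketch, not Netty code: a field on a Thread subclass versus a ThreadLocal fallback.
final class PathSketch {
    // Stand-in for InternalThreadLocalMap.
    static final class LocalMap {
        final Object[] slots = new Object[32];
    }

    // Stand-in for FastThreadLocalThread: the map lives in an ordinary field.
    static class FastThread extends Thread {
        LocalMap map;

        FastThread(Runnable task) {
            super(task);
        }
    }

    // Stand-in for slowThreadLocalMap, used by every other kind of thread.
    private static final ThreadLocal<LocalMap> SLOW = new ThreadLocal<>();

    static LocalMap get() {
        Thread t = Thread.currentThread();
        if (t instanceof FastThread) {
            // Fast path: one field read on the current thread object.
            FastThread ft = (FastThread) t;
            if (ft.map == null) {
                ft.map = new LocalMap();
            }
            return ft.map;
        }
        // Slow path: a hash lookup in the JDK ThreadLocalMap comes first.
        LocalMap m = SLOW.get();
        if (m == null) {
            m = new LocalMap();
            SLOW.set(m);
        }
        return m;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(get() == get()); // repeated calls on one thread return the same map
        LocalMap[] seen = new LocalMap[1];
        FastThread ft = new FastThread(() -> seen[0] = get());
        ft.start();
        ft.join();
        System.out.println(seen[0] == ft.map); // the fast path used the thread's own field
    }
}
```

The fast path needs a dedicated field on the thread class, which is why only subclasses of the special thread type can use it.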

fastGet reads the threadLocalMap field of the FastThreadLocalThread, creating and storing a new one if it is still null. Next, look at InternalThreadLocalMap:

private InternalThreadLocalMap() {
        super(newIndexedVariableTable());
    }

    private static Object[] newIndexedVariableTable() {
        Object[] array = new Object[INDEXED_VARIABLE_TABLE_INITIAL_SIZE];
        Arrays.fill(array, UNSET);
        return array;
    }

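InternalThreadLocalMap is the underlying storage of FastThreadLocal. Unlike ThreadLocalMap, which is a hash table, InternalThreadLocalMap uses a plain array. Its initial size is 32, and every slot is filled with the special UNSET object.
Next, setKnownNotUnset: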

private void setKnownNotUnset(InternalThreadLocalMap threadLocalMap, V value) {
        if (threadLocalMap.setIndexedVariable(index, value)) {
            addToVariablesToRemove(threadLocalMap, this);
        }
    }

public boolean setIndexedVariable(int index, Object value) {
        Object[] lookup = indexedVariables;
        if (index < lookup.length) {
            Object oldValue = lookup[index];
            lookup[index] = value;
            return oldValue == UNSET;
        } else {
            // index is out of bounds: grow the array, then store the value
            expandIndexedVariableTableAndSet(index, value);
            return true;
        }
    }
private void expandIndexedVariableTableAndSet(int index, Object value) {
        // Grow to the smallest power of two greater than index (similar to how HashMap rounds its capacity to a power of two)
        Object[] oldArray = indexedVariables;
        final int oldCapacity = oldArray.length;
        int newCapacity = index;
        newCapacity |= newCapacity >>>  1;
        newCapacity |= newCapacity >>>  2;
        newCapacity |= newCapacity >>>  4;
        newCapacity |= newCapacity >>>  8;
        newCapacity |= newCapacity >>> 16;
        newCapacity ++;
        // copy the old elements into the new array and fill the rest with UNSET
        Object[] newArray = Arrays.copyOf(oldArray, newCapacity);
        Arrays.fill(newArray, oldCapacity, newArray.length, UNSET);
        newArray[index] = value;
        indexedVariables = newArray;
    }

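Start with setIndexedVariable. It reads the array; if index is within bounds (index < lookup.length) it writes the value directly, and otherwise (index >= lookup.length) it grows the array and then stores the value.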
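
The capacity computation in expandIndexedVariableTableAndSet can be checked in isolation. The sketch below copies only the bit-smearing steps into a hypothetical helper named newCapacity; the class name and the sample inputs are assumptions for illustration.

```java
// Hypothetical helper that isolates the capacity calculation from the expansion method.
final class CapacitySketch {
    // Returns the smallest power of two strictly greater than index.
    // Valid for 0 <= index < 2^30; larger values would overflow int.
    static int newCapacity(int index) {
        int c = index;
        // Smear the highest set bit into every lower position ...
        c |= c >>> 1;
        c |= c >>> 2;
        c |= c >>> 4;
        c |= c >>> 8;
        c |= c >>> 16;
        // ... then add one to reach the next power of two.
        return c + 1;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(32));  // 64
        System.out.println(newCapacity(33));  // 64
        System.out.println(newCapacity(100)); // 128
    }
}
```

With an old capacity of 32, an index of 100 yields 128, so the new size is driven by the index rather than being a fixed doubling of the old size.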
Next, addToVariablesToRemove, which is called only when setIndexedVariable returns true, meaning a value was inserted rather than overwritten:

private static void addToVariablesToRemove(InternalThreadLocalMap threadLocalMap, FastThreadLocal<?> variable) {
        Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);
        Set<FastThreadLocal<?>> variablesToRemove;
        if (v == InternalThreadLocalMap.UNSET || v == null) {
            // create an identity-based Set (backed by an IdentityHashMap) and store it at variablesToRemoveIndex (slot 0)
            variablesToRemove = Collections.newSetFromMap(new IdentityHashMap<FastThreadLocal<?>, Boolean>());
            threadLocalMap.setIndexedVariable(variablesToRemoveIndex, variablesToRemove);
        } else {
            variablesToRemove = (Set<FastThreadLocal<?>>) v;
        }
        // add this FastThreadLocal to variablesToRemove
        variablesToRemove.add(variable);
    }

The FastThreadLocal is recorded in variablesToRemove (a Set) so that every variable set on the thread can be removed quickly later; see the removeAll method:

removeAll runs when a FastThreadLocalThread finishes executing:
public static void removeAll() {
        InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.getIfSet();
        if (threadLocalMap == null) {
            return;
        }

        try {
            Object v = threadLocalMap.indexedVariable(variablesToRemoveIndex);
            if (v != null && v != InternalThreadLocalMap.UNSET) {
                @SuppressWarnings("unchecked")
                Set<FastThreadLocal<?>> variablesToRemove = (Set<FastThreadLocal<?>>) v;
                FastThreadLocal<?>[] variablesToRemoveArray =
                        variablesToRemove.toArray(new FastThreadLocal[0]);
                for (FastThreadLocal<?> tlv: variablesToRemoveArray) {
                    tlv.remove(threadLocalMap);
                }
            }
        } finally {
            InternalThreadLocalMap.remove();
        }
    }
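
The bookkeeping shared by addToVariablesToRemove and removeAll can be sketched without Netty. In this hypothetical version, Var stands in for FastThreadLocal and the tracking set is an ordinary field, whereas the code above stores it inside the array at variablesToRemoveIndex; the names and the fixed 8-slot array are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical sketch, not Netty code: track which variables hold a value so all can be cleared at once.
final class RemovalSketch {
    static final Object UNSET = new Object();

    // Stand-in for FastThreadLocal: just a fixed slot index.
    static final class Var {
        final int index;

        Var(int index) {
            this.index = index;
        }
    }

    private final Object[] slots = new Object[8];
    // Identity-based set: membership is decided by reference, not equals().
    private final Set<Var> toRemove = Collections.newSetFromMap(new IdentityHashMap<Var, Boolean>());

    RemovalSketch() {
        Arrays.fill(slots, UNSET);
    }

    void set(Var v, Object value) {
        boolean inserted = slots[v.index] == UNSET;
        slots[v.index] = value;
        if (inserted) {
            toRemove.add(v); // record only first-time inserts, not overwrites
        }
    }

    Object get(Var v) {
        return slots[v.index];
    }

    int tracked() {
        return toRemove.size();
    }

    void removeAll() {
        // Iterate over a snapshot because the loop body mutates the set.
        for (Var v : toRemove.toArray(new Var[0])) {
            slots[v.index] = UNSET;
            toRemove.remove(v);
        }
    }
}
```

Copying the set into an array first, as removeAll does with toArray, avoids modifying a collection while iterating over it.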

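That covers the set method of FastThreadLocal. It is worth mentioning that older versions of FastThreadLocal also used an ObjectCleaner to address the memory leak that arises when a thread that is not a FastThreadLocalThread goes through the JDK ThreadLocal fallback. Newer versions removed that logic; the reasons are given at https://github.com/netty/netty/commit/5b1fe611a637c362a60b391079fff73b1a4ef912#diff-e0eb4e9a6ea15564e4ddd076c55978de, so it is not covered here.
Now the get method: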

public final V get() {
        InternalThreadLocalMap threadLocalMap = InternalThreadLocalMap.get();
        Object v = threadLocalMap.indexedVariable(index);
        if (v != InternalThreadLocalMap.UNSET) {
            return (V) v;
        }

        return initialize(threadLocalMap);
    }
public Object indexedVariable(int index) {
        Object[] lookup = indexedVariables;
        return index < lookup.length? lookup[index] : UNSET;
    }

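If the slot at index holds a value other than UNSET, it is returned directly. Otherwise initialize is called: it invokes initialValue(), which returns null unless overridden, stores the result in the slot, and returns it.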
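
The role of the UNSET sentinel in get can be illustrated with a minimal sketch. It is not Netty's code: LazyGetSketch models a single slot per instance rather than per thread, and the Supplier plays the part of initialValue(); both are assumptions made for brevity. The point is that null is a legitimate stored value, so a separate sentinel is needed to mean "never initialized".

```java
import java.util.function.Supplier;

// Hypothetical sketch, not Netty code: lazy initialization guarded by an UNSET sentinel.
final class LazyGetSketch<T> {
    private static final Object UNSET = new Object();

    private final Supplier<T> initialValue; // stand-in for the overridable initialValue()
    private Object slot = UNSET;            // stand-in for indexedVariables[index]
    private int initCalls;

    LazyGetSketch(Supplier<T> initialValue) {
        this.initialValue = initialValue;
    }

    @SuppressWarnings("unchecked")
    T get() {
        Object v = slot;
        if (v != UNSET) {
            return (T) v; // already initialized, even if the stored value is null
        }
        T init = initialValue.get();
        initCalls++;
        slot = init;      // remember the result so the supplier runs only once
        return init;
    }

    int initCalls() {
        return initCalls;
    }

    public static void main(String[] args) {
        LazyGetSketch<String> local = new LazyGetSketch<>(() -> null);
        System.out.println(local.get());       // null
        local.get();
        System.out.println(local.initCalls()); // 1
    }
}
```

If null were used as the "empty" marker instead, a variable whose initial value is null would be re-initialized on every call to get.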
