Guava cache源码解析之构造缓存器

guava cache使用介绍

guava cache是目前最长用的本地缓存，可以把它看成是增加了一些额外功能的ConcurrentHashMap，也是线程安全的本地缓存。

使用示例

 @Test
    public void testLoad() throws Exception{
        LoadingCache<String,Object> loadingCache= CacheBuilder.newBuilder()
                .maximumSize(100)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .removalListener(notification -> {
                    System.out.println(notification.getValue() +
                            " is removed at:"+System.currentTimeMillis()/1000);
                })
                .build(new CacheLoader<String, Object>() {
            @Override
            public Object load(String key) throws Exception {
                System.out.println("load value for key:"+key);
                return key+"";
            }
        });
        //第一次查询会执行cache的load方法,第二次直接从缓存获取
        System.out.println(loadingCache.get("key1"));
        System.out.println(loadingCache.get("key1"));
    }

可以看到缓存器的构建没有使用构造器而是构建器模式，构建器模式是存在多个可选参数时最合适的一种配置参数的方式。接下来我们具体来看看缓存器到底是怎么完成构建的。

源码分析

在上面构造缓存器的第一步是构造CacheBuilder，我们来看看CacheBuilder的一些属性：

private static final int DEFAULT_INITIAL_CAPACITY = 16;//用于计算每个Segment中的hashtable的大小
    private static final int DEFAULT_CONCURRENCY_LEVEL = 4;//用于计算有几个Segment
    private static final int DEFAULT_EXPIRATION_NANOS = 0;//默认的缓存过期时间
    
    static final int UNSET_INT = -1;
    
    int initialCapacity = UNSET_INT;//用于计算每个Segment中的hashtable的大小
    int concurrencyLevel = UNSET_INT;//用于计算有几个Segment
    long maximumSize = UNSET_INT;//cache中最多能存放的缓存entry个数
    long maximumWeight = UNSET_INT;
    
    Strength keyStrength;//键的引用类型（strong、weak、soft）
    Strength valueStrength;//值的引用类型（strong、weak、soft）

    long expireAfterWriteNanos = UNSET_INT;//缓存超时时间（起点：缓存被创建或被修改）
    long expireAfterAccessNanos = UNSET_INT;//缓存超时时间（起点：缓存被创建或被修改或被访问）

CacheBuilder-->newCacheBuilder()：创建一个CacheBuilder实例：

/**
     * 采用默认的设置（如下）创造一个新的CacheBuilder实例
     * 1、strong keys
     * 2、strong values
     * 3、no automatic eviction of any kind.
     */
    public static CacheBuilder<Object, Object> newBuilder() {
        return new CacheBuilder<Object, Object>();//new 一个实例
    }

接下里使用build模式指定一些属性值比如cache中最多能放置的entry个数：maximumSize,和超时时间： expireAfterWriteNanos。

CacheBuilder-->maximumSize(long size)

/**
     * 指定cache中最多能存放的entry（key-value）个数maximumSize
     * 注意：
     * 1、在entry个数还未达到这个指定个数maximumSize的时候，可能就会发生缓存回收
     * 上边这种情况发生在cache size接近指定个数maximumSize，
     * cache就会回收那些很少会再被用到的缓存（这些缓存会使最近没有被用到或很少用到的），其实说白了就是LRU算法回收缓存
     * 2、maximumSize与maximumWeight不能一起使用，其实后者也很少会使用
     */
    public CacheBuilder<K, V> maximumSize(long size) {
        /* 检查maximumSize是否已经被设置过了 */
        checkState(this.maximumSize == UNSET_INT,
                   "maximum size was already set to %s", 
                   this.maximumSize);
        /* 检查maximumWeight是否已经被设置过了（这就是上边说的第二条）*/
        checkState(this.maximumWeight == UNSET_INT,
                   "maximum weight was already set to %s", 
                   this.maximumWeight);
        /* 这是与maximumWeight配合的一个属性 */
        checkState(this.weigher == null,
                   "maximum size can not be combined with weigher");
        /* 检查设置的maximumSize是不是>=0，通常不会设置为0，否则不会起到缓存作用 */
        checkArgument(size >= 0, "maximum size must not be negative");
        this.maximumSize = size;
        return this;
    }

注意这里是设置整个cache对多可存放的entry个数，而不是每个segment。

CacheBuilder-->expireAfterWrite(long duration, TimeUnit unit)

/**
     * 指明每一个entry（key-value）在缓存中的过期时间
     * 1、时间的参考起点：entry的创建或值的修改
     * 2、过期的entry也许会被计入缓存个数size（也就是说缓存个数不仅仅只有存活的entry）
     * 3、但是过期的entry永远不会被读写
     */
    public CacheBuilder<K, V> expireAfterWrite(long duration, TimeUnit unit) {
        /*
         * 检查之前是否已经设置过缓存超时时间
         */
        checkState(expireAfterWriteNanos == UNSET_INT,//正确条件：之前没有设置过缓存超时时间
                   "expireAfterWrite was already set to %s ns",//不符合正确条件的错误信息
                   expireAfterWriteNanos);
        /*
         * 检查设置的超时时间是否大于等于0，当然，通常情况下，我们不会设置缓存为0
         */
        checkArgument(duration >= 0, //正确条件
                      "duration cannot be negative: %s %s",//不符合正确条件的错误信息，下边的是错误信息中的错误参数
                      duration, 
                      unit);
        this.expireAfterWriteNanos = unit.toNanos(duration);//根据输入的时间值与时间单位，将时间值转换为纳秒
        return this;
    }

说明：

设置超时时间，注意时间的起点是entry的创建或修改
expireAfterAccess(long duration, TimeUnit unit)方法的时间起点：entry的创建、修改、被访问

CacheBuilder-->build(CacheLoader<? super K1, V1> loader)

/**
     * 建立一个cache，该缓存器通过使用传入的CacheLoader，
     * 既可以获取已给定key的value，也能够自动的计算和获取缓存
     * 当然，这里是线程安全的，线程安全的运行方式与ConcurrentHashMap一致
     */
    public <K1 extends K, V1 extends V> LoadingCache<K1, V1> build(CacheLoader<? super K1, V1> loader) {
        checkWeightWithWeigher();
        return new LocalCache.LocalLoadingCache<K1, V1>(this, loader);
    }

说明：

该方法返回一个LoadingCache接口的实现类LocalLoadingCache实例，
在build方法需要传入一个CacheLoader的实例，实际使用中使用了匿名内部类来实现的

LocalLoadingCache构造器

static class LocalLoadingCache<K, V> extends LocalManualCache<K, V>
                                         implements LoadingCache<K, V> {

        LocalLoadingCache(CacheBuilder<? super K, ? super V> builder,
                          CacheLoader<? super K, V> loader) {
            super(new LocalCache<K, V>(builder, checkNotNull(loader)));
        }

说明：

在该静态内部类中，要保证传入的CacheLoader实例非空
然后在构造器里创建了一个LocalCache实例出来，
最后调用父类LocalManualCache的私有构造器将第二步创建出来的LocalCache实例赋给LocalCache的类变量

重点是LocalCache的创建，我们先看下LocalCache的属性：

/** 最大容量（2的30次方），即最多可存放2的30次方个entry（key-value） */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** 最多多少个Segment（2的16次方）*/
    static final int MAX_SEGMENTS = 1 << 16;

    /** 用于选择Segment */
    final int segmentMask;

    /** 用于选择Segment，尽量将hash打散 */
    final int segmentShift;

    /** 底层数据结构，就是一个Segment数组，而每一个Segment就是一个hashtable */
    final Segment<K, V>[] segments;

    /** 
     * 并发水平，这是一个用于计算Segment个数的一个数，
     * Segment个数是一个刚刚大于或等于concurrencyLevel的数
     */
    final int concurrencyLevel;

    /** 键的引用类型（strong、weak、soft） */
    final Strength keyStrength;

    /** 值的引用类型（strong、weak、soft） */
    final Strength valueStrength;

    /** The maximum weight of this map. UNSET_INT if there is no maximum. 
     * 如果没有设置，就是-1
     */
    final long maxWeight;

    final long expireAfterAccessNanos;

    final long expireAfterWriteNanos;

    /** Factory used to create new entries. */
    final EntryFactory entryFactory;

    /** 默认的缓存加载器，用于做一些缓存加载操作*/
    @Nullable
    final CacheLoader<? super K, V> defaultLoader;

LocalCache-->LocalCache(CacheBuilder, CacheLoader)

/**
     * 创建一个新的、空的map（并且指定策略、初始化容量和并发水平）
     */
    LocalCache(CacheBuilder<? super K, ? super V> builder,
               @Nullable CacheLoader<? super K, V> loader) {
        /*
         * 默认并发水平是4，即四个Segment（但要注意concurrencyLevel不一定等于Segment个数）
         * Segment个数：一个刚刚大于或等于concurrencyLevel且是2的几次方的一个数
         */
        concurrencyLevel = Math
                .min(builder.getConcurrencyLevel(), MAX_SEGMENTS);

        keyStrength = builder.getKeyStrength();//默认为Strong，即强引用
        valueStrength = builder.getValueStrength();//默认为Strong，即强引用

        // 缓存超时(时间起点：entry的创建或替换（即修改）)
        expireAfterWriteNanos = builder.getExpireAfterWriteNanos();
        // 缓存超时(时间起点：entry的创建或替换（即修改）或最后一次访问)
        expireAfterAccessNanos = builder.getExpireAfterAccessNanos();
        //创建entry的工厂
        entryFactory = EntryFactory.getFactory(keyStrength,
                                                  usesAccessEntries(), 
                                                  usesWriteEntries());
        //默认的缓存加载器
        defaultLoader = loader;

        // 初始化容量为16，整个cache可以放16个缓存entry
        int initialCapacity = Math.min(builder.getInitialCapacity(),
                                       MAXIMUM_CAPACITY);

        int segmentShift = 0;
        int segmentCount = 1;
        //循环条件的&&后边的内容是关于weight的，由于没有设置maxWeight，所以其值为-1-->evictsBySize()返回false
        while (segmentCount < concurrencyLevel
                && (!evictsBySize() || segmentCount * 20 <= maxWeight)) {
            ++segmentShift;
            segmentCount <<= 1;//找一个刚刚大于或等于concurrencyLevel的Segment数
        }
        this.segmentShift = 32 - segmentShift;
        segmentMask = segmentCount - 1;

        this.segments = newSegmentArray(segmentCount);//创建指定大小的数组

        int segmentCapacity = initialCapacity / segmentCount;//计算每一个Segment中的容量的值，刚刚大于等于initialCapacity/segmentCount
        if (segmentCapacity * segmentCount < initialCapacity) {
            ++segmentCapacity;
        }

        int segmentSize = 1;//每一个Segment的容量
        while (segmentSize < segmentCapacity) {
            segmentSize <<= 1;//刚刚>=segmentCapacity&&是2的几次方的数
        }

        if (evictsBySize()) {//由于没有设置maxWeight，所以其值为-1-->evictsBySize()返回false
            // Ensure sum of segment max weights = overall max weights
            long maxSegmentWeight = maxWeight / segmentCount + 1;
            long remainder = maxWeight % segmentCount;
            for (int i = 0; i < this.segments.length; ++i) {
                if (i == remainder) {
                    maxSegmentWeight--;
                }
                this.segments[i] = createSegment(segmentSize, 
                                                 maxSegmentWeight,
                                                 builder.getStatsCounterSupplier().get());
            }
        } else {
            for (int i = 0; i < this.segments.length; ++i) {
                this.segments[i] = createSegment(segmentSize, 
                                                 UNSET_INT,
                                                 builder.getStatsCounterSupplier().get());
            }
        }
    }

CacheBuilder-->getConcurrencyLevel()

int getConcurrencyLevel() {
        return (concurrencyLevel == UNSET_INT) ? //是否设置了concurrencyLevel
                DEFAULT_CONCURRENCY_LEVEL//如果没有设置，采用默认值4
                : concurrencyLevel;//如果设置了，采用设置的值
    }

CacheBuilder-->getKeyStrength()

//获取键key的强度（默认为Strong，还有weak和soft）
    Strength getKeyStrength() {
        return MoreObjects.firstNonNull(keyStrength, Strength.STRONG);
    }

public static <T> T firstNonNull(@Nullable T first, @Nullable T second) {
    return first != null ? first : checkNotNull(second);
  }

CacheBuilder-->getExpireAfterWriteNanos()

long getExpireAfterWriteNanos() {
        return (expireAfterWriteNanos == UNSET_INT) ? 
                DEFAULT_EXPIRATION_NANOS
                : expireAfterWriteNanos;
    }

说明： 获取超时时间，如果设置了，就是设置值，如果没设置，默认是0

CacheBuilder-->getInitialCapacity()

int getInitialCapacity() {
        return (initialCapacity == UNSET_INT) ? 
                DEFAULT_INITIAL_CAPACITY
                : initialCapacity;
    }

说明： 获取初始化容量，如果指定了就是用指定容量，如果没指定，默认为16。值得注意的是，该容量是用于计算整个cache的容量的，并不是每个Segment的容量

EntryFactory-->getFatory()

/**
         * Masks used to compute indices in the following table.
         */
        static final int ACCESS_MASK = 1;
        static final int WRITE_MASK = 2;
        static final int WEAK_MASK = 4;

        /**
         * Look-up table for factories.
         */
        static final EntryFactory[] factories = { STRONG, STRONG_ACCESS,
                STRONG_WRITE, STRONG_ACCESS_WRITE, WEAK, WEAK_ACCESS,
                WEAK_WRITE, WEAK_ACCESS_WRITE, };

        static EntryFactory getFactory(Strength keyStrength,
                                       boolean usesAccessQueue, 
                                       boolean usesWriteQueue) {
            int flags = ((keyStrength == Strength.WEAK) ? WEAK_MASK : 0)//0
                    | (usesAccessQueue ? ACCESS_MASK : 0)//0
                    | (usesWriteQueue ? WRITE_MASK : 0);//WRITE_MASK-->2
            return factories[flags];//STRONG_WRITE
        }

EntryFactory是LocalCache的一个内部枚举类，通过上述方法，获取除了相应的EntryFactory，这里选出的是STRONG_WRITE工厂，该工厂代码如下：

STRONG_WRITE {
            /**
             * 创建新的Entry
             */
            @Override
            <K, V> ReferenceEntry<K, V> newEntry(Segment<K, V> segment, 
                                                 K key,
                                                 int hash, 
                                                 @Nullable ReferenceEntry<K, V> next) {
                return new StrongWriteEntry<K, V>(key, hash, next);
            }

            /**
             * 将原来的Entry（original）拷贝到当下的Entry（newNext）
             */
            @Override
            <K, V> ReferenceEntry<K, V> copyEntry(Segment<K, V> segment,
                                                  ReferenceEntry<K, V> original, 
                                                  ReferenceEntry<K, V> newNext) {
                ReferenceEntry<K, V> newEntry = super.copyEntry(segment,
                        original, newNext);
                copyWriteEntry(original, newEntry);
                return newEntry;
            }
        }

在该工厂中，指定了创建新entry的方法与复制原有entry为另一个entry的方法。

LocalCache-->newSegmentArray(int ssize)

/**
     * 创建一个指定大小的Segment数组
     */
    @SuppressWarnings("unchecked")
    final Segment<K, V>[] newSegmentArray(int ssize) {
        return new Segment[ssize];
    }

LocalCache-->createSegment(initialCapacity,maxSegmentWeight,StatsCounter)

Segment<K, V> createSegment(int initialCapacity, 
                                long maxSegmentWeight,
                                StatsCounter statsCounter) {
        return new Segment<K, V>(this, 
                                 initialCapacity, 
                                 maxSegmentWeight,
                                 statsCounter);
    }

该方法用于为之前创建的Segment数组的每一个元素赋值。下面列出Segment类的一些属性和方法

final LocalCache<K, V> map;// 外部类的一个实例

        /** 该Segment中已经存在缓存的个数  */
        volatile int count;

        /**
         * 指定是下边的AtomicReferenceArray<ReferenceEntry<K, V>> table，即扩容也是只扩自己的Segment
         * The table is expanded when its size exceeds this threshold. (The
         * value of this field is always {@code (int) (capacity * 0.75)}.)
         */
        int threshold;

        /**
         * 每个Segment中的table
         */
        volatile AtomicReferenceArray<ReferenceEntry<K, V>> table;

        /**
         * The maximum weight of this segment. UNSET_INT if there is no maximum.
         */
        final long maxSegmentWeight;

        /**
         * map中当前元素的一个队列，队列元素根据write time进行排序，每write一个元素就将该元素加在队列尾部
         */
        @GuardedBy("this")
        final Queue<ReferenceEntry<K, V>> writeQueue;

        /**
         * A queue of elements currently in the map, ordered by access time.
         * Elements are added to the tail of the queue on access (note that
         * writes count as accesses).
         */
        @GuardedBy("this")
        final Queue<ReferenceEntry<K, V>> accessQueue;

        Segment(LocalCache<K, V> map, int initialCapacity,
                long maxSegmentWeight, StatsCounter statsCounter) {
            this.map = map;
            this.maxSegmentWeight = maxSegmentWeight;//0
            this.statsCounter = checkNotNull(statsCounter);
            initTable(newEntryArray(initialCapacity));

            writeQueue = map.usesWriteQueue() ? //过期时间>0
                         new WriteQueue<K, V>() //WriteQueue
                         : LocalCache.<ReferenceEntry<K, V>> discardingQueue();

            accessQueue = map.usesAccessQueue() ? //false
                          new AccessQueue<K, V>()
                          : LocalCache.<ReferenceEntry<K, V>> discardingQueue();
        }

        AtomicReferenceArray<ReferenceEntry<K, V>> newEntryArray(int size) {
            return new AtomicReferenceArray<ReferenceEntry<K, V>>(size);//new Object[size];
        }

        void initTable(AtomicReferenceArray<ReferenceEntry<K, V>> newTable) {
            this.threshold = newTable.length() * 3 / 4; // 0.75
            if (!map.customWeigher() && this.threshold == maxSegmentWeight) {
                // prevent spurious expansion before eviction
                this.threshold++;
            }
            this.table = newTable;
        }

Segment的构造器完成了三件事儿：为变量赋值 + 初始化Segment的table + 构建相关队列

initTable(newEntryArray(initialCapacity))源代码在Segment类中已给出：初始化table的步骤简述为：创建一个指定个数的ReferenceEntry数组，计算扩容值。
初始化table的步骤简述为：创建一个指定个数的ReferenceEntry数组，

到目前为止，guava cache的完整的一个数据结构基本上就建立起来了,最后来总结一下：

image.png

说明：

其结构与ConcurrentHashMap很类似，ReferenceEntry[i]用于存储key-value
解决hash冲突是采用拉链法，
各个segment互不相关，每个Segment只需要扩容自己就好
每个segment的有效队列个数最多可能不止一个
队列用于实现LRU缓存回收算法

guava cache的数据结构的构建流程：

构建CacheBuilder实例cacheBuilder
cacheBuilder实例指定缓存器LocalCache的初始化参数
cacheBuilder实例使用build()方法创建LocalCache实例（简单说成这样，实际上复杂一些）
1. 首先为各个类变量赋值（通过第二步中cacheBuilder指定的初始化参数以及原本就定义好的一堆常量）
2. 之后创建Segment数组
3. 最后初始化每一个Segment[i]
  - 为Segment属性赋值
  - 初始化Segment中的table，即一个ReferenceEntry数组（每一个key-value就是一个ReferenceEntry）
  - 根据之前类变量的赋值情况，创建相应队列，用于LRU缓存回收算法

最后我们举个简单的例子来复习一下，指定：expireAfterWriteNanos==20min maximumSize==1000

默认值：
concurrency_level==4（用于计算Segment个数） initial_capcity==16 （用于计算每个Segment容量）
keyStrength==STRONG 　　valueStrength==STRONG

计算出：

entryFactory==STRONG_WRITE

segmentCount==4：Segment个数，一个刚刚大于等于concurrency_level且是2的几次方的一个数

segmentCapacity==initial_capcity/segmentCount==4：用来计算每个Segment能放置的entry个数的一个值，一个刚刚等于initial_capcity/segmentCount或者比initial_capcity/segmentCount大1的数（关键看是否除尽）

segmentSize==4：每个Segment能放置的entry个数，刚刚>=segmentCapacity&&是2的几次方的数

segments==Segment[segmentCount]==Segment[4]
segments[i]：

包含一个ReferenceEntry[segmentSize]==ReferenceEntry[4]
WriteQueue：用于LRU算法的队列
threshold==newTable.length()*3/4==segmentSize*3/4==3：每个Segment中有了3个Entry（key-value），就会扩容