iOS Objective-C Object Internals: cache_t

Preface

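This article takes a deep look at the cache_t cache member of the objc_class struct. cache_t plays an important role throughout the objc runtime. Its basic layout is shown below: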

[Figure: layout of cache_t (拓补图.003.jpeg)]

Getting Started

Create a simple ZZPerson class with two instance methods:

@interface ZZPerson : NSObject
- (void)toDoSomething0;
- (void)toDoSayHello;
@end

Add the following snippet to main.m:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        ZZPerson *person = [ZZPerson alloc];
        Class pClass = [person class];
        [person toDoSomething0];
        [person toDoSayHello];
        NSLog(@"hello world");
    }
    return 0;
}

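Use LLDB to inspect the struct. cache_t sits 16 bytes past the class address (after the 8-byte isa and the 8-byte superclass pointer), so it is read from 0x100002230 + 0x10 = 0x100002240: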

(lldb) p/x ZZPerson.class
(Class) $0 = 0x0000000100002230 ZZPerson
(lldb) p (cache_t *)0x0000000100002240
(cache_t *) $1 = 0x0000000100002240
(lldb) p *$1
(cache_t) $2 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x0000000100719980 {
      _sel = {
        std::__1::atomic<objc_selector *> = ""
      }
      _imp = {
        std::__1::atomic<unsigned long> = 11504
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 3
  }
  _flags = 32784
  _occupied = 2
}
(lldb) p $2.buckets()
(bucket_t *) $3 = 0x0000000100719980
(lldb) p *$3
(bucket_t) $4 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 11504
  }
}
(lldb) p $4.sel()
(SEL) $5 = "toDoSomething0"
(lldb) p $4.imp(pClass)
(IMP) $6 = 0x0000000100000ec0 (KCObjc`-[ZZPerson toDoSomething0])
(lldb) p $3 + 1
(bucket_t *) $7 = 0x0000000100719990
(lldb) p *$7
(bucket_t) $8 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 11472
  }
}
(lldb) p $8.sel()
(SEL) $9 = "toDoSayHello"
(lldb) p $8.imp(pClass)
(IMP) $10 = 0x0000000100000ee0 (KCObjc`-[ZZPerson toDoSayHello])
(lldb)
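
The raw _imp values in the session (11504 and 11472) are not the addresses returned by imp(pClass). The numbers are consistent with the stored value being the real IMP XORed with the class address. The sketch below only checks that arithmetic: decode_imp is a hypothetical helper, and the XOR scheme is an assumption inferred from the values in this session, not a claim about every platform.

```c
#include <stdint.h>

/* Hypothetical helper: recover an IMP address from the raw value stored
   in a bucket, assuming the stored value is (imp XOR class address). */
static uint64_t decode_imp(uint64_t stored, uint64_t cls)
{
    return stored ^ cls;
}
```

With the values above, decode_imp(11504, 0x100002230) gives 0x100000ec0 and decode_imp(11472, 0x100002230) gives 0x100000ee0, matching $6 and $10.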

To explore _mask and _occupied, start from cache_t::insert():

ALWAYS_INLINE
void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    ASSERT(sel != 0 && cls->isInitialized());
    //1.=========Memory allocation===========
    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = occupied() + 1;
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        if (!capacity) capacity = INIT_CACHE_SIZE;
        reallocate(oldCapacity, capacity, /* freeOld */false);
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;  // grow to twice the size
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true); // reallocate at the new capacity
    }
    //2.========Insertion===========
    bucket_t *b = buckets();
    mask_t m = capacity - 1;
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    cache_t::bad_cache(receiver, (SEL)sel, cls);
}

The source above can be split into two parts for analysis: memory allocation and insertion.

Memory allocation: this part opens with an important comment, "Use the cache as-is if it is less than 3/4 full". A rough diagram:

[Figure: flowchart of the memory-allocation part of cache_t::insert() (拓补图.001.jpeg)]

Note the constants INIT_CACHE_SIZE, MAX_CACHE_SIZE, and CACHE_END_MARKER. The first two are enum values (4 and 65536), and CACHE_END_MARKER is a macro:

#define CACHE_END_MARKER 1
enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2),
    MAX_CACHE_SIZE_LOG2  = 16,
    MAX_CACHE_SIZE       = (1 << MAX_CACHE_SIZE_LOG2),
};
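
To make the 3/4 rule concrete, here is a minimal sketch of the check that decides whether insert() can reuse the current buckets. fits_without_growing is a hypothetical name; the expression is copied from the fastpath condition in cache_t::insert(), including its integer division.

```c
#define CACHE_END_MARKER 1

/* Hypothetical helper mirroring the "less than 3/4 full" test in
   cache_t::insert(): returns 1 if the cache can be used as-is. */
static int fits_without_growing(unsigned occupied, unsigned capacity)
{
    unsigned newOccupied = occupied + 1;
    return newOccupied + CACHE_END_MARKER <= capacity / 4 * 3;
}
```

Under this rule a capacity of 4 accepts a second entry but not a third, and a capacity of 8 accepts a fifth entry but not a sixth.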

The flowchart includes reallocate(), which does what its name says. Here is its implementation, followed by setBucketsAndMask(), which it calls:

ALWAYS_INLINE
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
    }
}
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED

void cache_t::setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask)
{
    // objc_msgSend uses mask and buckets with no locks.
    // It is safe for objc_msgSend to see new buckets but old mask.
    // (It will get a cache miss but not overrun the buckets' bounds).
    // It is unsafe for objc_msgSend to see old buckets and new mask.
    // Therefore we write new buckets, wait a lot, then write new mask.
    // objc_msgSend reads mask first, then buckets.

#ifdef __arm__
    // ensure other threads see buckets contents before buckets pointer
    mega_barrier();

    _buckets.store(newBuckets, memory_order::memory_order_relaxed);
    
    // ensure other threads see new buckets before new mask
    mega_barrier();
    
    _mask.store(newMask, memory_order::memory_order_relaxed);
    _occupied = 0;
#elif __x86_64__ || i386
    // ensure other threads see buckets contents before buckets pointer
    _buckets.store(newBuckets, memory_order::memory_order_release);
    
    // ensure other threads see new buckets before new mask
    _mask.store(newMask, memory_order::memory_order_release);
    _occupied = 0;
#else
#error Don't know how to do setBucketsAndMask on this architecture.
#endif
}

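Overall flow:
1. buckets() fetches the current oldBuckets.
2. allocateBuckets(newCapacity) creates newBuckets.
3. setBucketsAndMask() installs newBuckets and newMask. Note that newMask = newCapacity - 1, which is the relationship between _mask and capacity. The method also resets _occupied to 0.
4. On any allocation after the first, freeOld is true, so cache_collect_free() is called to release oldBuckets.
That completes one reallocation. The old entries are not copied into the new buckets, so growing the cache discards everything cached so far.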
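
The steps above can be sketched with a toy model. Everything named toy_* is hypothetical, the buckets are plain calloc memory, and the old array is freed immediately, whereas the runtime hands it to cache_collect_free(). The point is only to show that the new array starts empty and _occupied restarts at 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-ins for bucket_t and cache_t (names are hypothetical). */
struct toy_bucket { const char *sel; uintptr_t imp; };
struct toy_cache  { struct toy_bucket *buckets; uint32_t mask; uint16_t occupied; };

/* Replace the bucket array with a zeroed one of newCapacity slots.
   Old entries are not copied, mirroring "old contents are not propagated". */
static void toy_reallocate(struct toy_cache *c, uint32_t newCapacity, int freeOld)
{
    assert(newCapacity > 0);
    struct toy_bucket *oldBuckets = c->buckets;
    struct toy_bucket *newBuckets = calloc(newCapacity, sizeof *newBuckets);
    assert(newBuckets != NULL);

    c->buckets  = newBuckets;
    c->mask     = newCapacity - 1;   /* mask = capacity - 1 */
    c->occupied = 0;                 /* nothing is cached in the new array */

    if (freeOld) free(oldBuckets);   /* simplified: freed right away here */
}
```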

Insertion:
A few functions matter here:

cache_hash(sel,m):

static inline mask_t cache_hash(SEL sel, mask_t mask) 
{
    return (mask_t)(uintptr_t)sel & mask;
}

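This function hashes the selector to obtain begin, the slot where probing starts. The result is the selector's address ANDed with mask, so the starting slot depends on where the selector lives in memory, not on the order in which methods are called.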
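
A small sketch of the same expression, with the selector passed as a raw address so it compiles as plain C. toy_cache_hash is a hypothetical name, and the address used in the usage note is made up for illustration.

```c
#include <stdint.h>

/* Same expression as cache_hash(), with the selector passed as a raw address. */
static uint32_t toy_cache_hash(uintptr_t sel_address, uint32_t mask)
{
    return (uint32_t)sel_address & mask;
}
```

For a made-up selector address of 0x100003f6d, the starting slot is 1 with mask 3, 5 with mask 7, and 13 with mask 15, so the same selector starts in a different slot each time the cache grows.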

cache_next(i,m):

static inline mask_t cache_next(mask_t i, mask_t mask) {
//    printf("\ni:%d mask:%d cache_next : %d",i,mask,(i+1) & mask);
    return (i+1) & mask;
}
/*
i:1 mask:3 cache_next : 2
i:1 mask:3 cache_next : 2
i:0 mask:3 cache_next : 1
i:3 mask:3 cache_next : 0
i:3 mask:3 cache_next : 0
i:7 mask:7 cache_next : 0
i:0 mask:7 cache_next : 1
*/

This function returns the next index inside the do-while loop: it advances by one and wraps around to 0 after the last slot.
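
The probing loop can be sketched on its own. toy_cache_next repeats the expression shown above, while toy_find_free_slot and its used[] array are hypothetical simplifications of the do-while loop in insert(), which checks b[i].sel() instead of a flag.

```c
#include <stdint.h>

/* Same expression as the cache_next() shown above. */
static uint32_t toy_cache_next(uint32_t i, uint32_t mask)
{
    return (i + 1) & mask;
}

/* Hypothetical helper: starting at begin, return the first index whose
   used[] flag is 0, or -1 if the probe wraps back to begin. */
static int toy_find_free_slot(const int *used, uint32_t begin, uint32_t mask)
{
    uint32_t i = begin;
    do {
        if (!used[i]) return (int)i;
    } while ((i = toy_cache_next(i, mask)) != begin);
    return -1;
}
```

Starting at slot 3 of a four-slot table where slots 0 and 3 are taken, the probe visits 3, wraps to 0, and settles on slot 1.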

incrementOccupied():

void cache_t::incrementOccupied() 
{
    _occupied++;
}

Each insertion increments _occupied by one.

Next, extract a simplified version of the cache_t layout:

struct zz_bucket_t {
    SEL _sel;
    IMP _imp;
};
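struct zz_cache_t {
    struct zz_bucket_t * _buckets;  // pointer to the bucket array
    uint32_t _mask;                 // capacity - 1
    uint16_t _flags;                // corresponds to _flags in the LLDB dump
    uint16_t _occupied;             // number of cached entries
};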
struct zz_class_data_bits_t {
    uintptr_t bits;
};
struct zz_objc_class {
    Class ISA;
    Class superclass;
    struct zz_cache_t cache;
    struct zz_class_data_bits_t bits;
};

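In main.m, call several instance methods and print the cache_t after each call (ZZPerson is assumed to declare the extra methods used here):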

void printCache_tLayout(Class pClass){
    struct zz_objc_class *myClass = (__bridge struct zz_objc_class *)pClass;
    printf("\n========>start<==========");
    printf("\n_occupied:%hu,_mask:%u,capacity:%u ",myClass->cache._occupied,myClass->cache._mask,myClass->cache._mask+1);
//    NSLog(@"_occupied : %hu,_mask:%u",myClass->cache._occupied,myClass->cache._mask);
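    // Prints slots 0 to _mask - 1. The last slot is skipped on purpose: with
    // CACHE_END_MARKER set to 1 it is assumed to hold the end marker, not a cached method.
    for(uint32_t i = 0;i< myClass->cache._mask;i++){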
        struct zz_bucket_t bucket = myClass->cache._buckets[i];
        printf("\n%s - %p",(char *)(bucket._sel),bucket._imp);
//        NSLog(@"%@ - %p",NSStringFromSelector(bucket._sel),bucket._imp);
    }
    printf("\n=========>end<==========");
}
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        ZZPerson *person = [ZZPerson alloc];
        Class pClass = [person class];
        [person toDoSayHello0];
        printCache_tLayout(pClass);
        [person toDoSayHello1];
        printCache_tLayout(pClass);
        [person toDoSayHello2];
        printCache_tLayout(pClass);
        [person toDoSayHello3];
        printCache_tLayout(pClass);
        [person toDoSayHello4];
        printCache_tLayout(pClass);
        [person toDoHaHaHa];
        printCache_tLayout(pClass);
        [person toDoSomething0];
        printCache_tLayout(pClass);
        [person toDoSayHello0];
        printCache_tLayout(pClass);
    }
    return 0;
}

Log output:

========>start<==========
_occupied:1,_mask:3,capacity:4 
toDoSayHello0 - 0x2e08
(null) - 0x0
(null) - 0x0
=========>end<==========
========>start<==========
_occupied:2,_mask:3,capacity:4 
toDoSayHello0 - 0x2e08
(null) - 0x0
toDoSayHello1 - 0x2e18
=========>end<==========
========>start<==========
_occupied:1,_mask:7,capacity:8 
toDoSayHello2 - 0x2e28
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
=========>end<==========
========>start<==========
_occupied:2,_mask:7,capacity:8 
toDoSayHello2 - 0x2e28
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
toDoSayHello3 - 0x2e38
=========>end<==========
========>start<==========
_occupied:3,_mask:7,capacity:8 
toDoSayHello2 - 0x2e28
(null) - 0x0
(null) - 0x0
(null) - 0x0
toDoSayHello4 - 0x2ec8
(null) - 0x0
toDoSayHello3 - 0x2e38
=========>end<==========
========>start<==========
_occupied:4,_mask:7,capacity:8 
toDoSayHello2 - 0x2e28
(null) - 0x0
toDoHaHaHa - 0x2e78
(null) - 0x0
toDoSayHello4 - 0x2ec8
(null) - 0x0
toDoSayHello3 - 0x2e38
=========>end<==========
========>start<==========
_occupied:5,_mask:7,capacity:8 
toDoSayHello2 - 0x2e28
(null) - 0x0
toDoHaHaHa - 0x2e78
(null) - 0x0
toDoSayHello4 - 0x2ec8
toDoSomething0 - 0x2e68
toDoSayHello3 - 0x2e38
=========>end<==========
========>start<==========
_occupied:1,_mask:15,capacity:16 
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
toDoSayHello0 - 0x2e08
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
(null) - 0x0
=========>end<==========

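With this in hand the log is easy to read. Each start=>end block shows the cache_t after one method call:
Call 1: first allocation. capacity defaults to 4 and mask = capacity - 1 = 3. After one insertion _occupied is 1.
Call 2: the check newOccupied + CACHE_END_MARKER <= capacity / 4 * 3 evaluates to 2 + 1 <= 4 / 4 * 3, i.e. 3 <= 3. It holds, so the entry is inserted directly and _occupied becomes 2.
Call 3: the same check evaluates to 3 + 1 <= 4 / 4 * 3, i.e. 4 <= 3. It fails, so the cache grows: capacity = 4 * 2 = 8, a new buckets array is allocated, oldBuckets is released, mask = newCapacity - 1 = 7, and _occupied is reset to 0. The insertion then brings _occupied to 1.
The same rule applies to every later call. Calls 4 to 7 fit within capacity 8, and call 8 fails the check (6 + 1 <= 8 / 4 * 3, i.e. 7 <= 6), so the cache grows to 16 and _occupied restarts at 1.
The log also shows that the slot each SEL lands in does not follow call order; it is determined by cache_hash().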
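
The sequence of _occupied and capacity values in the log can be reproduced with a counters-only model. This is a sketch under stated assumptions: the toy_* names are hypothetical, no buckets are stored, and every call is treated as a selector that is not currently cached, which holds for the eight calls above because the second toDoSayHello0 call comes after its entry was discarded by a resize.

```c
#include <stdint.h>

enum { TOY_INIT_CACHE_SIZE = 4, TOY_MAX_CACHE_SIZE = 1 << 16, TOY_END_MARKER = 1 };

/* Hypothetical counters-only model of a cache: no buckets, just sizes. */
struct toy_counts { uint32_t capacity; uint32_t occupied; };

/* Apply the sizing rules of cache_t::insert() for one uncached selector. */
static void toy_insert_new(struct toy_counts *c)
{
    uint32_t newOccupied = c->occupied + 1;
    if (c->capacity == 0) {
        c->capacity = TOY_INIT_CACHE_SIZE;           /* first allocation */
        c->occupied = 0;
    } else if (newOccupied + TOY_END_MARKER > c->capacity / 4 * 3) {
        c->capacity *= 2;                            /* grow */
        if (c->capacity > TOY_MAX_CACHE_SIZE) c->capacity = TOY_MAX_CACHE_SIZE;
        c->occupied = 0;                             /* old entries are dropped */
    }
    c->occupied++;                                   /* count the new entry */
}
```

Calling toy_insert_new eight times on a zeroed toy_counts yields _occupied values 1, 2, 1, 2, 3, 4, 5, 1 and capacities 4, 4, 8, 8, 8, 8, 8, 16, the same progression as the log.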

Summary

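capacity: the number of allocated slots. When occupied + 1 + CACHE_END_MARKER exceeds 3/4 of capacity, the cache grows to capacity * 2, capped at MAX_CACHE_SIZE (1 << 16).
mask: equal to capacity - 1. ANDing an index with mask keeps it within the bounds of the buckets array.
occupied: incremented on each insertion and reset to 0 whenever the buckets are reallocated.
cache_hash(): a neat trick that uses the selector's address ANDed with mask as the hash.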
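
The bounds guarantee behind mask relies on capacity being a power of two, which holds here because the cache starts at 4 and only doubles. A minimal sketch, with toy_index as a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index into a table whose capacity is a power of two. */
static uint32_t toy_index(uint32_t value, uint32_t capacity)
{
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0); /* power of two */
    return value & (capacity - 1);   /* always in [0, capacity - 1] */
}
```

For power-of-two capacities the AND gives the same result as value % capacity, without a division.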

Open Question

When is void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver) actually called?
To be continued...

Reference: objc4-781 source code