iOS Internals Study Notes 16

Preface

Day 16 of studying iOS internals. This picks up where day 15 left off and analyzes how a category gets loaded into its class.

Exploring categories

Let's go back to realizeClassWithoutSwift:

static Class realizeClassWithoutSwift(Class cls, Class previously) {
    // ...
    // Attach categories
    methodizeClass(cls, previously);
    // ...
}

Step into methodizeClass:

static void methodizeClass(Class cls, Class previously)
{
    runtimeLock.assertLocked();

    bool isMeta = cls->isMetaClass();
    auto rw = cls->data();
    auto ro = rw->ro();
    auto rwe = rw->ext();
 
   // ... 
    // Install methods and properties that the class implements itself.
    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls), nullptr);
        if (rwe) rwe->methods.attachLists(&list, 1); 
    }
  // ....

If rwe is non-null, rwe->methods.attachLists is called. Working backwards from rwe = rw->ext(): where does the value returned by ext() get created?
Step into ext():

   class_rw_ext_t *ext() const {
        return get_ro_or_rwe().dyn_cast<class_rw_ext_t *>(&ro_or_rw_ext);
   }

Search for class_rw_ext_t:

  class_rw_ext_t *extAllocIfNeeded() {
        auto v = get_ro_or_rwe();
        if (fastpath(v.is<class_rw_ext_t *>())) {
            return v.get<class_rw_ext_t *>(&ro_or_rw_ext);
        } else {
            return extAlloc(v.get<const class_ro_t *>(&ro_or_rw_ext));
        }
   }

That leads to extAllocIfNeeded. Now search globally for extAllocIfNeeded:

attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                 int flags)
{
    // ...
    uint32_t mcount = 0;
    uint32_t propcount = 0;
    uint32_t protocount = 0;
    bool fromBundle = NO;
    bool isMeta = (flags & ATTACH_METACLASS);
    auto rwe = cls->data()->extAllocIfNeeded();
    // ...

}

attachCategories calls extAllocIfNeeded.
A global search for attachCategories shows two call sites.
The first: attachToClass

 void attachToClass(Class cls, Class previously, int flags)
    {
       // ...
        auto &map = get();
        auto it = map.find(previously);

        if (it != map.end()) {
            category_list &list = it->second;
            if (flags & ATTACH_CLASS_AND_METACLASS) {
                int otherFlags = flags & ~ATTACH_CLASS_AND_METACLASS;
                attachCategories(cls, list.array(), list.count(), otherFlags | ATTACH_CLASS);
                attachCategories(cls->ISA(), list.array(), list.count(), otherFlags | ATTACH_METACLASS);
            } else {
                attachCategories(cls, list.array(), list.count(), flags);
            }
            map.erase(it);
        }
    }

The second: load_categories_nolock

static void load_categories_nolock(header_info *hi) {
    size_t count;
     auto processCatlist = [&](category_t * const *catlist) {
         for (unsigned i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);
            locstamped_category_t lc{cat, hi};
                   // ...
                   if (cat->classMethods  ||  cat->protocols
                    ||  (hasClassProperties && cat->_classProperties))
                    {
                    if (cls->ISA()->isRealized()) {
                      // attachCategories is called here
                        attachCategories(cls->ISA(), &lc, 1, ATTACH_EXISTING | ATTACH_METACLASS);
                    } else {
                        objc::unattachedCategories.addForClass(lc, cls->ISA());
                    }
                }
            }
     };
    processCatlist(hi->catlist(&count));
    processCatlist(hi->catlist2(&count));
}
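
The two call sites share one idea: if the class is already realized, the category is attached right away; otherwise it is parked in a table and attached later, when the class is realized. Below is a minimal C++ sketch of that idea, not the runtime's actual code. ToyClass, PendingCategories and loadCategory are hypothetical names made up for illustration, and categories are reduced to plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the runtime's Class.
struct ToyClass {
    std::string name;
    bool realized = false;
    std::vector<std::string> attached;  // names of the categories attached so far
};

// Toy version of objc::unattachedCategories: class -> categories waiting for it.
class PendingCategories {
    std::unordered_map<ToyClass*, std::vector<std::string>> map_;
public:
    // Like addForClass: remember a category for a class that is not realized yet.
    void addForClass(const std::string& category, ToyClass* cls) {
        map_[cls].push_back(category);
    }

    // Like attachToClass: attach everything that was waiting, then erase the entry.
    void attachToClass(ToyClass* cls) {
        auto it = map_.find(cls);
        if (it == map_.end()) return;
        for (const auto& category : it->second) cls->attached.push_back(category);
        map_.erase(it);
    }

    std::size_t pendingCount() const { return map_.size(); }
};

// Like the branch in load_categories_nolock: attach now, or park for later.
inline void loadCategory(PendingCategories& pending, ToyClass* cls,
                         const std::string& category) {
    if (cls->realized) cls->attached.push_back(category);
    else pending.addForClass(category, cls);
}
```

In this sketch, a category loaded before its class is realized stays in the table until attachToClass runs, which matches the map.find / map.erase pattern in the excerpt above.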

Start from attachToClass and search globally for its callers:

static void methodizeClass(Class cls, Class previously)
{
 bool isMeta = cls->isMetaClass();
    auto rw = cls->data();
    auto ro = rw->ro();
    auto rwe = rw->ext();
    // Install methods and properties that the class implements itself.
    method_list_t *list = ro->baseMethods();
    if (list) {
        prepareMethodLists(cls, &list, 1, YES, isBundleClass(cls), nullptr);
        if (rwe) rwe->methods.attachLists(&list, 1);
    }
  // ... (some code omitted)
   objc::unattachedCategories.attachToClass(cls, cls,
                                             isMeta ? ATTACH_METACLASS : ATTACH_CLASS);

}

We end up back in methodizeClass.
The forward flow should be: realizeClassWithoutSwift -> methodizeClass -> attachToClass -> attachCategories
So what does attachCategories actually do internally? Let's keep going.

attachCategories

Step into attachCategories:

attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                 int flags)
{
    // ...
    uint32_t mcount = 0;
    uint32_t propcount = 0;
    uint32_t protocount = 0;
    bool fromBundle = NO;
    bool isMeta = (flags & ATTACH_METACLASS);
    auto rwe = cls->data()->extAllocIfNeeded();
   // ...
   if (mcount > 0) {
        // sort the methods
        prepareMethodLists(cls, mlists + ATTACH_BUFSIZ - mcount, mcount,
                           NO, fromBundle, __func__);
        // attach the methods
        rwe->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);
        if (flags & ATTACH_EXISTING) {
            flushCaches(cls, __func__, [](Class c){
                // constant caches have been dealt with in prepareMethodLists
                // if the class still is constant here, it's fine to keep
                return !c->cache.isConstantOptimizedCache();
            });
        }
    }
// . ..

}

Step into methods.attachLists:

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            array_t *newArray = (array_t *)malloc(array_t::byteSize(newCount));
            newArray->count = newCount;
            array()->count = newCount;

            for (int i = oldCount - 1; i >= 0; i--)
                newArray->lists[i + addedCount] = array()->lists[i];
            for (unsigned i = 0; i < addedCount; i++)
                newArray->lists[i] = addedLists[i];
            free(array());
            setArray(newArray);
            validate();
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
            validate();
        } 
        else {
            // 1 list -> many lists
            Ptr<List> oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            for (unsigned i = 0; i < addedCount; i++)
                array()->lists[i] = addedLists[i];
            validate();
        }
    }

Start with the shortest branch:

  // 0 lists -> 1 list
  list = addedLists[0];
  validate();

When exactly one list is being added and list has no value yet, list = addedLists[0].
Next, analyze the else branch:

    // 1 list -> many lists
   Ptr<List> oldList = list; 
   uint32_t oldCount = oldList ? 1 : 0; // oldList has data => oldCount = 1
   uint32_t newCount = oldCount + addedCount; // suppose addedCount = 2, then newCount = 1 + 2 = 3
   setArray((array_t *)malloc(array_t::byteSize(newCount))); // allocate a new array
   array()->count = newCount; // the new array has room for 3 lists
   if (oldList) array()->lists[addedCount] = oldList; // put oldList at lists[2]
   for (unsigned i = 0; i < addedCount; i++)
        array()->lists[i] = addedLists[i]; // fill lists starting at index 0; addedCount = 2 lists are copied in
   validate();

In short: the old list goes, as a whole, to the end of the new array, and the added lists go to the front.
Attaching again after that takes the if (hasArray()) branch, so analyze if (hasArray()) next:

// many lists -> many lists
    uint32_t oldCount = array()->count;
    uint32_t newCount = oldCount + addedCount;
    array_t *newArray = (array_t *)malloc(array_t::byteSize(newCount)); // allocate a new block of memory
    newArray->count = newCount;
    array()->count = newCount;

    for (int i = oldCount - 1; i >= 0; i--) // reverse order; suppose oldCount = 3 and addedCount = 1, so i starts at 2
        newArray->lists[i + addedCount] = array()->lists[i]; // first iteration: newArray->lists[2 + 1 = 3] = lists[2], i.e. the last old list goes to the end of the new array
    for (unsigned i = 0; i < addedCount; i++)
        newArray->lists[i] = addedLists[i]; // put the added lists at the front of the new array
    free(array());  // free the old array
    setArray(newArray);
    validate();

In short: the existing lists are copied, starting from the last one, to the end of the new array, and the added lists go to the front.
PS: apologies for the rough diagrams.
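
The net effect of the three branches can be sketched in a few lines of C++. This is a toy model under stated assumptions, not the runtime's list_array_tt: ToyListArray is a hypothetical name, each method list is reduced to a label, and the 0 -> 1, 1 -> many and many -> many cases are collapsed into one code path because they all produce the same ordering.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of the list container: index 0 is searched first.
struct ToyListArray {
    std::vector<std::string> lists;

    // Added lists go to the front, existing lists keep their order behind them.
    void attachLists(const std::vector<std::string>& added) {
        if (added.empty()) return;  // same early return as addedCount == 0
        std::vector<std::string> merged;
        merged.reserve(lists.size() + added.size());
        merged.insert(merged.end(), added.begin(), added.end());
        merged.insert(merged.end(), lists.begin(), lists.end());
        lists.swap(merged);
    }
};
```

Attaching "XKStudent" first and then "XKA" leaves the order XKA, XKStudent, which is the layout the walkthrough above arrives at: whatever was attached last sits at index 0.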

Debugging `attachCategories` dynamically

To prepare, create a category XKStudent (XKA), and implement load in both the main class and the category

@implementation XKStudent

- (void) doSomething {}
- (void) doSomething2 {}

+ (void) load {
    NSLog(@"loaded %s",__func__);
}

@end
// Category
@implementation XKStudent (XKA)

+ (void) load {
    NSLog(@"loaded %s",__func__);
}
- (void) doSomethingByCate {}
- (void) doSomething2ByCate {}

@end

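The log output is as follows

We can see that when the main class and the category both implement load, load_categories_nolock is called.
Next, find out where load_categories_nolock is called from.

The stack trace shows load_images -> loadAllCategories -> load_categories_nolock,
which finally reaches attachCategories.
This gives a conclusion for the case where the class and the category both implement load:
starting from _read_images, realizeClassWithoutSwift -> methodizeClass -> attachToClass runs but does not end up in attachCategories;
attachCategories is reached through load_images -> loadAllCategories -> load_categories_nolock -> attachCategories instead.
This also explains why attachCategories has two call sites (1. attachToClass, 2. load_categories_nolock).
Step into load_categories_nolock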

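Print cat to confirm it is the category under analysis, XKStudent (XKA)

Continue stepping

The condition is satisfied, so we enter attachCategories and arrive here 👇

Look at the line mlists[ATTACH_BUFSIZ - ++mcount] = mlist and print the result

mlist is stored at index 63 of mlists
Continue stepping and arrive here 👇

What is mlists + ATTACH_BUFSIZ - mcount? Print it

So what gets passed to attachLists is a pointer to a pointer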
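
The indexing trick is easier to see in isolation. Here is a minimal C++ sketch, assuming a buffer of 64 slots (consistent with the index 63 seen in the debugger); ToyMethodList, ToyBuffer and kBufSize are hypothetical names, and this is only an illustration of the two expressions, not the runtime's code.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kBufSize = 64;  // assumed to match ATTACH_BUFSIZ

// Stand-in for a method list; only its address matters here.
struct ToyMethodList { const char* owner; };

struct ToyBuffer {
    const ToyMethodList* slots[kBufSize] = {};
    uint32_t count = 0;

    // Same indexing as mlists[ATTACH_BUFSIZ - ++mcount] = mlist:
    // the first list lands in slot 63, the next in slot 62, and so on.
    void push(const ToyMethodList* list) {
        assert(count < kBufSize);
        slots[kBufSize - ++count] = list;
    }

    // Same expression as mlists + ATTACH_BUFSIZ - mcount:
    // the address of the first used slot, i.e. a pointer to a pointer.
    const ToyMethodList* const* begin() const {
        return slots + kBufSize - count;
    }
};
```

Because the buffer is filled from the back, begin()[0] is always the list pushed most recently, and begin() together with count describes exactly the used slots.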
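Stepping further in the debugger, we enter attachLists again

By now we know addedLists is a double pointer and addedCount = 1
Continue stepping and arrive at the else branch 👇

The list here should belong to the main class, XKStudent. Print it to verify

And what does array()->lists[addedCount] hold?

Print array()->lists[addedCount]

Print addedLists

So each array()->lists[i] holds a pointer, and the address it points to is a method_list_t. It follows that array()->lists[0] points to the category's method list 👇

attachLists also has an if (hasArray()) {} branch. When does the debugger get there?
A guess: will defining several categories get us there?

To prepare, create 3 categories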

//  XKA
@interface XKStudent (XKA)

- (void) doSomethingByCateA;
- (void) doSomething2ByCateA;

@end
//  XKB
@interface XKStudent (XKB)

- (void) doSomethingByCateB;
- (void) doSomething2ByCateB;

@end
//  XKC
@interface XKStudent (XKC)

- (void) doSomethingByCateC;
- (void) doSomething2ByCateC;

@end

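Debug into load_categories_nolock

Print count and catlist[i] to confirm they are what we want to analyze

count = 3 means there are 3 categories: catlist[0] = XKC, catlist[1] = XKA, catlist[2] = XKB
Continue into attachLists

oldCount = 2 means array()->lists already holds 2 entries. Print them to verify

Continue stepping

Print newArray->lists

The output shows that the old list pointers were copied, last to first, into the tail of the new array, and the added lists were then placed at index 0
The conclusion: once the array already holds the main class's list and at least one category's list, the if (hasArray()) branch is taken, and the new category is again inserted at index 0.

All of the analysis 👆 assumed that the main class and the categories all implement load. What happens when that is not the case? Let's keep exploring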

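Exploring load

1. Main class has load, category has no load

Debugging log 👇

According to the log, attachCategories is not entered

2. Main class has no load, category has load

Debugging log 👇

According to the log, attachCategories is not entered either

3. Main class has no load, category has no load

Debugging log 👇

realizeClassWithoutSwift is entered the first time a message is sent to the class

4. Main class has load, several categories, not all of which have load

The flow is realizeClassWithoutSwift -> methodizeClass -> attachToClass -> attachCategories

5. Main class has no load, several categories, not all of which have load

The flow is realizeClassWithoutSwift -> methodizeClass -> attachToClass -> attachCategories
So even though the main class has no load, once some of its several categories implement load we still end up in attachCategories. It feels like the class is forced into it

If attachCategories is not entered, when does the category data actually get loaded into the class?

Step into realizeClassWithoutSwift and check case by case
1: Main class has load, category has no load
- Set a breakpoint and print the methods in ro
2: Main class has no load, category has load
- Set a breakpoint and print the methods in ro
3: Main class has no load, category has no load
- Set a breakpoint and print the methods in ro

To sum up: in cases 1, 2 and 3 the category data is in data() from the very start. It has already been read from disk; data() is loaded straight from the Mach-O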

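Summary

Today we used realizeClassWithoutSwift as the entry point and explored when categories are actually loaded.
When the main class and the category both implement load, a rather laborious loading flow is taken
(_read_images -> realizeClassWithoutSwift -> methodizeClass -> attachToClass
load_images -> loadAllCategories -> load_categories_nolock -> attachCategories)
whose purpose is to add the array of list pointers to rwe.
When the main class and the category do not both implement load, the data is read straight from the Mach-O into data(), and the rwe data is then read from data().
The main takeaway: during development, try not to implement load in both the main class and its categories.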

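Additional notes

ro rw rwe

Do you have the same ❓: what exactly are ro, rw and rwe, and why does rwe exist?

I explained this earlier in iOS Internals Study Notes 5, and the WWDC 2020 video covers it as well. Here is a recap.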

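Name	Meaning	Description	Source
ro	`read only`	Does not change at runtime; known as `clean memory`	Read from `disk`
rw	`read write`	New data is written to it at runtime, which makes it `expensive`; known as `dirty memory`	A `copy` of `ro`
rwe	Extension of `rw`	`rw` memory is `expensive`; `rwe` optimizes `rw`	`rw`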

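How does rwe optimize rw?

The Objective-C runtime lets categories and the low-level runtime APIs add to a class dynamically, which grows rw, and rw is expensive. It also turns out that not every class needs categories or the dynamic APIs, so that memory should only be allocated when it is actually needed. That is why rwe was introduced.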
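
The "allocate only when needed" idea can be sketched as follows. This is a simplified illustration, not the runtime's implementation: ToyRO, ToyRWExt and ToyRW are hypothetical names, methods are plain strings, and a unique_ptr stands in for the pointer union that the real ext() and extAllocIfNeeded() shown earlier work with.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct ToyRO {                       // stands in for class_ro_t (read only)
    std::vector<std::string> baseMethods;
};

struct ToyRWExt {                    // stands in for class_rw_ext_t
    const ToyRO* ro;
    std::vector<std::string> methods;
};

class ToyRW {                        // stands in for class_rw_t
    const ToyRO* ro_;
    std::unique_ptr<ToyRWExt> ext_;  // stays null until something needs to write
public:
    explicit ToyRW(const ToyRO* ro) : ro_(ro) {}

    // Like ext(): may return null, which is why callers check if (rwe).
    ToyRWExt* ext() const { return ext_.get(); }

    // Like extAllocIfNeeded(): create the extension on first use,
    // seeded with the methods from ro.
    ToyRWExt* extAllocIfNeeded() {
        if (!ext_) ext_.reset(new ToyRWExt{ro_, ro_->baseMethods});
        return ext_.get();
    }

    // Readers fall back to ro when no extension exists.
    const std::vector<std::string>& methods() const {
        return ext_ ? ext_->methods : ro_->baseMethods;
    }
};
```

A class that never gets a category attached at runtime never calls extAllocIfNeeded in this sketch, so it never pays for the extra writable storage.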

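Exploring cls->data()

In realizeClassWithoutSwift there is auto ro = (const class_ro_t *)cls->data();
So why can cls->data() simply be cast to ro?

Run the program under the debugger and look at what cls->data() really holds. The breakpoint lands here 👇

Step into data()

Use p to print the data pointer held in bits, and x/8gx to dump cls 👇
data() turns out to be an address pointer, stored in the 5th 8-byte slot of cls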
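
Why the 5th slot? A small C++ sketch of the layout makes the arithmetic visible. It assumes a 64-bit platform where isa and superclass take 8 bytes each and the cache takes 16 bytes; ToyCache, ToyObjcClass and kToyDataMask are hypothetical names, and the mask value is only a placeholder for illustration, since the real mask differs between platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the leading fields of a class object on 64-bit.
struct ToyCache {
    uint64_t word0;  // 16 bytes in total, contents do not matter here
    uint64_t word1;
};

struct ToyObjcClass {
    uint64_t isa;         // slot 1
    uint64_t superclass;  // slot 2
    ToyCache cache;       // slots 3 and 4
    uint64_t bits;        // slot 5: holds the data pointer plus a few flag bits
};

// Placeholder mask (an assumption): keeps the pointer bits, drops the low flag bits.
constexpr uint64_t kToyDataMask = 0x00007ffffffffff8ULL;

// 1-based index of the 8-byte slot that holds bits.
inline std::size_t bitsSlotIndex() {
    return offsetof(ToyObjcClass, bits) / sizeof(uint64_t) + 1;
}

// Extract the data pointer from bits by masking off the flag bits.
inline uint64_t dataPointer(const ToyObjcClass& cls) {
    return cls.bits & kToyDataMask;
}
```

With this layout bits sits at byte offset 32, so in an x/8gx dump it is the fifth 8-byte word, which is where the address printed for data() shows up.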

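So why can it be cast to the class_ro_t struct?

The main reason is that the compiler (LLVM) emits the class data in exactly the layout of class_ro_t. As long as the memory behind the pointer matches the layout of the struct, the pointer can be assigned to a pointer of that struct type.
Here is another example 🌰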

int main(int argc, const char * argv[]) {
    // call the runtime API to fetch a Method
    Method m = class_getInstanceMethod(XKStudent.class, @selector(teacherSay));
}

Output 👇

Find the Method source and define our own struct that mirrors Method

struct objc_method {
    SEL _Nonnull method_name;
    char * _Nullable method_types;
    IMP _Nonnull method_imp;
};
int main(int argc, const char * argv[]) {
     struct objc_method *method =  class_getInstanceMethod(XKStudent.class, @selector(teacherSay));
}

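Output 👇

So for one and the same pointer address, once its internal layout is known, it can be assigned to a pointer of that struct type
Two takeaways:
- 1. A pointer address can be used to read values
- 2. A pointer address can also be assigned to a pointer to a struct with the same layout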
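
The same idea in a form that runs without the Objective-C runtime: a minimal C++ sketch in which a "library" hands out an opaque address and the client reads it through a struct of its own that has the same field order and types. Everything here (the lib namespace, getMethod, MyMethod, readAs, the type string) is hypothetical and made up for the illustration. Unlike the direct pointer assignment above, the sketch copies the bytes with memcpy, which keeps it within C++'s aliasing rules while showing the same point.

```cpp
#include <cassert>
#include <cstring>

// "Library side": the real layout, normally hidden from the client.
namespace lib {
struct Method {
    const char* name;
    const char* types;
    void (*imp)();
};
inline void teacherSayImp() {}
inline const void* getMethod() {
    static const Method m = {"teacherSay", "v16@0:8", &teacherSayImp};
    return &m;  // handed out as an opaque address
}
}  // namespace lib

// "Client side": our own struct with the same field order and types.
struct MyMethod {
    const char* method_name;
    const char* method_types;
    void (*method_imp)();
};

static_assert(sizeof(MyMethod) == sizeof(lib::Method),
              "the two layouts must match for this to make sense");

// Interpret the bytes at an opaque address as a MyMethod.
inline MyMethod readAs(const void* opaque) {
    MyMethod out;
    std::memcpy(&out, opaque, sizeof out);
    return out;
}
```

If the field order or types in MyMethod did not match the hidden layout, the same bytes would be read as the wrong fields, which is why the layouts have to agree.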