load and initialize

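How load and initialize get called is a well-worn topic; this post briefly summarizes how the two differ and how they relate.
The order in which load methods run follows the order in which dyld loads Mach-O images; within an image, the position of a particular class is decided by the order established in prepare_load_methods.
initialize is invoked through objc_msgSend, so it behaves much like an ordinary method call. The difference is that the runtime checks once, before any other method of the class runs, whether the class has been initialized, and sends initialize if it has not.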

1. Mach-O

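Unix systems standardized on ELF as a portable binary format, but Apple did not adopt it and instead maintains Mach-Object (Mach-O for short), a format inherited from NeXTSTEP.
This does not mean Apple ignores the POSIX specification: POSIX is generally about source-level portability and does not mandate a binary format. Two topics for further reading:
Mach-O
dyld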

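This post does not cover them in depth. The key point is that all the code we write is compiled into a Mach-O file and then loaded by dyld, and the order of load calls is decided during that loading.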

2. How load is called

We create a class Man that inherits from Person:

@interface Person : NSObject

@end
@implementation Person
+ (void)load{
    NSLog(@"%s",__func__);
}

- (void)dealloc{
    NSLog(@"%s",__func__);
}

@end


@interface Man : Person

@end
@implementation Man
+ (void)load{
    NSLog(@"%s",__func__);
}
+(void)initialize{
    NSLog(@"%s",__func__);
}

- (void)dealloc{
    NSLog(@"%s",__func__);
}

@end

@interface Person (catergory)

@end
@implementation Person (catergory)
+ (void)load{
    NSLog(@"%s",__func__);
}

@end

@interface Man(catorg)

@end

@implementation Man(catorg)
+ (void)load{
    NSLog(@"%s",__func__);
}
+(void)initialize{
    NSLog(@"%s",__func__);
}

@end

Setting a breakpoint in Man's load method shows the call stack:

[Figure: call stack at the breakpoint in +[Man load] (image.png)]

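Everything from _dyld_start to dyld::notifySingle is dyld preparing the Mach-O image. Whenever a new image is loaded, dyld invokes load_images, a callback registered with the image loader. load_images receives the image's path and Mach-O header, schedules the load methods of the classes and categories in that image, and then calls them. The code, with added comments: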

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    // path is the path of the Mach-O image; mh is its Mach-O header
    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;

    recursive_mutex_locker_t lock(loadMethodLock);

    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        // This step schedules the +load methods
        prepare_load_methods((const headerType *)mh);
    }

    // Call +load methods (without runtimeLock - re-entrant)
    // This step calls the +load methods
    call_load_methods();
}

Setting the rest aside and looking only at the load-related code: load_images first calls prepare_load_methods to schedule the load methods, then calls call_load_methods to invoke them.

2.1 Scheduling load methods

prepare_load_methods collects the load methods ahead of the calls. Its implementation:

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertLocked();
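    // Get the non-lazy classes of the current image (the ones that implement +load)
    // classref_t is a class reference that has not been remapped yet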
    classref_t const *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    // Add the address of each class's +load method to the list
    for (i = 0; i < count; i++) {
        // remapClass redirects to the class's current address; the details are worth a read
        schedule_class_load(remapClass(classlist[i]));
    }
    // Handle the categories' +load methods
    category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        if (cls->isSwiftStable()) {
            _objc_fatal("Swift class extensions and categories on Swift "
                        "classes are not allowed to have +load methods");
        }
        realizeClassWithoutSwift(cls, nil);
        ASSERT(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }
}

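As the code shows, scheduling starts from the image's non-lazy class list and walks it in a loop. The work has two parts: scheduling the load methods of the classes themselves, and scheduling the load methods of categories.

2.1.1 Scheduling a class's own load method

Start with schedule_class_load: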

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    ASSERT(cls->isRealized());  // _read_images should realize

    if (cls->data()->flags & RW_LOADED) return;

    // Ensure superclass-first ordering
    schedule_class_load(cls->superclass);

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}
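
The superclass-first recursion in schedule_class_load can be modeled in a few lines of plain C. This is only a sketch under stated assumptions: ToyClass, toy_loadable, and the toy_ functions are hypothetical stand-ins invented for illustration, not runtime APIs, and the fixed-size array replaces the runtime's growable list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a runtime class. */
typedef struct ToyClass {
    const char *name;
    struct ToyClass *superclass;
    bool has_load;    /* does the class implement +load? */
    bool scheduled;   /* plays the role of RW_LOADED */
} ToyClass;

#define TOY_MAX_LOADABLE 16
/* Plays the role of loadable_classes. */
static const char *toy_loadable[TOY_MAX_LOADABLE];
static size_t toy_loadable_used = 0;

/* Same shape as schedule_class_load: superclass first, then the class. */
static void toy_schedule_class_load(ToyClass *cls)
{
    if (cls == NULL) return;
    if (cls->scheduled) return;

    toy_schedule_class_load(cls->superclass);

    /* Classes without +load are skipped but still marked as scheduled. */
    if (cls->has_load) {
        assert(toy_loadable_used < TOY_MAX_LOADABLE);
        toy_loadable[toy_loadable_used++] = cls->name;
    }
    cls->scheduled = true;
}

/* Schedules Man before Person on purpose; returns how many were queued. */
static size_t toy_demo_schedule(void)
{
    static ToyClass object = { "NSObject", NULL, false, false };
    static ToyClass person = { "Person", &object, true, false };
    static ToyClass man = { "Man", &person, true, false };

    toy_loadable_used = 0;
    object.scheduled = person.scheduled = man.scheduled = false;

    toy_schedule_class_load(&man);     /* pulls in Person first */
    toy_schedule_class_load(&person);  /* already scheduled: no-op */
    return toy_loadable_used;
}
```

Even though Man is scheduled first in the demo, Person ends up ahead of it in the list, which is the ordering guarantee the recursion exists to provide.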

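A class's own load method is scheduled by schedule_class_load, which is recursive: it always schedules the superclass before the class itself, which guarantees that a superclass enters the list ahead of its subclasses. add_class_to_loadable_list uses a global array to store each class whose load method needs to be called, together with the IMP of that method. The code is shown below without further commentary.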

void add_class_to_loadable_list(Class cls)
{
    IMP method;

    loadMethodLock.assertLocked();

    method = cls->getLoadMethod();
    if (!method) return;  // Don't bother if cls has no +load method
    
    if (PrintLoading) {
        _objc_inform("LOAD: class '%s' scheduled for +load", 
                     cls->nameForLogging());
    }
    
    if (loadable_classes_used == loadable_classes_allocated) {
        loadable_classes_allocated = loadable_classes_allocated*2 + 16;
        loadable_classes = (struct loadable_class *)
            realloc(loadable_classes,
                              loadable_classes_allocated *
                              sizeof(struct loadable_class));
    }
    
    loadable_classes[loadable_classes_used].cls = cls;
    loadable_classes[loadable_classes_used].method = method;
    loadable_classes_used++;
}
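
The growth rule used by the global array (allocated*2 + 16) is easy to try in isolation. The following is a minimal sketch, not the runtime's code: ToyLoadList, toy_list_add, and toy_capacity_after are hypothetical names, and a plain function pointer stands in for the IMP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct loadable_class and its global list. */
typedef struct {
    const char *cls;
    void (*method)(void);
} ToyLoadable;

typedef struct {
    ToyLoadable *items;
    int used;
    int allocated;
} ToyLoadList;

/* Appends an entry, growing the array with the same n*2 + 16 rule. */
static void toy_list_add(ToyLoadList *list, const char *cls, void (*method)(void))
{
    if (method == NULL) return;  /* nothing to store without a +load method */

    if (list->used == list->allocated) {
        int grown_size = list->allocated * 2 + 16;
        ToyLoadable *grown = realloc(list->items,
                                     (size_t)grown_size * sizeof(ToyLoadable));
        assert(grown != NULL);
        list->items = grown;
        list->allocated = grown_size;
    }

    list->items[list->used].cls = cls;
    list->items[list->used].method = method;
    list->used++;
}

static void toy_noop_load(void) {}

/* Returns the capacity of a fresh list after `count` additions. */
static int toy_capacity_after(int count)
{
    ToyLoadList list = { NULL, 0, 0 };
    int capacity;

    for (int i = 0; i < count; i++) {
        toy_list_add(&list, "SomeClass", toy_noop_load);
    }
    capacity = list.allocated;
    free(list.items);
    return capacity;
}
```

Starting from an empty list the capacity goes 0, 16, 48, 112, so the first +load in a process already reserves room for sixteen entries.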

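2.1.2 Scheduling a category's load method

Scheduling a category's load method follows largely the same steps as for a class, with one addition: realizeClassWithoutSwift makes sure the category's class is realized first. The scheduling itself is done by add_category_to_loadable_list: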

void add_category_to_loadable_list(Category cat)
{
    IMP method;

    loadMethodLock.assertLocked();

    method = _category_getLoadMethod(cat);

    // Don't bother if cat has no +load method
    if (!method) return;

    if (PrintLoading) {
        _objc_inform("LOAD: category '%s(%s)' scheduled for +load", 
                     _category_getClassName(cat), _category_getName(cat));
    }
    
    if (loadable_categories_used == loadable_categories_allocated) {
        loadable_categories_allocated = loadable_categories_allocated*2 + 16;
        loadable_categories = (struct loadable_category *)
            realloc(loadable_categories,
                              loadable_categories_allocated *
                              sizeof(struct loadable_category));
    }

    loadable_categories[loadable_categories_used].cat = cat;
    loadable_categories[loadable_categories_used].method = method;
    loadable_categories_used++;
}

IMP 
_category_getLoadMethod(Category cat)
{
    runtimeLock.assertLocked();

    const method_list_t *mlist;

    mlist = cat->classMethods;
    if (mlist) {
        for (const auto& meth : *mlist) {
            const char *name = sel_cname(meth.name);
            if (0 == strcmp(name, "load")) {
                return meth.imp;
            }
        }
    }

    return nil;
}

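This also fetches the method and stores its IMP in a list. The implementation is almost identical to add_class_to_loadable_list, but category load methods go into loadable_categories, which is a different "array" from the one used for classes. It follows that the order of category load calls depends on the order of the files and has nothing to do with the class hierarchy.

So far we have three conclusions: 1. When classes are scheduled, the superclass goes in first, then the class itself. 2. Categories are scheduled in file order. 3. Class load methods and category load methods are stored in separate arrays.

2.2 Calling load

Back in load_images: once prepare_load_methods has finished, call_load_methods starts calling the load methods. This part is simpler, because the load methods of classes, superclasses, and categories are already in the global "arrays" and only need to be taken out and called: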

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

There are again two steps: the classes' load methods are called first, then the categories'.
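
The two-step loop can be sketched in C to see the resulting call order for the sample classes. Everything here is a hypothetical model: the toy_ names are invented, the "calls" are just log entries, and the category order is an assumption of this sketch (in a real build it depends on file order).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CALLS 8
static const char *toy_call_log[TOY_MAX_CALLS];
static size_t toy_call_count = 0;

/* Already-scheduled entries; the order inside each list is an assumption. */
static const char *toy_classes[] = { "+[Person load]", "+[Man load]" };
static const char *toy_categories[] = {
    "+[Person(catergory) load]", "+[Man(catorg) load]"
};
static size_t toy_classes_used = 0;
static size_t toy_categories_used = 0;

static void toy_record(const char *what)
{
    assert(toy_call_count < TOY_MAX_CALLS);
    toy_call_log[toy_call_count++] = what;
}

static void toy_drain_class_loads(void)
{
    size_t used = toy_classes_used;
    toy_classes_used = 0;  /* detach the list before calling into it */
    for (size_t i = 0; i < used; i++) toy_record(toy_classes[i]);
}

static bool toy_drain_category_loads(void)
{
    size_t used = toy_categories_used;
    toy_categories_used = 0;
    for (size_t i = 0; i < used; i++) toy_record(toy_categories[i]);
    return false;  /* this toy never discovers new categories mid-call */
}

/* Same loop shape as call_load_methods; returns the number of calls made. */
static size_t toy_call_load_methods(void)
{
    bool more_categories;

    toy_call_count = 0;
    toy_classes_used = 2;
    toy_categories_used = 2;

    do {
        while (toy_classes_used > 0) toy_drain_class_loads();
        more_categories = toy_drain_category_loads();
    } while (toy_classes_used > 0 || more_categories);

    return toy_call_count;
}
```

The outer do/while matters in the real runtime because a +load method can cause more classes or categories to be scheduled; in this simplified model it runs exactly once.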

2.2.1 Calling a class's own load method

The code:

static void call_class_loads(void)
{
   int i;
   
   // Detach current loadable list.
   struct loadable_class *classes = loadable_classes;
   int used = loadable_classes_used;
   loadable_classes = nil;
   loadable_classes_allocated = 0;
   loadable_classes_used = 0;
   
   // Call all +loads for the detached list.
   for (i = 0; i < used; i++) {
       Class cls = classes[i].cls;
       load_method_t load_method = (load_method_t)classes[i].method;
       if (!cls) continue; 

       if (PrintLoading) {
           _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
       }
       (*load_method)(cls, @selector(load));
   }
   
   // Destroy the detached list.
   if (classes) free(classes);
}

As the code shows, calling load simply means taking the stored address of the load method and calling it directly.
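
In C terms, "calling it directly" is an ordinary call through a stored function pointer, with no lookup by name. A small sketch, assuming hypothetical names throughout (toy_load_fn stands in for load_method_t, and the class is reduced to its name):

```c
#include <assert.h>
#include <stddef.h>

/* Stands in for load_method_t; the class is reduced to a name here. */
typedef void (*toy_load_fn)(const char *cls_name);

typedef struct {
    const char *cls_name;
    toy_load_fn method;
} ToyLoadableClass;

static int person_load_calls = 0;
static int man_load_calls = 0;

static void person_load(const char *cls_name) { (void)cls_name; person_load_calls++; }
static void man_load(const char *cls_name) { (void)cls_name; man_load_calls++; }

/* Walks the detached list and calls each stored pointer as-is. */
static int toy_invoke_class_loads(const ToyLoadableClass *classes, size_t used)
{
    int called = 0;
    for (size_t i = 0; i < used; i++) {
        if (classes[i].cls_name == NULL) continue;
        classes[i].method(classes[i].cls_name);  /* direct call, no method lookup */
        called++;
    }
    return called;
}

static int toy_demo_direct_calls(void)
{
    const ToyLoadableClass classes[] = {
        { "Person", person_load },
        { "Man", man_load },
    };
    person_load_calls = 0;
    man_load_calls = 0;
    return toy_invoke_class_loads(classes, 2);
}
```

Because each entry carries its own pointer, every stored implementation runs exactly once and nothing can shadow anything else.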

2.2.2 Calling a category's load method

The code:

static bool call_category_loads(void)
{
    int i, shift;
    bool new_categories_added = NO;
    
    // Detach current loadable list.
    // 1. Take the current list of loadable categories
    struct loadable_category *cats = loadable_categories;
    int used = loadable_categories_used;
    int allocated = loadable_categories_allocated;
    loadable_categories = nil;
    loadable_categories_allocated = 0;
    loadable_categories_used = 0;

    // Call all +loads for the detached list.
    // 2. If the category's class is loadable (cls && cls->isLoadable()), call the category's load method
    for (i = 0; i < used; i++) {
        Category cat = cats[i].cat;
        load_method_t load_method = (load_method_t)cats[i].method;
        Class cls;
        if (!cat) continue;

        cls = _category_getClass(cat);
        if (cls  &&  cls->isLoadable()) {
            if (PrintLoading) {
                _objc_inform("LOAD: +[%s(%s) load]\n", 
                             cls->nameForLogging(), 
                             _category_getName(cat));
            }
            (*load_method)(cls, @selector(load));
            cats[i].cat = nil;
        }
    }

    // 3. Remove every category that was called from the list
    // Compact detached list (order-preserving)
    shift = 0;
    for (i = 0; i < used; i++) {
        if (cats[i].cat) {
            cats[i-shift] = cats[i];
        } else {
            shift++;
        }
    }
    used -= shift;
    // 4. Reallocate memory for loadable_categories as needed and reset its value
    // Copy any new +load candidates from the new list to the detached list.
    new_categories_added = (loadable_categories_used > 0);
    for (i = 0; i < loadable_categories_used; i++) {
        if (used == allocated) {
            allocated = allocated*2 + 16;
            cats = (struct loadable_category *)
                realloc(cats, allocated *
                                  sizeof(struct loadable_category));
        }
        cats[used++] = loadable_categories[i];
    }

    // Destroy the new list.
    if (loadable_categories) free(loadable_categories);

    // Reattach the (now augmented) detached list. 
    // But if there's nothing left to load, destroy the list.
    if (used) {
        loadable_categories = cats;
        loadable_categories_used = used;
        loadable_categories_allocated = allocated;
    } else {
        if (cats) free(cats);
        loadable_categories = nil;
        loadable_categories_used = 0;
        loadable_categories_allocated = 0;
    }

    if (PrintLoading) {
        if (loadable_categories_used != 0) {
            _objc_inform("LOAD: %d categories still waiting for +load\n",
                         loadable_categories_used);
        }
    }

    return new_categories_added;
}

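The first half follows the same idea, but the category version adds a few steps:
1. Take the current list of loadable categories.
2. If the category's class is loadable (cls && cls->isLoadable()), call the category's load method.
3. Remove every category that was called from the loadable_categories list.
4. Reallocate memory for loadable_categories as needed and reset its value.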
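
Steps 2 and 3 above can be sketched as a call pass followed by an order-preserving compaction. This is a simplified model with hypothetical names (ToyLoadableCategory, toy_call_and_compact, toy_waiting); the actual +load call is replaced by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct loadable_category. */
typedef struct {
    const char *cat;         /* NULL once the category's +load has run */
    bool class_is_loadable;  /* stands in for cls && cls->isLoadable() */
} ToyLoadableCategory;

/* Calls what can be called, then compacts the rest to the front without
 * reordering it. Returns how many categories are still waiting. */
static int toy_call_and_compact(ToyLoadableCategory *cats, int used)
{
    int i;
    int shift = 0;

    for (i = 0; i < used; i++) {
        if (cats[i].cat != NULL && cats[i].class_is_loadable) {
            /* the runtime would call the category's +load here */
            cats[i].cat = NULL;
        }
    }

    for (i = 0; i < used; i++) {
        if (cats[i].cat != NULL) {
            cats[i - shift] = cats[i];
        } else {
            shift++;
        }
    }
    return used - shift;
}

static ToyLoadableCategory toy_waiting[4];

/* A and C can be called right away; B and D must keep waiting. */
static int toy_demo_compact(void)
{
    const ToyLoadableCategory initial[4] = {
        { "A", true }, { "B", false }, { "C", true }, { "D", false },
    };
    for (int i = 0; i < 4; i++) toy_waiting[i] = initial[i];
    return toy_call_and_compact(toy_waiting, 4);
}
```

Keeping the survivors in their original order means a category that had to wait is still called in its original relative position once its class becomes loadable.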

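Summary for load: 1. The order of load calls follows the order in which classes are loaded. 2. Category load methods and class load methods are kept in separate collections. 3. When both a superclass and a subclass implement load, the superclass's is called first, because it is scheduled first. 4. load is called directly through its IMP pointer, without going through message sending.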

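3. How initialize is called

Setting a breakpoint on initialize in Man gives roughly this call stack:

[Figure: call stack at the breakpoint in +[Man initialize] (image.png)]

Creating an object starts with a call to alloc, so alloc is the first method in my stack. Following the call order, here is the source: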

static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
#if __OBJC2__
    if (slowpath(checkNil && !cls)) return nil;
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        return _objc_rootAllocWithZone(cls, nil);
    }
#endif

    // No shortcuts available.
    if (allocWithZone) {
        return ((id(*)(id, SEL, struct _NSZone *))objc_msgSend)(cls, @selector(allocWithZone:), nil);
    }
    return ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc));
}

IMP lookUpImpOrForward(id inst, SEL sel, Class cls, int behavior)
{
    const IMP forward_imp = (IMP)_objc_msgForward_impcache;
    IMP imp = nil;
    Class curClass;

    runtimeLock.assertUnlocked();

    // Optimistic cache lookup
    if (fastpath(behavior & LOOKUP_CACHE)) {
        imp = cache_getImp(cls, sel);
        if (imp) goto done_nolock;
    }

    // runtimeLock is held during isRealized and isInitialized checking
    // to prevent races against concurrent realization.

    // runtimeLock is held during method search to make
    // method-lookup + cache-fill atomic with respect to method addition.
    // Otherwise, a category could be added but ignored indefinitely because
    // the cache was re-filled with the old value after the cache flush on
    // behalf of the category.

    runtimeLock.lock();

    // We don't want people to be able to craft a binary blob that looks like
    // a class but really isn't one and do a CFI attack.
    //
    // To make these harder we want to make sure this is a class that was
    // either built into the binary or legitimately registered through
    // objc_duplicateClass, objc_initializeClassPair or objc_allocateClassPair.
    //
    // TODO: this check is quite costly during process startup.
    checkIsKnownClass(cls);

    if (slowpath(!cls->isRealized())) {
        cls = realizeClassMaybeSwiftAndLeaveLocked(cls, runtimeLock);
        // runtimeLock may have been dropped but is now locked again
    }

    if (slowpath((behavior & LOOKUP_INITIALIZE) && !cls->isInitialized())) {
        cls = initializeAndLeaveLocked(cls, inst, runtimeLock);
        // runtimeLock may have been dropped but is now locked again

        // If sel == initialize, class_initialize will send +initialize and 
        // then the messenger will send +initialize again after this 
        // procedure finishes. Of course, if this is not being called 
        // from the messenger then it won't happen. 2778172
    }

    runtimeLock.assertLocked();
    curClass = cls;

    // The code used to lookpu the class's cache again right after
    // we take the lock but for the vast majority of the cases
    // evidence shows this is a miss most of the time, hence a time loss.
    //
    // The only codepath calling into this without having performed some
    // kind of cache lookup is class_getInstanceMethod().

    for (unsigned attempts = unreasonableClassCount();;) {
        // curClass method list.
        Method meth = getMethodNoSuper_nolock(curClass, sel);
        if (meth) {
            imp = meth->imp;
            goto done;
        }

        if (slowpath((curClass = curClass->superclass) == nil)) {
            // No implementation found, and method resolver didn't help.
            // Use forwarding.
            imp = forward_imp;
            break;
        }

        // Halt if there is a cycle in the superclass chain.
        if (slowpath(--attempts == 0)) {
            _objc_fatal("Memory corruption in class list.");
        }

        // Superclass cache.
        imp = cache_getImp(curClass, sel);
        if (slowpath(imp == forward_imp)) {
            // Found a forward:: entry in a superclass.
            // Stop searching, but don't cache yet; call method
            // resolver for this class first.
            break;
        }
        if (fastpath(imp)) {
            // Found the method in a superclass. Cache it in this class.
            goto done;
        }
    }

    // No implementation found. Try method resolver once.

    if (slowpath(behavior & LOOKUP_RESOLVER)) {
        behavior ^= LOOKUP_RESOLVER;
        return resolveMethod_locked(inst, sel, cls, behavior);
    }

 done:
    log_and_fill_cache(cls, imp, sel, inst, curClass);
    runtimeLock.unlock();
 done_nolock:
    if (slowpath((behavior & LOOKUP_NIL) && imp == forward_imp)) {
        return nil;
    }
    return imp;
}

static Class initializeAndLeaveLocked(Class cls, id obj, mutex_t& lock)
{
    return initializeAndMaybeRelock(cls, obj, lock, true);
}
static Class initializeAndMaybeRelock(Class cls, id inst,
                                      mutex_t& lock, bool leaveLocked)
{
    lock.assertLocked();
    ASSERT(cls->isRealized());

    if (cls->isInitialized()) {
        if (!leaveLocked) lock.unlock();
        return cls;
    }

    // Find the non-meta class for cls, if it is not already one.
    // The +initialize message is sent to the non-meta class object.
    Class nonmeta = getMaybeUnrealizedNonMetaClass(cls, inst);

    // Realize the non-meta class if necessary.
    if (nonmeta->isRealized()) {
        // nonmeta is cls, which was already realized
        // OR nonmeta is distinct, but is already realized
        // - nothing else to do
        lock.unlock();
    } else {
        nonmeta = realizeClassMaybeSwiftAndUnlock(nonmeta, lock);
        // runtimeLock is now unlocked
        // fixme Swift can't relocate the class today,
        // but someday it will:
        cls = object_getClass(nonmeta);
    }

    // runtimeLock is now unlocked, for +initialize dispatch
    ASSERT(nonmeta->isRealized());
    initializeNonMetaClass(nonmeta);

    if (leaveLocked) runtimeLock.lock();
    return cls;
}
void initializeNonMetaClass(Class cls)
{
    ASSERT(!cls->isMetaClass());

    Class supercls;
    bool reallyInitialize = NO;

    // Make sure super is done initializing BEFORE beginning to initialize cls.
    // See note about deadlock above.
    supercls = cls->superclass;
    if (supercls  &&  !supercls->isInitialized()) {
        initializeNonMetaClass(supercls);
    }
    
    // Try to atomically set CLS_INITIALIZING.
    SmallVector<_objc_willInitializeClassCallback, 1> localWillInitializeFuncs;
    {
        monitor_locker_t lock(classInitLock);
        if (!cls->isInitialized() && !cls->isInitializing()) {
            cls->setInitializing();
            reallyInitialize = YES;

            // Grab a copy of the will-initialize funcs with the lock held.
            localWillInitializeFuncs.initFrom(willInitializeFuncs);
        }
    }
    
    if (reallyInitialize) {
        // We successfully set the CLS_INITIALIZING bit. Initialize the class.
        
        // Record that we're initializing this class so we can message it.
        _setThisThreadIsInitializingClass(cls);

        if (MultithreadedForkChild) {
            // LOL JK we don't really call +initialize methods after fork().
            performForkChildInitialize(cls, supercls);
            return;
        }
        
        for (auto callback : localWillInitializeFuncs)
            callback.f(callback.context, cls);

        // Send the +initialize message.
        // Note that +initialize is sent to the superclass (again) if 
        // this class doesn't implement +initialize. 2157218
        if (PrintInitializing) {
            _objc_inform("INITIALIZE: thread %p: calling +[%s initialize]",
                         objc_thread_self(), cls->nameForLogging());
        }

        // Exceptions: A +initialize call that throws an exception 
        // is deemed to be a complete and successful +initialize.
        //
        // Only __OBJC2__ adds these handlers. !__OBJC2__ has a
        // bootstrapping problem of this versus CF's call to
        // objc_exception_set_functions().
#if __OBJC2__
        @try
#endif
        {
            callInitialize(cls);

            if (PrintInitializing) {
                _objc_inform("INITIALIZE: thread %p: finished +[%s initialize]",
                             objc_thread_self(), cls->nameForLogging());
            }
        }
#if __OBJC2__
        @catch (...) {
            if (PrintInitializing) {
                _objc_inform("INITIALIZE: thread %p: +[%s initialize] "
                             "threw an exception",
                             objc_thread_self(), cls->nameForLogging());
            }
            @throw;
        }
        @finally
#endif
        {
            // Done initializing.
            lockAndFinishInitializing(cls, supercls);
        }
        return;
    }
    
    else if (cls->isInitializing()) {
        // We couldn't set INITIALIZING because INITIALIZING was already set.
        // If this thread set it earlier, continue normally.
        // If some other thread set it, block until initialize is done.
        // It's ok if INITIALIZING changes to INITIALIZED while we're here, 
        //   because we safely check for INITIALIZED inside the lock 
        //   before blocking.
        if (_thisThreadIsInitializingClass(cls)) {
            return;
        } else if (!MultithreadedForkChild) {
            waitForInitializeToComplete(cls);
            return;
        } else {
            // We're on the child side of fork(), facing a class that
            // was initializing by some other thread when fork() was called.
            _setThisThreadIsInitializingClass(cls);
            performForkChildInitialize(cls, supercls);
        }
    }
    
    else if (cls->isInitialized()) {
        // Set CLS_INITIALIZING failed because someone else already 
        //   initialized the class. Continue normally.
        // NOTE this check must come AFTER the ISINITIALIZING case.
        // Otherwise: Another thread is initializing this class. ISINITIALIZED 
        //   is false. Skip this clause. Then the other thread finishes 
        //   initialization and sets INITIALIZING=no and INITIALIZED=yes. 
        //   Skip the ISINITIALIZING clause. Die horribly.
        return;
    }
    
    else {
        // We shouldn't be here. 
        _objc_fatal("thread-safe class init in objc runtime is buggy!");
    }
}

void callInitialize(Class cls)
{
    ((void(*)(Class, SEL))objc_msgSend)(cls, @selector(initialize));
    asm("");
}

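Of all the code above, the part that matters is the last function, callInitialize: this is where initialize is invoked, and it is done by sending a message with objc_msgSend. The functions before it check whether the class has been realized and initialized and decide whether the call is needed.
initialize is simply a message send, so it follows the rules of message sending.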
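
One consequence of those rules, noted in the source comment inside initializeNonMetaClass, is that a class without its own +initialize receives its superclass's implementation. The sketch below models that in plain C; ToyCls, the toy_ functions, and the Base/Child/Grandchild hierarchy are all hypothetical, and a single flag replaces the runtime's separate initializing and initialized states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct ToyCls ToyCls;
typedef void (*toy_imp)(ToyCls *receiver);

struct ToyCls {
    const char *name;
    ToyCls *superclass;
    toy_imp initialize_imp;  /* NULL when the class has no +initialize */
    bool initialized;
};

#define TOY_MAX_SENDS 8
static const char *toy_impl_log[TOY_MAX_SENDS];      /* which implementation ran */
static const char *toy_receiver_log[TOY_MAX_SENDS];  /* which class received it */
static size_t toy_send_count = 0;

static void toy_log(const char *impl, ToyCls *receiver)
{
    assert(toy_send_count < TOY_MAX_SENDS);
    toy_impl_log[toy_send_count] = impl;
    toy_receiver_log[toy_send_count] = receiver->name;
    toy_send_count++;
}

static void base_initialize(ToyCls *receiver) { toy_log("Base", receiver); }
static void grandchild_initialize(ToyCls *receiver) { toy_log("Grandchild", receiver); }

/* Message-send style lookup: walk up until some class implements it. */
static toy_imp toy_lookup_initialize(ToyCls *cls)
{
    for (; cls != NULL; cls = cls->superclass) {
        if (cls->initialize_imp != NULL) return cls->initialize_imp;
    }
    return NULL;
}

static void toy_initialize_class(ToyCls *cls)
{
    toy_imp imp;

    if (cls == NULL || cls->initialized) return;
    toy_initialize_class(cls->superclass);  /* superclass first */

    cls->initialized = true;  /* simplification: one flag instead of two states */
    imp = toy_lookup_initialize(cls);
    if (imp != NULL) imp(cls);
}

static size_t toy_demo_initialize(void)
{
    static ToyCls base = { "Base", NULL, base_initialize, false };
    static ToyCls child = { "Child", &base, NULL, false };
    static ToyCls grandchild = { "Grandchild", &child, grandchild_initialize, false };

    toy_send_count = 0;
    base.initialized = child.initialized = grandchild.initialized = false;

    toy_initialize_class(&grandchild);
    toy_initialize_class(&grandchild);  /* already initialized: nothing is sent */
    return toy_send_count;
}
```

In the demo, Base's implementation runs twice, once for Base and once for Child, which is why +initialize implementations commonly guard on the receiving class.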

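That is roughly how load and initialize work under the hood. To summarize:

load runs before main, is called directly by address, and runs for both the class and its categories.
initialize normally runs after main has started, the first time the class receives a message; it is implemented with message sending and follows its rules, so a category's method overrides the class's own.
Both call the superclass's method first.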
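
Why a category's initialize wins can be shown with a first-match lookup over a method list. This is a sketch under a stated assumption: that category methods sit ahead of the class's own methods in the list that lookup walks. ToyMethod, toy_lookup, and toy_send are hypothetical names, and the implementations just return a label.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*toy_method_imp)(void);

typedef struct {
    const char *selector;
    toy_method_imp imp;
} ToyMethod;

static const char *man_initialize(void) { return "+[Man initialize]"; }
static const char *man_catorg_initialize(void) { return "+[Man(catorg) initialize]"; }

/* Assumption of this sketch: the category's methods come first. */
static const ToyMethod toy_man_methods[] = {
    { "initialize", man_catorg_initialize },
    { "initialize", man_initialize },
};

/* First match wins, so later entries with the same selector are shadowed. */
static toy_method_imp toy_lookup(const ToyMethod *methods, size_t count,
                                 const char *selector)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].selector, selector) == 0) return methods[i].imp;
    }
    return NULL;
}

/* Returns the label of the implementation that handled the message,
 * or NULL when nothing in the list responds to the selector. */
static const char *toy_send(const char *selector)
{
    size_t count = sizeof toy_man_methods / sizeof toy_man_methods[0];
    toy_method_imp imp = toy_lookup(toy_man_methods, count, selector);
    return imp != NULL ? imp() : NULL;
}
```

load avoids this shadowing only because the runtime stores and calls each load IMP separately instead of looking it up by selector.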

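A question to think about: given three classes Person, Animal, and Man : Person, implemented in that order, in what order are their load methods called?

Here is another well-written article: https://zhuanlan.zhihu.com/p/20816991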