关于load和initialize加载都是老生常谈的问题,这里简单总结一下他们俩的区别和联系。
load的加载顺序其实是依据mach-o中dyld加载器的顺序而定的,具体到某一类的时候,又是和prepare的顺序有关系。
initialize则是按照send_msg方式进行的,所以它和普通的方法调用区别不大,只不过会在一个类所有方法执行之前检测一次,如果没有执行过就执行。
1. Mach-O
Unix标准了一个可移植的二进制格式ELF但是苹果并没有实现它而是维护了一套NeXTSTEP的遗物 Mach-Object简称Mach-O。
但是这并不是说苹果不遵守POSXI规范,这个规范通常说的是源码级别的跨平台性,对于二进制则不强制要求。下面放两个参考链接
mach-0
dyld
这里不主要讲解,大概记住,咱们所有的自己写的代码会被编译成Mach-O文件,然后使用dyld进行装载,load的加载顺序就是在装载的时候决定的。
2. load调用。
我们在代码中创建一个类Man,继承自Person,代码如下:
@interface Person : NSObject
@end
@implementation Person
+ (void)load{
NSLog(@"%s",__func__);
}
- (void)dealloc{
NSLog(@"%s",__func__);
}
@end
@interface Man : Person
@end
@implementation Man
+ (void)load{
NSLog(@"%s",__func__);
}
+(void)initialize{
NSLog(@"%s",__func__);
}
- (void)dealloc{
NSLog(@"%s",__func__);
}
@end
@interface Person (catergory)
@end
@implementation Person (catergory)
+ (void)load{
NSLog(@"%s",__func__);
}
@end
@interface Man(catorg)
@end
@implementation Man(catorg)
+ (void)load{
NSLog(@"%s",__func__);
}
+(void)initialize{
NSLog(@"%s",__func__);
}
@end
通过在Man里面的load方法打断点,可以看到一些调用栈信息。如下图:
其实从_dyld_start 到 dyld::notifySingle方法都是dyld进行对mach-o镜像的加载准备工作,当有新的镜像(image)加载进来之后,就会走load_images(其实是imageLoader加载器的一个回调)方法,load_image传入一个image的列表,load_images里面循环调用,去初始化其中的类对象,加载load方法等。具体代码如下附带注释:
load_images(const char *path __unused, const struct mach_header *mh)
{
//path即是mach-o的一个路径,mh就是mach-0的头部信息
// Return without taking locks if there are no +load methods here.
if (!hasLoadMethods((const headerType *)mh)) return;
recursive_mutex_locker_t lock(loadMethodLock);
// Discover load methods
{
mutex_locker_t lock2(runtimeLock);
//这一步就是所谓的load装载工作
prepare_load_methods((const headerType *)mh);
}
// Call +load methods (without runtimeLock - re-entrant)
//这一步就是load调用
call_load_methods();
}
撇开其他的代码,我们只看load相关的代码。从上面代码中可以看出,当进行load_image的时候,首先是prepare_load_methods将load方法进行准备工作,然后调用call_load_methods进行调用。
2.1 load的装载
prepare_load_methods其实就是首先完成了load的方法的预装载,我们先进入prepare_load_methods中,实现代码如下:
void prepare_load_methods(const headerType *mhdr)
{
size_t count, i;
runtimeLock.assertLocked();
//获得当前image的类
//classref_t 其实就是还没有map到的class_t
classref_t const *classlist =
_getObjc2NonlazyClassList(mhdr, &count);
// 循环将类对象里的load方法地址装入列表
for (i = 0; i < count; i++) {
schedule_class_load(remapClass(classlist[i]));//这里remapClass其实就是重定向class的地址。有兴趣可以继续研究里面的代码。
}
//处理分类的方法
category_t * const *categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
for (i = 0; i < count; i++) {
category_t *cat = categorylist[i];
Class cls = remapClass(cat->cls);
if (!cls) continue; // category for ignored weak-linked class
if (cls->isSwiftStable()) {
_objc_fatal("Swift class extensions and categories on Swift "
"classes are not allowed to have +load methods");
}
realizeClassWithoutSwift(cls, nil);
ASSERT(cls->ISA()->isRealized());
add_category_to_loadable_list(cat);
}
}
从上面代码中可以看出load装载的时候,首先是拿到images中的所有类,循环将其进行装载。这里分为两部分,一是本类的方法装载,二是分类中load方法的装载。
2.1.1 本类load方法装载
先看代码schedule_class_load方法:
static void schedule_class_load(Class cls)
{
if (!cls) return;
ASSERT(cls->isRealized()); // _read_images should realize
if (cls->data()->flags & RW_LOADED) return;
// Ensure superclass-first ordering
schedule_class_load(cls->superclass);
add_class_to_loadable_list(cls);
cls->setInfo(RW_LOADED);
}
本类方法装载其实就是使用schedule_class_load 方法进行加载,schedule_class_load其实是个迭代调用,每次总是先将superclass传入进行装载,这里就确定了superclass比class先行进如装载空间内。至于add_class_to_loadable_list方法 其实里面是使用了一个全局数组来存储当前需要加载的load方法的类和load方法的method对象。代码如下,不做具体讲解了。
void add_class_to_loadable_list(Class cls)
{
IMP method;
loadMethodLock.assertLocked();
method = cls->getLoadMethod();
if (!method) return; // Don't bother if cls has no +load method
if (PrintLoading) {
_objc_inform("LOAD: class '%s' scheduled for +load",
cls->nameForLogging());
}
if (loadable_classes_used == loadable_classes_allocated) {
loadable_classes_allocated = loadable_classes_allocated*2 + 16;
loadable_classes = (struct loadable_class *)
realloc(loadable_classes,
loadable_classes_allocated *
sizeof(struct loadable_class));
}
loadable_classes[loadable_classes_used].cls = cls;
loadable_classes[loadable_classes_used].method = method;
loadable_classes_used++;
}
2.1.2 分类load方法装载
分类装载load方法的步骤整体和本类差不多,只不过增加了一步类初始化的判断realizeClassWithoutSwift,分类装载load方法主要是使用add_category_to_loadable_list方法进行操作,代码如下:
void add_category_to_loadable_list(Category cat)
{
IMP method;
loadMethodLock.assertLocked();
method = _category_getLoadMethod(cat);
// Don't bother if cat has no +load method
if (!method) return;
if (PrintLoading) {
_objc_inform("LOAD: category '%s(%s)' scheduled for +load",
_category_getClassName(cat), _category_getName(cat));
}
if (loadable_categories_used == loadable_categories_allocated) {
loadable_categories_allocated = loadable_categories_allocated*2 + 16;
loadable_categories = (struct loadable_category *)
realloc(loadable_categories,
loadable_categories_allocated *
sizeof(struct loadable_category));
}
loadable_categories[loadable_categories_used].cat = cat;
loadable_categories[loadable_categories_used].method = method;
loadable_categories_used++;
}
IMP
_category_getLoadMethod(Category cat)
{
runtimeLock.assertLocked();
const method_list_t *mlist;
mlist = cat->classMethods;
if (mlist) {
for (const auto& meth : *mlist) {
const char *name = sel_cname(meth.name);
if (0 == strcmp(name, "load")) {
return meth.imp;
}
}
}
return nil;
}
也是获取method,得到IMP地址存储在list里,它和add_class_to_loadable_list的实现几乎一样,但是分类方法load方法存储在loadable_categories里面,和本类存储的不是一个“数组”。从这里就能看出,category的加载顺序就和文件的顺序有关了,和本类父类就没有关系了。
从上面我们基本得出三个结论:1. 本类对象装载的时候首先装载的是父类,然后是本类,2. category装载的时候是和文件顺序有关的。3. 本类对象load储存的数组和category存储的不在一起。
2.2 load调用
回到load_images里面,prepare_load_methods走完之后,紧接着就开始call_load_methods调用load方法,这里面就比较简单,因为前面已经将,本类,父类,分类的load方法装载入全局的“数组”里,这里只需要拿出来用就行了。代码如下:
void call_load_methods(void)
{
static bool loading = NO;
bool more_categories;
loadMethodLock.assertLocked();
// Re-entrant calls do nothing; the outermost call will finish the job.
if (loading) return;
loading = YES;
void *pool = objc_autoreleasePoolPush();
do {
// 1. Repeatedly call class +loads until there aren't any more
while (loadable_classes_used > 0) {
call_class_loads();
}
// 2. Call category +loads ONCE
more_categories = call_category_loads();
// 3. Run more +loads if there are classes OR more untried categories
} while (loadable_classes_used > 0 || more_categories);
objc_autoreleasePoolPop(pool);
loading = NO;
}
这里也是分两步,首先调用本类的load方法,再调用category的方法。
2.2.1 本类load方法调用
代码如下
static void call_class_loads(void)
{
int i;
// Detach current loadable list.
struct loadable_class *classes = loadable_classes;
int used = loadable_classes_used;
loadable_classes = nil;
loadable_classes_allocated = 0;
loadable_classes_used = 0;
// Call all +loads for the detached list.
for (i = 0; i < used; i++) {
Class cls = classes[i].cls;
load_method_t load_method = (load_method_t)classes[i].method;
if (!cls) continue;
if (PrintLoading) {
_objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
}
(*load_method)(cls, @selector(load));
}
// Destroy the detached list.
if (classes) free(classes);
}
从代码中可以看出,调用load方法,其实就是找到load方法地址直接调用。
2.2.2 category load方法调用
代码如下:
static bool call_category_loads(void)
{
int i, shift;
bool new_categories_added = NO;
// Detach current loadable list.
//1.获取当前可以加载的分类列表
struct loadable_category *cats = loadable_categories;
int used = loadable_categories_used;
int allocated = loadable_categories_allocated;
loadable_categories = nil;
loadable_categories_allocated = 0;
loadable_categories_used = 0;
// Call all +loads for the detached list.
//2.如果当前类是可加载的 cls && cls->isLoadable() 就会调用分类的 load 方法
for (i = 0; i < used; i++) {
Category cat = cats[i].cat;
load_method_t load_method = (load_method_t)cats[i].method;
Class cls;
if (!cat) continue;
cls = _category_getClass(cat);
if (cls && cls->isLoadable()) {
if (PrintLoading) {
_objc_inform("LOAD: +[%s(%s) load]\n",
cls->nameForLogging(),
_category_getName(cat));
}
(*load_method)(cls, @selector(load));
cats[i].cat = nil;
}
}
//3.将所有加载过的分类移除 loadable_categories 列表
// Compact detached list (order-preserving)
shift = 0;
for (i = 0; i < used; i++) {
if (cats[i].cat) {
cats[i-shift] = cats[i];
} else {
shift++;
}
}
used -= shift;
//4.为 loadable_categories 重新分配内存,并重新设置它的值
// Copy any new +load candidates from the new list to the detached list.
new_categories_added = (loadable_categories_used > 0);
for (i = 0; i < loadable_categories_used; i++) {
if (used == allocated) {
allocated = allocated*2 + 16;
cats = (struct loadable_category *)
realloc(cats, allocated *
sizeof(struct loadable_category));
}
cats[used++] = loadable_categories[i];
}
// Destroy the new list.
if (loadable_categories) free(loadable_categories);
// Reattach the (now augmented) detached list.
// But if there's nothing left to load, destroy the list.
if (used) {
loadable_categories = cats;
loadable_categories_used = used;
loadable_categories_allocated = allocated;
} else {
if (cats) free(cats);
loadable_categories = nil;
loadable_categories_used = 0;
loadable_categories_allocated = 0;
}
if (PrintLoading) {
if (loadable_categories_used != 0) {
_objc_inform("LOAD: %d categories still waiting for +load\n",
loadable_categories_used);
}
}
return new_categories_added;
}
调用思路前半部分基本一样,但是分类后边增加了几步,大概如下:
1.获取当前可以加载的分类列表
2.如果当前类是可加载的 cls && cls->isLoadable() 就会调用分类的 load 方法
3.将所有加载过的分类移除 loadable_categories 列表
4.为 loadable_categories 重新分配内存,并重新设置它的值
load总结:1. load方法调用和类加载的顺序有关。2. load分类和本类load方法分别装在不同集合里。3. 当父类和子类同时实现时候,先调用父类,再调用子类。因为父类先装载。4. load方法调用是通过IMP指针直接调用的,没有走消息发送流程。
3. initialize调用。
我在Man类中对initialize打断点,方法调用站大概如下:
因为要创建对象首先需要调用alloc方法,所以我这里第一个方法是alloc方法。顺着代码顺序,我将源码贴出来:
static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
#if __OBJC2__
if (slowpath(checkNil && !cls)) return nil;
if (fastpath(!cls->ISA()->hasCustomAWZ())) {
return _objc_rootAllocWithZone(cls, nil);
}
#endif
// No shortcuts available.
if (allocWithZone) {
return ((id(*)(id, SEL, struct _NSZone *))objc_msgSend)(cls, @selector(allocWithZone:), nil);
}
return ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc));
}
IMP lookUpImpOrForward(id inst, SEL sel, Class cls, int behavior)
{
const IMP forward_imp = (IMP)_objc_msgForward_impcache;
IMP imp = nil;
Class curClass;
runtimeLock.assertUnlocked();
// Optimistic cache lookup
if (fastpath(behavior & LOOKUP_CACHE)) {
imp = cache_getImp(cls, sel);
if (imp) goto done_nolock;
}
// runtimeLock is held during isRealized and isInitialized checking
// to prevent races against concurrent realization.
// runtimeLock is held during method search to make
// method-lookup + cache-fill atomic with respect to method addition.
// Otherwise, a category could be added but ignored indefinitely because
// the cache was re-filled with the old value after the cache flush on
// behalf of the category.
runtimeLock.lock();
// We don't want people to be able to craft a binary blob that looks like
// a class but really isn't one and do a CFI attack.
//
// To make these harder we want to make sure this is a class that was
// either built into the binary or legitimately registered through
// objc_duplicateClass, objc_initializeClassPair or objc_allocateClassPair.
//
// TODO: this check is quite costly during process startup.
checkIsKnownClass(cls);
if (slowpath(!cls->isRealized())) {
cls = realizeClassMaybeSwiftAndLeaveLocked(cls, runtimeLock);
// runtimeLock may have been dropped but is now locked again
}
if (slowpath((behavior & LOOKUP_INITIALIZE) && !cls->isInitialized())) {
cls = initializeAndLeaveLocked(cls, inst, runtimeLock);
// runtimeLock may have been dropped but is now locked again
// If sel == initialize, class_initialize will send +initialize and
// then the messenger will send +initialize again after this
// procedure finishes. Of course, if this is not being called
// from the messenger then it won't happen. 2778172
}
runtimeLock.assertLocked();
curClass = cls;
// The code used to lookpu the class's cache again right after
// we take the lock but for the vast majority of the cases
// evidence shows this is a miss most of the time, hence a time loss.
//
// The only codepath calling into this without having performed some
// kind of cache lookup is class_getInstanceMethod().
for (unsigned attempts = unreasonableClassCount();;) {
// curClass method list.
Method meth = getMethodNoSuper_nolock(curClass, sel);
if (meth) {
imp = meth->imp;
goto done;
}
if (slowpath((curClass = curClass->superclass) == nil)) {
// No implementation found, and method resolver didn't help.
// Use forwarding.
imp = forward_imp;
break;
}
// Halt if there is a cycle in the superclass chain.
if (slowpath(--attempts == 0)) {
_objc_fatal("Memory corruption in class list.");
}
// Superclass cache.
imp = cache_getImp(curClass, sel);
if (slowpath(imp == forward_imp)) {
// Found a forward:: entry in a superclass.
// Stop searching, but don't cache yet; call method
// resolver for this class first.
break;
}
if (fastpath(imp)) {
// Found the method in a superclass. Cache it in this class.
goto done;
}
}
// No implementation found. Try method resolver once.
if (slowpath(behavior & LOOKUP_RESOLVER)) {
behavior ^= LOOKUP_RESOLVER;
return resolveMethod_locked(inst, sel, cls, behavior);
}
done:
log_and_fill_cache(cls, imp, sel, inst, curClass);
runtimeLock.unlock();
done_nolock:
if (slowpath((behavior & LOOKUP_NIL) && imp == forward_imp)) {
return nil;
}
return imp;
}
static Class initializeAndLeaveLocked(Class cls, id obj, mutex_t& lock)
{
return initializeAndMaybeRelock(cls, obj, lock, true);
}
static Class initializeAndMaybeRelock(Class cls, id inst,
mutex_t& lock, bool leaveLocked)
{
lock.assertLocked();
ASSERT(cls->isRealized());
if (cls->isInitialized()) {
if (!leaveLocked) lock.unlock();
return cls;
}
// Find the non-meta class for cls, if it is not already one.
// The +initialize message is sent to the non-meta class object.
Class nonmeta = getMaybeUnrealizedNonMetaClass(cls, inst);
// Realize the non-meta class if necessary.
if (nonmeta->isRealized()) {
// nonmeta is cls, which was already realized
// OR nonmeta is distinct, but is already realized
// - nothing else to do
lock.unlock();
} else {
nonmeta = realizeClassMaybeSwiftAndUnlock(nonmeta, lock);
// runtimeLock is now unlocked
// fixme Swift can't relocate the class today,
// but someday it will:
cls = object_getClass(nonmeta);
}
// runtimeLock is now unlocked, for +initialize dispatch
ASSERT(nonmeta->isRealized());
initializeNonMetaClass(nonmeta);
if (leaveLocked) runtimeLock.lock();
return cls;
}
void initializeNonMetaClass(Class cls)
{
ASSERT(!cls->isMetaClass());
Class supercls;
bool reallyInitialize = NO;
// Make sure super is done initializing BEFORE beginning to initialize cls.
// See note about deadlock above.
supercls = cls->superclass;
if (supercls && !supercls->isInitialized()) {
initializeNonMetaClass(supercls);
}
// Try to atomically set CLS_INITIALIZING.
SmallVector<_objc_willInitializeClassCallback, 1> localWillInitializeFuncs;
{
monitor_locker_t lock(classInitLock);
if (!cls->isInitialized() && !cls->isInitializing()) {
cls->setInitializing();
reallyInitialize = YES;
// Grab a copy of the will-initialize funcs with the lock held.
localWillInitializeFuncs.initFrom(willInitializeFuncs);
}
}
if (reallyInitialize) {
// We successfully set the CLS_INITIALIZING bit. Initialize the class.
// Record that we're initializing this class so we can message it.
_setThisThreadIsInitializingClass(cls);
if (MultithreadedForkChild) {
// LOL JK we don't really call +initialize methods after fork().
performForkChildInitialize(cls, supercls);
return;
}
for (auto callback : localWillInitializeFuncs)
callback.f(callback.context, cls);
// Send the +initialize message.
// Note that +initialize is sent to the superclass (again) if
// this class doesn't implement +initialize. 2157218
if (PrintInitializing) {
_objc_inform("INITIALIZE: thread %p: calling +[%s initialize]",
objc_thread_self(), cls->nameForLogging());
}
// Exceptions: A +initialize call that throws an exception
// is deemed to be a complete and successful +initialize.
//
// Only __OBJC2__ adds these handlers. !__OBJC2__ has a
// bootstrapping problem of this versus CF's call to
// objc_exception_set_functions().
#if __OBJC2__
@try
#endif
{
callInitialize(cls);
if (PrintInitializing) {
_objc_inform("INITIALIZE: thread %p: finished +[%s initialize]",
objc_thread_self(), cls->nameForLogging());
}
}
#if __OBJC2__
@catch (...) {
if (PrintInitializing) {
_objc_inform("INITIALIZE: thread %p: +[%s initialize] "
"threw an exception",
objc_thread_self(), cls->nameForLogging());
}
@throw;
}
@finally
#endif
{
// Done initializing.
lockAndFinishInitializing(cls, supercls);
}
return;
}
else if (cls->isInitializing()) {
// We couldn't set INITIALIZING because INITIALIZING was already set.
// If this thread set it earlier, continue normally.
// If some other thread set it, block until initialize is done.
// It's ok if INITIALIZING changes to INITIALIZED while we're here,
// because we safely check for INITIALIZED inside the lock
// before blocking.
if (_thisThreadIsInitializingClass(cls)) {
return;
} else if (!MultithreadedForkChild) {
waitForInitializeToComplete(cls);
return;
} else {
// We're on the child side of fork(), facing a class that
// was initializing by some other thread when fork() was called.
_setThisThreadIsInitializingClass(cls);
performForkChildInitialize(cls, supercls);
}
}
else if (cls->isInitialized()) {
// Set CLS_INITIALIZING failed because someone else already
// initialized the class. Continue normally.
// NOTE this check must come AFTER the ISINITIALIZING case.
// Otherwise: Another thread is initializing this class. ISINITIALIZED
// is false. Skip this clause. Then the other thread finishes
// initialization and sets INITIALIZING=no and INITIALIZED=yes.
// Skip the ISINITIALIZING clause. Die horribly.
return;
}
else {
// We shouldn't be here.
_objc_fatal("thread-safe class init in objc runtime is buggy!");
}
}
void callInitialize(Class cls)
{
((void(*)(Class, SEL))objc_msgSend)(cls, @selector(initialize));
asm("");
}
贴了这么多其实就是最后一个方法管用,callInitialize,这里是调用initialize的方式,就是通过消息转发进行的。前面的方法里都是进行实例类对象一些初始化判断,调用判断等。
initialize 其实就是消息发送,它遵循消息发送的规则。
以上大概就是load方法和initialize的底层实现原理,总结如下:
- load方法发生在main方法之前,使用的是地址直接调用,本类和分类都会调用。
- initialize方法发生在main之后,使用消息发送实现,遵循消息发送规则,分类方法会覆盖本类方法。
- 他们俩都会先调用父类方法。
思考一个问题,Person,Animal,Man:Person 这三个类的实现顺序是这样的,load方法会怎么调用。
这里也有一个写得比较好的文章:https://zhuanlan.zhihu.com/p/20816991