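Day 12 of studying iOS internals. Today we finally start a new chapter. You may remember the program loading flow from iOS Internals 1.1: `dyld_start` -> `dyld::_main` -> `dyld::initializeMainExecutable` -> `libSystem_initializer`. This new chapter is about dyld.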
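What is dyld?
- Let's look at this figure first.
- The "link" stage in the figure is where libraries come in. The code a programmer writes (source files) is compiled, assembled, and then linked with static and dynamic libraries to produce the executable. The static linker does this at build time and records which dynamic libraries are needed; dyld binds those dynamic libraries when the program launches.
- Definition: dyld (the dynamic link editor) is Apple's dynamic linker and an important part of Apple's operating systems.
From the definition we know that dyld links dynamic libraries. But how exactly does it do that? Let's start from the dyld source.
There is a lot of source code, so where should we begin?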
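dyld_start
- We know `dyld_start` is the entry point of the whole launch, so it is the natural place to start exploring. Open the dyld source project and search globally for `dyld_start`.
- This drops us into assembly, where a comment reads `call dyldbootstrap::start`. Here `dyldbootstrap` is a C++ namespace and `start` is a function inside it. Search globally for `dyldbootstrap` and `start`.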
namespace dyldbootstrap {
...
//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
// ...
// bootstrapping dyld
_subsystem_init(apple);
// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
}
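The `getSlide()` call above yields the slide, that is, the offset between where the image was actually loaded and the address it was linked to prefer. Here is a minimal sketch of that arithmetic. The names `computeSlide` and `rebase` are made up for illustration and are not dyld APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not a dyld API: the slide is the offset between the
// address an image was actually loaded at and the address it was linked for.
inline std::intptr_t computeSlide(std::uintptr_t actualLoadAddress,
                                  std::uintptr_t preferredLoadAddress)
{
    return static_cast<std::intptr_t>(actualLoadAddress - preferredLoadAddress);
}

// Applying the slide turns a link-time address into a run-time address.
inline std::uintptr_t rebase(std::uintptr_t linkTimeAddress, std::intptr_t slide)
{
    return linkTimeAddress + static_cast<std::uintptr_t>(slide);
}
```

With a preferred address of `0x10000000` and an actual load address of `0x14000000`, every link-time address moves up by `0x4000000`.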
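- The most important call is the one in the `return` statement, `dyld::_main`. Jump straight into it.

- With the code folded, the function is more than 1,000 lines long, but we only need to find the key parts.
- Now that we have found `dyld::_main`, what are we looking for? We want to see how dyld links image files and what its main flow looks like.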
//
// Entry point for dyld. The kernel loads dyld and jumps to __dyld_start which
// sets up some registers and call this function.
//
// Returns address of main() in target program which __dyld_start jumps to
//
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
int argc, const char* argv[], const char* envp[], const char* apple[],
uintptr_t* startGlue)
{
// ... some code omitted
// collect host information for the main executable
getHostInfo(mainExecutableMH, mainExecutableSlide);
// Set the platform ID in the all image infos so debuggers can tell the process type
{ ... }
// Check to see if we need to override the platform.
// handling of dyld root paths
{ ... }
// configure process restrictions for the executable
configureProcessRestrictions(mainExecutableMH, envp);
// Check if we should force dyld3. Note we have to do this outside of the regular env parsing due to AMFI
{ ... }
// load shared cache
checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide); // system level: shared cache handling
// ... some code omitted
{
// find entry point for main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
if ( result != 0 ) {
// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
else
halt("libdyld.dylib support not present for LC_MAIN");
}
else {
// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
*startGlue = 0;
}
}
return result;
}
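The tail of `_main` picks the program's entry point: it prefers `LC_MAIN` and falls back to `LC_UNIXTHREAD`. The sketch below only mirrors that branch structure. `ToyEntryInfo`, `EntryChoice`, and `chooseEntry` are hypothetical names, and "0 means the load command is absent" is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an image's load commands; 0 means "command not present".
struct ToyEntryInfo {
    std::uintptr_t lcMainEntry;        // entry recorded by LC_MAIN
    std::uintptr_t lcUnixThreadEntry;  // entry recorded by LC_UNIXTHREAD
};

struct EntryChoice {
    std::uintptr_t entry;   // address the launcher jumps to
    bool needsStartGlue;    // true when main() is reached through a helper
};

// Mirrors the branch structure of the excerpt: prefer LC_MAIN, else fall back.
inline EntryChoice chooseEntry(const ToyEntryInfo& image)
{
    if (image.lcMainEntry != 0) {
        return { image.lcMainEntry, true };
    }
    return { image.lcUnixThreadEntry, false };
}
```

An image that records only an `LC_UNIXTHREAD` entry gets that address back with `needsStartGlue` set to `false`, matching the `*startGlue = 0` branch.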
- As we can see, `_main` returns `result`, and `result` comes from `sMainExecutable`. Let's follow `sMainExecutable`.
// ...
// instantiate ImageLoader for main executable
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
// (sMainExecutable is initialized here)
// load any inserted libraries
if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
loadInsertedDylib(*lib);
}
// link main executable
// {...}
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
// link any inserted libraries
// do this after linking main executable so that any dylibs pulled in by inserted
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
// {...}
if (sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
image->setNeverUnloadRecursive();
}
}
// {...}
// <rdar://problem/12186933>
// do weak binding only after all inserted images linked
sMainExecutable->weakBind(gLinkContext);
// {...}
// run all initializer
initializeMainExecutable();
// {...}
// notify any montoring proccesses that this process is about to enter main()
notifyMonitoringDyldMain();
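- `sMainExecutable` -> `instantiateFromLoadedImage`: instantiate the main executable and load its image
- load inserted dylibs: load any inserted dynamic libraries
- link main executable: link the main program
- link any inserted libraries: link the inserted dynamic libraries
- `sMainExecutable` -> `weakBind`: perform weak symbol binding for the main executable
- `initializeMainExecutable`: run all initializers
- `notifyMonitoringDyldMain`: notify monitoring processes that this process is about to enter `main()`
The most important of these is `initializeMainExecutable`. How is it implemented?
- Let's explore `initializeMainExecutable`.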
void initializeMainExecutable()
{
// record that we've reached this step
gLinkContext.startedInitializingMainExecutable = true;
// run initialzers for any inserted dylibs
ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
initializerTimes[0].count = 0;
const size_t rootCount = sImageRoots.size();
if ( rootCount > 1 ) {
for(size_t i=1; i < rootCount; ++i) {
sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
}
}
// initialize the images
// run initializers for main executable and everything it brings up
sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
// { ... }
}
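The loop above starts at index 1, so the inserted dylibs run their initializers before the main executable does. A small sketch of that ordering follows. It assumes, as the indexing implies, that root 0 is the main executable. `initializationOrder` is a made-up helper, not dyld code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: roots[0] is the main executable, roots[1..] are inserted dylibs,
// matching the indexing used by the loop in the excerpt.
inline std::vector<std::string> initializationOrder(const std::vector<std::string>& roots)
{
    std::vector<std::string> order;
    // Inserted dylibs run their initializers first...
    for (std::size_t i = 1; i < roots.size(); ++i) {
        order.push_back(roots[i]);
    }
    // ...then the main executable (and everything it brings up).
    if (!roots.empty()) {
        order.push_back(roots[0]);
    }
    return order;
}
```

Passing `{"App", "libA.dylib", "libB.dylib"}` yields `libA.dylib`, `libB.dylib`, `App`.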
- Step into `ImageLoader::runInitializers`.
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
uint64_t t1 = mach_absolute_time();
mach_port_t thisThread = mach_thread_self();
ImageLoader::UninitedUpwards up;
up.count = 1;
up.imagesAndPaths[0] = { this, this->getPath() };
// run the initializers
processInitializers(context, thisThread, timingInfo, up);
context.notifyBatch(dyld_image_state_initialized, false);
//
mach_port_deallocate(mach_task_self(), thisThread);
uint64_t t2 = mach_absolute_time();
fgTotalInitTime += (t2 - t1);
}
- Step into `ImageLoader::processInitializers`.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
uint32_t maxImageCount = context.imageCount()+2;
ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
ImageLoader::UninitedUpwards& ups = upsBuffer[0];
ups.count = 0;
// Calling recursive init on all images in images list, building a new list of
// uninitialized upward dependencies.
for (uintptr_t i=0; i < images.count; ++i) {
images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
}
// { ... }
}
- Step into `ImageLoader::recursiveInitialization`.
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
try {
// initialize lower level libraries first
for(unsigned int i=0; i < libraryCount(); ++i) {
ImageLoader* dependentImage = libImage(i);
if ( dependentImage != NULL ) {
// don't try to initialize stuff "above" me yet
if ( libIsUpward(i) ) {
uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
uninitUps.count++;
}
else if ( dependentImage->fDepth >= fDepth ) {
// initialize the dependent image first
dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
}
}
}
// core code
context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
// initialize this image
bool hasInitializers = this->doInitialization(context);
// let anyone know we finished initializing this image
fState = dyld_image_state_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_initialized, this, NULL);
// { ... }
}
}
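The shape of `recursiveInitialization` is a depth-first walk: initialize what an image depends on, then notify, then run the image's own initializers. The sketch below models only that shape on a toy graph. It assumes a simple `initialized` flag instead of dyld's state machine, ignores upward dependencies and locking, and every name in it is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy image node; names and fields are illustrative, not dyld types.
struct ToyImage {
    std::string name;
    std::vector<ToyImage*> dependencies;  // libraries this image links against
    bool initialized;
    ToyImage(std::string n, std::vector<ToyImage*> deps)
        : name(std::move(n)), dependencies(std::move(deps)), initialized(false) {}
};

// Depth-first: initialize dependencies first, then notify and initialize self.
inline void recursiveInit(ToyImage& image, std::vector<std::string>& log)
{
    if (image.initialized) {
        return;  // each image is initialized at most once
    }
    image.initialized = true;  // mark early so dependency cycles terminate
    for (ToyImage* dep : image.dependencies) {
        if (dep != nullptr) {
            recursiveInit(*dep, log);
        }
    }
    log.push_back("dependents_initialized:" + image.name);  // first notification
    log.push_back("run_initializers:" + image.name);        // image's own initializers
    log.push_back("initialized:" + image.name);             // second notification
}

// Small demo graph: App -> Foundation -> libSystem, and App -> libSystem.
inline std::vector<std::string> demoLog()
{
    ToyImage libSystem("libSystem", {});
    ToyImage foundation("Foundation", {&libSystem});
    ToyImage app("App", {&foundation, &libSystem});
    std::vector<std::string> log;
    recursiveInit(app, log);
    return log;
}
```

In the demo graph the deepest dependency, `libSystem`, is fully initialized before `Foundation`, and `App` comes last.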
- Search globally for `notifySingle` to find where it is assigned and how it is implemented.
- Where `notifySingle` is assigned:
gLinkContext.notifySingle = &notifySingle;
- How `notifySingle` is implemented:
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
//dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
if ( handlers != NULL ) {
dyld_image_info info;
info.imageLoadAddress = image->machHeader();
info.imageFilePath = image->getRealPath();
info.imageFileModDate = image->lastModified();
for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
// ...
}
}
if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
uint64_t t0 = mach_absolute_time();
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
// ---- core: sNotifyObjCInit
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
// ----
uint64_t t1 = mach_absolute_time();
uint64_t t2 = mach_absolute_time();
uint64_t timeInObjC = t1-t0;
uint64_t emptyTime = (t2-t1)*100;
if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
timingInfo->addTime(image->getShortName(), timeInObjC);
}
}
// mach message csdlc about dynamically unloaded images
// { ... }
}
- Search globally for `sNotifyObjCInit`.
static _dyld_objc_notify_init sNotifyObjCInit;
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init;
sNotifyObjCUnmapped = unmapped;
}
- Search globally for `registerObjCNotifiers`.
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
_dyld_objc_notify_init init,
_dyld_objc_notify_unmapped unmapped)
{
dyld::registerObjCNotifiers(mapped, init, unmapped);
}
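Taken together, the excerpts show a register-then-call-back pattern: one side stores a function pointer, and the other side invokes it later under a condition. Here is a minimal sketch of that pattern with simplified types. All `toy...` names and `gNotifyInit` are made up, and the real callbacks take more arguments than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy version of the callback registration; types are simplified.
using InitCallback = void (*)(const char* path);

enum class ToyState { DependentsInitialized, Initialized };

static InitCallback gNotifyInit = nullptr;         // plays the role of sNotifyObjCInit
static std::vector<std::string> gLoadedByRuntime;  // what the "runtime" has seen

// Runtime side: the function handed to the loader (compare load_images).
inline void toyLoadImages(const char* path) { gLoadedByRuntime.push_back(path); }

// Loader side: remember the callback for later (compare registerObjCNotifiers).
inline void toyRegister(InitCallback init) { gNotifyInit = init; }

// Loader side: fire the callback only for the matching state, and only once
// something has been registered (compare the condition in notifySingle).
inline void toyNotifySingle(ToyState state, const char* path)
{
    if (state == ToyState::DependentsInitialized && gNotifyInit != nullptr) {
        gNotifyInit(path);
    }
}
```

Before `toyRegister` is called, notifications are silently dropped. Afterwards, only the `DependentsInitialized` state reaches the runtime's function.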
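- Let's pause here. So far we have been reading the source, finding the key code, and moving on, and we have ended up at `_dyld_objc_notify_register`.
- The chain we have traced is `initializeMainExecutable` -> `ImageLoader::runInitializers` -> `ImageLoader::processInitializers` -> `ImageLoader::recursiveInitialization` -> `notifySingle` -> `sNotifyObjCInit`, and `sNotifyObjCInit` is set through `_dyld_objc_notify_register`.
So why does tracing backwards from `ImageLoader::recursiveInitialization` lead to `_dyld_objc_notify_register`?
_dyld_objc_notify_register
- Static analysis cannot tell us who calls `_dyld_objc_notify_register`, so what now?
- From experience, we can switch to dynamic analysis: run the program.
- Add a symbolic breakpoint on `_dyld_objc_notify_register`.

- Print the stack with `bt`. The breakpoint hits `_dyld_objc_notify_register` in `libdyld.dylib`.
- Searching the `libobjc` source for `_dyld_objc_notify_register` shows that `_objc_init` calls it: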
void _objc_init(void)
{
//... some init calls omitted
_imp_implementationWithBlock_init();
_dyld_objc_notify_register(&map_images, load_images, unmap_image);
#if __OBJC2__
didCallDyldNotifyRegister = true;
#endif
}
- We already know that tracing backwards from `dyld_start` -> ... -> `ImageLoader::recursiveInitialization` leads to `_dyld_objc_notify_register`.
- And `_objc_init` also calls `_dyld_objc_notify_register` when it runs.
This raises a new question: how are `dyld_start` and `_objc_init` related?
- Run the program again and print the stack with `bt`.
- From the stack trace: 2️⃣ `libdispatch.dylib`: `_os_object_init` -> 1️⃣ `libobjc.A.dylib`: `_objc_init`
- Look up `_os_object_init` in the `libdispatch.dylib` source to verify.
- Verify 3️⃣ `libdispatch.dylib`: `libdispatch_init` -> 2️⃣ `libdispatch.dylib`: `_os_object_init`

- Verify 4️⃣ `libSystem.B.dylib`: `libSystem_initializer` -> 3️⃣ `libdispatch.dylib`: `libdispatch_init`

- Verify 5️⃣ dyld: `ImageLoaderMachO::doModInitFunctions` -> 4️⃣ `libSystem.B.dylib`: `libSystem_initializer`
void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
// core code
// the libSystem initializer must run first; otherwise an error is thrown
if ( ! dyld::gProcessInfo->libSystemInitialized ) {
// <rdar://problem/17973316> libSystem initializer must run first
const char* installPath = getInstallPath();
if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
}
// now safe to use malloc() and other calls in libSystem.dylib
dyld::gProcessInfo->libSystemInitialized = true;
}
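The check above enforces an ordering rule: no image may run initializers until libSystem's own initializer has run. Below is a hedged sketch of such a guard. `ToyProcessInfo`, `runInitializersFor`, and the hard-coded path are assumptions for illustration, and the real code compares against `libSystemPath(context)`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Toy loader state; the flag plays the role of libSystemInitialized.
struct ToyProcessInfo {
    bool libSystemInitialized = false;
};

// Runs one image's initializers, refusing to run anything before libSystem.
// The path constant is an assumption for illustration only.
inline void runInitializersFor(ToyProcessInfo& info, const std::string& installPath)
{
    static const std::string kLibSystemPath = "/usr/lib/libSystem.B.dylib";
    if (!info.libSystemInitialized && installPath != kLibSystemPath) {
        throw std::runtime_error("initializer in image (" + installPath +
                                 ") ran before libSystem was initialized");
    }
    // ... the image's initializer functions would be called here ...
    info.libSystemInitialized = true;  // from now on malloc() etc. are safe
}

// Convenience wrapper: does a fresh process reject this image as the first one?
inline bool throwsBeforeLibSystem(const std::string& installPath)
{
    ToyProcessInfo info;
    try {
        runInitializersFor(info, installPath);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Once libSystem has gone first, any other image passes the guard.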
- From the above we can derive the following flow:

- Search the dyld source globally for `doModInitFunctions`.
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
CRSetCrashLogMessage2(this->getPath());
// mach-o has -init and static initializers
doImageInit(context);
doModInitFunctions(context);
CRSetCrashLogMessage2(NULL);
return (fHasDashInit || fHasInitializers);
}
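The "static initializers" mentioned in the comment are ordinary functions that the toolchain registers to run at load time, before `main()`. Here is a minimal sketch, assuming GCC or Clang, since the `constructor` attribute is a compiler extension. On Mach-O such functions are typically listed in the `__mod_init_func` section, which is what `doModInitFunctions` walks. The names below are made up.

```cpp
#include <cassert>

// Zero-initialized before any initializer function runs.
static int gInitializerRuns = 0;

// GCC/Clang extension: this function is registered as a module initializer,
// so the loader runs it before main() is entered.
__attribute__((constructor))
static void toyModuleInitializer()
{
    ++gInitializerRuns;
}

// By the time ordinary code runs, the initializer has already fired once.
inline int initializerRunCount() { return gInitializerRuns; }
```

Calling `initializerRunCount()` from `main()` shows that the function has already run, even though nothing in the program calls it explicitly.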
- `doInitialization` calls `doModInitFunctions`. Next, search for `doInitialization` to see where it is called.
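void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
// ...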
// let objc know we are about to initialize this image
uint64_t t1 = mach_absolute_time();
fState = dyld_image_state_dependents_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
// initialize this image
bool hasInitializers = this->doInitialization(context);
// let anyone know we finished initializing this image
fState = dyld_image_state_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_initialized, this, NULL);
}
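- We find that `ImageLoader::recursiveInitialization` calls `doInitialization`, and `doInitialization` -> `doModInitFunctions` -> ... -> `_objc_init`.
- Now the whole flow connects and forms a closed loop.

With all of the analysis above, we can finally explain why tracing backwards from `ImageLoader::recursiveInitialization` leads to `_dyld_objc_notify_register`.
When `_objc_init` calls `_dyld_objc_notify_register`, it is registering callbacks: it passes in the three functions `map_images`, `load_images`, and `unmap_image`.
Why use callbacks?
- Because dyld is the one linking and initializing the images, and the Objective-C runtime cannot know on its own when an image is ready. So `_objc_init` calls `_dyld_objc_notify_register` to hand dyld the three functions, and dyld stores them. Later, when `notifySingle` reports an image in the `dyld_image_state_dependents_initialized` state, dyld calls the stored function (`sNotifyObjCInit`, that is, `load_images`) for that image.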
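To see the closed loop in one place, here is a toy simulation. The loader initializes the deepest dependency first, that dependency's initializer registers a callback (standing in for the chain that reaches `_objc_init`), and the loader then calls it back for later images. Everything here is a simplified assumption: `ToyLoader`, `ToyImage`, and `demoLaunch` are hypothetical, and the real dyld does considerably more work than this.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct ToyLoader;

struct ToyImage {
    std::string name;
    std::vector<ToyImage*> dependencies;
    std::function<void(ToyLoader&)> initializer;  // may be empty
    bool initialized;
    ToyImage(std::string n, std::vector<ToyImage*> deps)
        : name(std::move(n)), dependencies(std::move(deps)), initialized(false) {}
};

struct ToyLoader {
    std::function<void(const std::string&)> notifyInit;  // registered by the runtime
    std::vector<std::string> log;

    void initialize(ToyImage& image)
    {
        if (image.initialized) return;
        image.initialized = true;
        for (ToyImage* dep : image.dependencies) initialize(*dep);
        // Hook point: tell the runtime, if it has registered by now.
        if (notifyInit) notifyInit(image.name);
        // Then run the image's own initializer.
        if (image.initializer) image.initializer(*this);
    }
};

inline std::vector<std::string> demoLaunch()
{
    ToyImage libSystem("libSystem", {});
    ToyImage app("App", {&libSystem});
    ToyLoader loader;
    // libSystem's initializer registers the runtime's callback with the loader.
    libSystem.initializer = [](ToyLoader& l) {
        l.log.push_back("register");
        l.notifyInit = [&l](const std::string& name) {
            l.log.push_back("load_images:" + name);
        };
    };
    loader.initialize(app);
    return loader.log;
}
```

In this model the registration happens while `libSystem` is being initialized, so by the time the loader reaches `App` the callback is in place and gets invoked.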




