1 zygote进程简介
在Android系统中,DVM(Dalvik虚拟机)和ART、应用程序进程以及运行系统的关键服务的SystemServer进程都是由Zygote进程创建的,也可以将其称之为孵化器,它通过fork(复制进程)的形式来创建应用程序进程和SystemServer进程。
Zygote进程是在init进程启动时创建的,起初Zygote的进程名称为"app_process",在frameworks/base/cmds/app_process/Android.bp中定义,Zygote启动后,会将其名称转换为"zygote"或"zygote64"
2 zygote进程的启动
init进程通过解析init.rc文件将service的信息保存到svc_list中
首先查看init.rc文件
import /system/etc/init/hw/init.${ro.zygote}.rc
通过系统属性ro.zygote的值决定导入跟zygote相关的rc文件
eg init.zygote64.rc为例
service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server
class main
priority -20
user root
group root readproc reserved_disk
socket zygote stream 660 root system
socket usap_pool_primary stream 660 root system
onrestart exec_background - system system -- /system/bin/vdc volume abort_fuse
onrestart write /sys/power/state on
onrestart restart audioserver
onrestart restart cameraserver
onrestart restart media
onrestart restart media.tuner
onrestart restart netd
onrestart restart wificond
task_profiles ProcessCapacityHigh
critical window=${zygote.critical_window.minute:-off} target=zygote-fatal
on late-init
trigger zygote-start
on zygote-start && property:ro.crypto.state=unencrypted
wait_for_prop odsign.verification.done 1
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start statsd
start netd
start zygote
start zygote_secondary
通过init进程的解析 zygote服务和相关参数被加载到集合列表中
在init进程按序处理Action,执行到late-init阶段,会触发zygote-start,这时在系统属性ro.crypto.state=unencrypted时,就可以start zygote command 执行完命令后会执行restart_process函数来重启服务进程这时从service_list中根据class类型 启动服务名称找到对应的service 执行start函数
3 zygote c篇
3.1
代码路径 service.cpp
Result<void> Service::Start() {
auto reboot_on_failure = make_scope_guard([this] {
if (on_failure_reboot_target_) {
trigger_shutdown(*on_failure_reboot_target_);
}
});
//判断service集合是否存在变化的service 是判断服务能否起来的条件
if (is_updatable() && !ServiceList::GetInstance().IsServicesUpdated()) {
ServiceList::GetInstance().DelayService(*this);
return Error() << "Cannot start an updatable service '" << name_
<< "' before configs from APEXes are all loaded. "
<< "Queued for execution.";
}
//当前是否禁止启动服务
bool disabled = (flags_ & (SVC_DISABLED | SVC_RESET));
ResetFlagsForStart();
// Running processes require no additional work --- if they're in the
// process of exiting, we've ensured that they will immediately restart
// on exit, unless they are ONESHOT. For ONESHOT service, if it's in
// stopping status, we just set SVC_RESTART flag so it will get restarted
// in Reap().
//1 判断服务flag当前处于运行状态
//2 当前服务flag支持一次启动
//3 如果当前服务处于运行状态 退出启动
if (flags_ & SVC_RUNNING) {
if ((flags_ & SVC_ONESHOT) && disabled) {
flags_ |= SVC_RESTART;
}
LOG(INFO) << "service '" << name_
<< "' requested start, but it is already running (flags: " << flags_ << ")";
// It is not an error to try to start a service that is already running.
reboot_on_failure.Disable();
return {};
}
std::unique_ptr<std::array<int, 2>, decltype(&ClosePipe)> pipefd(new std::array<int, 2>{-1, -1},
ClosePipe);
if (pipe(pipefd->data()) < 0) {
return ErrnoError() << "pipe()";
}
if (Result<void> result = CheckConsole(); !result.ok()) {
return result;
}
//查看可执行文件当前是否存在 不存在 禁止启动服务
struct stat sb;
if (stat(args_[0].c_str(), &sb) == -1) {
flags_ |= SVC_DISABLED;
return ErrnoError() << "Cannot find '" << args_[0] << "'";
}
//获取启动的服务的selnux安全上下文
std::string scon;
if (!seclabel_.empty()) {
scon = seclabel_;
} else {
auto result = ComputeContextFromExecutable(args_[0]);
if (!result.ok()) {
return result.error();
}
scon = *result;
}
// APEXd is always started in the "current" namespace because it is the process to set up
// the current namespace.
const bool is_apexd = args_[0] == "/system/bin/apexd";
if (!IsDefaultMountNamespaceReady() && !is_apexd) {
// If this service is started before APEXes and corresponding linker configuration
// get available, mark it as pre-apexd one. Note that this marking is
// permanent. So for example, if the service is re-launched (e.g., due
// to crash), it is still recognized as pre-apexd... for consistency.
use_bootstrap_ns_ = true;
}
// For pre-apexd services, override mount namespace as "bootstrap" one before starting.
// Note: "ueventd" is supposed to be run in "default" mount namespace even if it's pre-apexd
// to support loading firmwares from APEXes.
std::optional<MountNamespace> override_mount_namespace;
if (name_ == "ueventd") {
override_mount_namespace = NS_DEFAULT;
} else if (use_bootstrap_ns_) {
override_mount_namespace = NS_BOOTSTRAP;
}
post_data_ = ServiceList::GetInstance().IsPostData();
LOG(INFO) << "starting service '" << name_ << "'...";
std::vector<Descriptor> descriptors;
//1 根据service的约束条件创建socket
//2 将fd存放到集合中
for (const auto& socket : sockets_) {
if (auto result = socket.Create(scon); result.ok()) {
descriptors.emplace_back(std::move(*result));
} else {
LOG(INFO) << "Could not create socket '" << socket.name << "': " << result.error();
}
}
for (const auto& file : files_) {
if (auto result = file.Create(); result.ok()) {
descriptors.emplace_back(std::move(*result));
} else {
LOG(INFO) << "Could not open file '" << file.name << "': " << result.error();
}
}
pid_t pid = -1;
//1 如果命名空间存在 直接clone复制进程
//2 不存在 fork子进程
if (namespaces_.flags) {
pid = clone(nullptr, nullptr, namespaces_.flags | SIGCHLD, nullptr);
} else {
//init进程 fork子进程
pid = fork();
}
//子进程创建成功
if (pid == 0) {
umask(077);
//exec子进程的执行文件
RunService(override_mount_namespace, descriptors, std::move(pipefd));
//退出子进程
_exit(127);
}
if (pid < 0) {
pid_ = 0;
return ErrnoError() << "Failed to fork";
}
if (oom_score_adjust_ != DEFAULT_OOM_SCORE_ADJUST) {
std::string oom_str = std::to_string(oom_score_adjust_);
std::string oom_file = StringPrintf("/proc/%d/oom_score_adj", pid);
if (!WriteStringToFile(oom_str, oom_file)) {
PLOG(ERROR) << "couldn't write oom_score_adj";
}
}
time_started_ = boot_clock::now();
pid_ = pid;
flags_ |= SVC_RUNNING;
start_order_ = next_start_order_++;
process_cgroup_empty_ = false;
bool use_memcg = swappiness_ != -1 || soft_limit_in_bytes_ != -1 || limit_in_bytes_ != -1 ||
limit_percent_ != -1 || !limit_property_.empty();
errno = -createProcessGroup(proc_attr_.uid, pid_, use_memcg);
if (errno != 0) {
if (char byte = 0; write((*pipefd)[1], &byte, 1) < 0) {
return ErrnoError() << "sending notification failed";
}
return Error() << "createProcessGroup(" << proc_attr_.uid << ", " << pid_
<< ") failed for service '" << name_ << "'";
}
if (use_memcg) {
ConfigureMemcg();
}
if (oom_score_adjust_ != DEFAULT_OOM_SCORE_ADJUST) {
LmkdRegister(name_, proc_attr_.uid, pid_, oom_score_adjust_);
}
if (char byte = 1; write((*pipefd)[1], &byte, 1) < 0) {
return ErrnoError() << "sending notification failed";
}
NotifyStateChange("running");
reboot_on_failure.Disable();
return {};
}
1 查看当前服务flag查看是否可以启动服务
2 查看服务的可执行文件是否存在
3 获取服务的selinux安全上下文
4 根据rc文件service的约束条件遍历socket参数集合创建socket
根据rc文件service的约束条件遍历file参数集合创建file
5 init进程 fork子进程zygote服务
6 子进程创建完毕设置子进程的文件可操作性
7 执行子进程可执行性文件
3.2
执行可执行文件 /system/bin/app_process64
对应着app_main.cpp文件
#if defined(__LP64__)
static const char ABI_LIST_PROPERTY[] = "ro.product.cpu.abilist64";
static const char ZYGOTE_NICE_NAME[] = "zygote64";
#else
static const char ABI_LIST_PROPERTY[] = "ro.product.cpu.abilist32";
static const char ZYGOTE_NICE_NAME[] = "zygote";
#endif
int main(int argc, char* const argv[])
{
//打印app_process进程携带的参数数据
if (!LOG_NDEBUG) {
String8 argv_String;
for (int i = 0; i < argc; ++i) {
argv_String.append("\"");
argv_String.append(argv[i]);
argv_String.append("\" ");
}
ALOGV("app_process main with argv: %s", argv_String.string());
}
//创建AppRuntime对象
AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
// Process command line arguments
// ignore argv[0]
argc--;
argv++;
// Everything up to '--' or first non '-' arg goes to the vm.
//
// The first argument after the VM args is the "parent dir", which
// is currently unused.
//
// After the parent dir, we expect one or more the following internal
// arguments :
//
// --zygote : Start in zygote mode
// --start-system-server : Start the system server.
// --application : Start in application (stand alone, non zygote) mode.
// --nice-name : The nice name for this process.
//
// For non zygote starts, these arguments will be followed by
// the main class name. All remaining arguments are passed to
// the main method of this class.
//
// For zygote starts, all remaining arguments are passed to the zygote.
// main function.
//
// Note that we must copy argument string values since we will rewrite the
// entire argument block when we apply the nice name to argv0.
//
// As an exception to the above rule, anything in "spaced commands"
// goes to the vm even though it has a space in it.
const char* spaced_commands[] = { "-cp", "-classpath" };
// Allow "spaced commands" to be succeeded by exactly 1 argument (regardless of -s).
bool known_command = false;
int i;
for (i = 0; i < argc; i++) {
if (known_command == true) {
runtime.addOption(strdup(argv[i]));
// The static analyzer gets upset that we don't ever free the above
// string. Since the allocation is from main, leaking it doesn't seem
// problematic. NOLINTNEXTLINE
ALOGV("app_process main add known option '%s'", argv[i]);
known_command = false;
continue;
}
for (int j = 0;
j < static_cast<int>(sizeof(spaced_commands) / sizeof(spaced_commands[0]));
++j) {
if (strcmp(argv[i], spaced_commands[j]) == 0) {
known_command = true;
ALOGV("app_process main found known command '%s'", argv[i]);
}
}
if (argv[i][0] != '-') {
break;
}
if (argv[i][1] == '-' && argv[i][2] == 0) {
++i; // Skip --.
break;
}
runtime.addOption(strdup(argv[i]));
// The static analyzer gets upset that we don't ever free the above
// string. Since the allocation is from main, leaking it doesn't seem
// problematic. NOLINTNEXTLINE
ALOGV("app_process main add option '%s'", argv[i]);
}
// Parse runtime arguments. Stop at first unrecognized option.
bool zygote = false;
bool startSystemServer = false;
bool application = false;
String8 niceName;
String8 className;
++i; // Skip unused "parent dir" argument.
//遍历rc文件定义的参数
while (i < argc) {
const char* arg = argv[i++];
if (strcmp(arg, "--zygote") == 0) {
zygote = true;
//设置zygote昵称
niceName = ZYGOTE_NICE_NAME;
} else if (strcmp(arg, "--start-system-server") == 0) {
//接下来要启动systemServer
startSystemServer = true;
} else if (strcmp(arg, "--application") == 0) {
application = true;
} else if (strncmp(arg, "--nice-name=", 12) == 0) {
niceName.setTo(arg + 12);
} else if (strncmp(arg, "--", 2) != 0) {
className.setTo(arg);
break;
} else {
--i;
break;
}
}
//会保留很多参数abiFlag 启动zygote的参数
Vector<String8> args;
if (!className.isEmpty()) {
// We're not in zygote mode, the only argument we need to pass
// to RuntimeInit is the application argument.
//
// The Remainder of args get passed to startup class main(). Make
// copies of them before we overwrite them with the process name.
args.add(application ? String8("application") : String8("tool"));
runtime.setClassNameAndArgs(className, argc - i, argv + i);
if (!LOG_NDEBUG) {
String8 restOfArgs;
char* const* argv_new = argv + i;
int argc_new = argc - i;
for (int k = 0; k < argc_new; ++k) {
restOfArgs.append("\"");
restOfArgs.append(argv_new[k]);
restOfArgs.append("\" ");
}
ALOGV("Class name = %s, args = %s", className.string(), restOfArgs.string());
}
} else {
// We're in zygote mode.
maybeCreateDalvikCache();
if (startSystemServer) {
args.add(String8("start-system-server"));
}
char prop[PROP_VALUE_MAX];
//获取系统属性ABI列表以便在运行时根据 ABI 列表加载对应的本机库和相关资源。
if (property_get(ABI_LIST_PROPERTY, prop, NULL) == 0) {
LOG_ALWAYS_FATAL("app_process: Unable to determine ABI list from property %s.",
ABI_LIST_PROPERTY);
return 11;
}
String8 abiFlag("--abi-list=");
abiFlag.append(prop);
args.add(abiFlag);
// In zygote mode, pass all remaining arguments to the zygote
// main() method.
for (; i < argc; ++i) {
args.add(String8(argv[i]));
}
}
if (!niceName.isEmpty()) {
runtime.setArgv0(niceName.string(), true /* setProcName */);
}
if (zygote) {
//当前处于zygote模式
//执行AndroidRuntime.start函数
runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
} else if (className) {
runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
} else {
fprintf(stderr, "Error: no class name or --zygote supplied.\n");
app_usage();
LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
}
1 这里遍历rc文件定义的服务参数
2 确认当前参数是否需要启动systemserver
3 获取系统属性ABI列表
4 执行androidRuntime的start函数
3.2
代码路径 AndroidRuntime.cpp
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
ALOGD(">>>>>> START %s uid %d <<<<<<\n",
className != NULL ? className : "(unknown)", getuid());
static const String8 startSystemServer("start-system-server");
// Whether this is the primary zygote, meaning the zygote which will fork system server.
bool primary_zygote = false;
/*
* 'startSystemServer == true' means runtime is obsolete and not run from
* init.rc anymore, so we print out the boot start event here.
*/
//遍历当前参数列表等于startSystemServer
//设置 primary_zygote = true 打印log
for (size_t i = 0; i < options.size(); ++i) {
if (options[i] == startSystemServer) {
primary_zygote = true;
/* track our progress through the boot sequence */
const int LOG_BOOT_PROGRESS_START = 3000;
LOG_EVENT_LONG(LOG_BOOT_PROGRESS_START, ns2ms(systemTime(SYSTEM_TIME_MONOTONIC)));
}
}
//系统目录从环境变量ANDROID_ROOT中读取。如果说去失败,
//则默认设置目录为"/system"。如果连"/system"也没有,则Zygote进程会退出
const char* rootDir = getenv("ANDROID_ROOT");
if (rootDir == NULL) {
rootDir = "/system";
if (!hasDir("/system")) {
LOG_FATAL("No root directory specified, and /system does not exist.");
return;
}
setenv("ANDROID_ROOT", rootDir, 1);
}
//art根目录不存在 退出zygote进程
const char* artRootDir = getenv("ANDROID_ART_ROOT");
if (artRootDir == NULL) {
LOG_FATAL("No ART directory specified with ANDROID_ART_ROOT environment variable.");
return;
}
//i18nRootDir根目录不存在 退出zygote进程
const char* i18nRootDir = getenv("ANDROID_I18N_ROOT");
if (i18nRootDir == NULL) {
LOG_FATAL("No runtime directory specified with ANDROID_I18N_ROOT environment variable.");
return;
}
const char* tzdataRootDir = getenv("ANDROID_TZDATA_ROOT");
if (tzdataRootDir == NULL) {
LOG_FATAL("No tz data directory specified with ANDROID_TZDATA_ROOT environment variable.");
return;
}
//const char* kernelHack = getenv("LD_ASSUME_KERNEL");
//ALOGD("Found LD_ASSUME_KERNEL='%s'\n", kernelHack);
/* start the virtual machine */
JniInvocation jni_invocation;
jni_invocation.Init(NULL);
JNIEnv* env;
//启动VM虚拟机
if (startVm(&mJavaVM, &env, zygote, primary_zygote) != 0) {
return;
}
onVmCreated(env);
/*
* Register android functions.
*/
//加载jni文件
if (startReg(env) < 0) {
ALOGE("Unable to register all android natives\n");
return;
}
/*
* We want to call main() with a String array with arguments in it.
* At present we have two arguments, the class name and an option string.
* Create an array to hold them.
*/
jclass stringClass;
jobjectArray strArray;
jstring classNameStr;
//获取java类
stringClass = env->FindClass("java/lang/String");
assert(stringClass != NULL);
//创建java数组
strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
assert(strArray != NULL);
classNameStr = env->NewStringUTF(className);
assert(classNameStr != NULL);
//给java数组设置java参数
env->SetObjectArrayElement(strArray, 0, classNameStr);
for (size_t i = 0; i < options.size(); ++i) {
jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
assert(optionsStr != NULL);
env->SetObjectArrayElement(strArray, i + 1, optionsStr);
}
/*
* Start VM. This thread becomes the main thread of the VM, and will
* not return until the VM exits.
*/
//获取java ClassName
char* slashClassName = toSlashClassName(className != NULL ? className : "");
jclass startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
//执行zygoteInit.java的main方法 这里来到了java层
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);
#if 0
if (env->ExceptionCheck())
threadExitUncaughtException(env);
#endif
}
}
free(slashClassName);
ALOGD("Shutting down VM\n");
if (mJavaVM->DetachCurrentThread() != JNI_OK)
ALOGW("Warning: unable to detach main thread\n");
if (mJavaVM->DestroyJavaVM() != 0)
ALOGW("Warning: VM did not shut down cleanly\n");
}
1 遍历参数查看是否启动systemserver
2 读取ANDROID_ROOT根目录环境变量查看当前是否有system目录
3 读取一些根目录 确认可以启动zygote进程
4 启动VM虚拟机
5 加载系统定义的JNI类文件资源
6 调用系统JNI方法将c参数转为java参数
7 执行zygoteInit.main方法
4 zygote java篇
代码路径 ZygoteInit.java
public static void main(String[] argv) {
ZygoteServer zygoteServer = null;
// Mark zygote start. This ensures that thread creation will throw
// an error.
ZygoteHooks.startZygoteNoThreadCreation();
// Zygote goes into its own process group.
try {
Os.setpgid(0, 0);
} catch (ErrnoException ex) {
throw new RuntimeException("Failed to setpgid(0,0)", ex);
}
Runnable caller;
try {
// Store now for StatsLogging later.
//获取系统开始时间
final long startTime = SystemClock.elapsedRealtime();
//获取sys.boot_completed系统属性
//确认系统是否启动完成
final boolean isRuntimeRestarted = "1".equals(
SystemProperties.get("sys.boot_completed"));
//判断当前进程时zygote32 还是zygote64
String bootTimeTag = Process.is64Bit() ? "Zygote64Timing" : "Zygote32Timing";
TimingsTraceLog bootTimingsTraceLog = new TimingsTraceLog(bootTimeTag,
Trace.TRACE_TAG_DALVIK);
bootTimingsTraceLog.traceBegin("ZygoteInit");
//fork之前的初始化操作
RuntimeInit.preForkInit();
boolean startSystemServer = false;
String zygoteSocketName = "zygote";
String abiList = null;
boolean enableLazyPreload = false;
//遍历参数确定当前是否启动SystemServer
//是否开启懒加载
//是否包含abi_list参数
//获取zygote服务端socket
for (int i = 1; i < argv.length; i++) {
if ("start-system-server".equals(argv[i])) {
startSystemServer = true;
} else if ("--enable-lazy-preload".equals(argv[i])) {
enableLazyPreload = true;
} else if (argv[i].startsWith(ABI_LIST_ARG)) {
abiList = argv[i].substring(ABI_LIST_ARG.length());
} else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
zygoteSocketName = argv[i].substring(SOCKET_NAME_ARG.length());
} else {
throw new RuntimeException("Unknown command line argument: " + argv[i]);
}
}
//判断当前zygote是主还是副
final boolean isPrimaryZygote = zygoteSocketName.equals(Zygote.PRIMARY_SOCKET_NAME);
if (!isRuntimeRestarted) {
//如果系统还未启动完成将这些数据写入系统日志
if (isPrimaryZygote) {
FrameworkStatsLog.write(FrameworkStatsLog.BOOT_TIME_EVENT_ELAPSED_TIME_REPORTED,
BOOT_TIME_EVENT_ELAPSED_TIME__EVENT__ZYGOTE_INIT_START,
startTime);
} else if (zygoteSocketName.equals(Zygote.SECONDARY_SOCKET_NAME)) {
FrameworkStatsLog.write(FrameworkStatsLog.BOOT_TIME_EVENT_ELAPSED_TIME_REPORTED,
BOOT_TIME_EVENT_ELAPSED_TIME__EVENT__SECONDARY_ZYGOTE_INIT_START,
startTime);
}
}
//abiList为空抛出异常
if (abiList == null) {
throw new RuntimeException("No ABI list supplied.");
}
// In some configurations, we avoid preloading resources and classes eagerly.
// In such cases, we will preload things prior to our first fork.
//如果当前不是懒加载
if (!enableLazyPreload) {
bootTimingsTraceLog.traceBegin("ZygotePreload");
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());
//预加载系统类和资源
preload(bootTimingsTraceLog);
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());
bootTimingsTraceLog.traceEnd(); // ZygotePreload
}
// Do an initial gc to clean up after startup
bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
gcAndFinalize();
bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC
bootTimingsTraceLog.traceEnd(); // ZygoteInit
Zygote.initNativeState(isPrimaryZygote);
ZygoteHooks.stopZygoteNoThreadCreation();
zygoteServer = new ZygoteServer(isPrimaryZygote);
if (startSystemServer) {
//fork systemServer进程
Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);
// {@code r == null} in the parent (zygote) process, and {@code r != null} in the
// child (system_server) process.
if (r != null) {
r.run();
return;
}
}
Log.i(TAG, "Accepting command socket connections");
// The select loop returns early in the child process after a fork and
// loops forever in the zygote.
//无线循环等待处理消息
caller = zygoteServer.runSelectLoop(abiList);
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with fatal exception", ex);
throw ex;
} finally {
if (zygoteServer != null) {
zygoteServer.closeServerSocket();
}
}
// We're in the child process and have exited the select loop. Proceed to execute the
// command.
if (caller != null) {
caller.run();
}
}
1 获取系统时间 确认系统是否启动完成
2 遍历参数
启动systemserver
开启懒加载
获取abiList
获取zygote的socket名字
3 预加载系统类,资源文件,lib库,OpenGL等等
4 fork systemServer
5 zogote进程 循环处理其他进程消息
4 zygote如何fork systemServer进程
代码路径zygoteInit.java
private static Runnable forkSystemServer(String abiList, String socketName,
ZygoteServer zygoteServer) {
//父进程fork子进程传递的能力
long capabilities = posixCapabilitiesAsBits(
OsConstants.CAP_IPC_LOCK,
OsConstants.CAP_KILL,
OsConstants.CAP_NET_ADMIN,
OsConstants.CAP_NET_BIND_SERVICE,
OsConstants.CAP_NET_BROADCAST,
OsConstants.CAP_NET_RAW,
OsConstants.CAP_SYS_MODULE,
OsConstants.CAP_SYS_NICE,
OsConstants.CAP_SYS_PTRACE,
OsConstants.CAP_SYS_TIME,
OsConstants.CAP_SYS_TTY_CONFIG,
OsConstants.CAP_WAKE_ALARM,
OsConstants.CAP_BLOCK_SUSPEND
);
/* Containers run without some capabilities, so drop any caps that are not available. */
StructCapUserHeader header = new StructCapUserHeader(
OsConstants._LINUX_CAPABILITY_VERSION_3, 0);
StructCapUserData[] data;
try {
//包含了当前进程的所有 POSIX 能力信息。
data = Os.capget(header);
} catch (ErrnoException ex) {
throw new RuntimeException("Failed to capget()", ex);
}
capabilities &= ((long) data[0].effective) | (((long) data[1].effective) << 32);
/* Hardcoded command line to start the system server */
String[] args = {
"--setuid=1000",
"--setgid=1000",
"--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,"
+ "1024,1032,1065,3001,3002,3003,3005,3006,3007,3009,3010,3011,3012",
"--capabilities=" + capabilities + "," + capabilities,
"--nice-name=system_server",
"--runtime-args",
"--target-sdk-version=" + VMRuntime.SDK_VERSION_CUR_DEVELOPMENT,
"com.android.server.SystemServer",
};
ZygoteArguments parsedArgs;
int pid;
try {
ZygoteCommandBuffer commandBuffer = new ZygoteCommandBuffer(args);
try {
//解析fork systemserver 的 String[] args的参数
parsedArgs = ZygoteArguments.getInstance(commandBuffer);
} catch (EOFException e) {
throw new AssertionError("Unexpected argument error for forking system server", e);
}
commandBuffer.close();
Zygote.applyDebuggerSystemProperty(parsedArgs);
Zygote.applyInvokeWithSystemProperty(parsedArgs);
//1 当前进程是否支持内存标记
//2 支持的话获取arm64.memtag.process.system_server属性值
//3 aync 设置flag Zygote.MEMORY_TAG_LEVEL_ASYNC
//4 sync 设置flag Zygote.MEMORY_TAG_LEVEL_SYNC
//5 如果设备不支持内存标记,那么会检查设备是否支持带标记指针
if (Zygote.nativeSupportsMemoryTagging()) {
String mode = SystemProperties.get("arm64.memtag.process.system_server", "");
if (mode.isEmpty()) {
/* The system server has ASYNC MTE by default, in order to allow
* system services to specify their own MTE level later, as you
* can't re-enable MTE once it's disabled. */
mode = SystemProperties.get("persist.arm64.memtag.default", "async");
}
if (mode.equals("async")) {
parsedArgs.mRuntimeFlags |= Zygote.MEMORY_TAG_LEVEL_ASYNC;
} else if (mode.equals("sync")) {
parsedArgs.mRuntimeFlags |= Zygote.MEMORY_TAG_LEVEL_SYNC;
} else if (!mode.equals("off")) {
/* When we have an invalid memory tag level, keep the current level. */
parsedArgs.mRuntimeFlags |= Zygote.nativeCurrentTaggingLevel();
Slog.e(TAG, "Unknown memory tag level for the system server: \"" + mode + "\"");
}
} else if (Zygote.nativeSupportsTaggedPointers()) {
/* Enable pointer tagging in the system server. Hardware support for this is present
* in all ARMv8 CPUs. */
parsedArgs.mRuntimeFlags |= Zygote.MEMORY_TAG_LEVEL_TBI;
}
/* Enable gwp-asan on the system server with a small probability. This is the same
* policy as applied to native processes and system apps. */
parsedArgs.mRuntimeFlags |= Zygote.GWP_ASAN_LEVEL_LOTTERY;
if (shouldProfileSystemServer()) {
parsedArgs.mRuntimeFlags |= Zygote.PROFILE_SYSTEM_SERVER;
}
/* Request to fork the system server process */
//zygote fork systemServer进程
pid = Zygote.forkSystemServer(
parsedArgs.mUid, parsedArgs.mGid,
parsedArgs.mGids,
parsedArgs.mRuntimeFlags,
null,
parsedArgs.mPermittedCapabilities,
parsedArgs.mEffectiveCapabilities);
} catch (IllegalArgumentException ex) {
throw new RuntimeException(ex);
}
/* For child process */
if (pid == 0) {
if (hasSecondZygote(abiList)) {
waitForSecondaryZygote(socketName);
}
zygoteServer.closeServerSocket();
return handleSystemServerProcess(parsedArgs);
}
return null;
}
1 定义解析好启动systemserver参数数组
2 fork systemserver进程
3 systemserver进程关闭zegote进程的socket
4 处理systemServer进程handleSystemServerProcess
4.1 fork systemserver进程
Zygote.java
static int forkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
//主要功能是停止Zygote的4个Daemon子线程的运行,
//等待并确保Zygote的单线程(用于fork效率),并等待这些线程的停止,初始化gc堆的工作
ZygoteHooks.preFork();
//调JNI fork systemServer
int pid = nativeForkSystemServer(
uid, gid, gids, runtimeFlags, rlimits,
permittedCapabilities, effectiveCapabilities);
// Set the Java Language thread priority to the default value for new apps.
Thread.currentThread().setPriority(Thread.NORM_PRIORITY);
ZygoteHooks.postForkCommon();
return pid;
}
1 停止zygote的4个子进程来保证fork的高效
2 调用native来fork进程
com_android_internal_os_Zygote.cpp
static jint com_android_internal_os_Zygote_nativeForkSystemServer(
JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
jint runtime_flags, jobjectArray rlimits, jlong permitted_capabilities,
jlong effective_capabilities) {
..............
//fork进程
pid_t pid = zygote::ForkCommon(env, true,
fds_to_close,
fds_to_ignore,
true);
if (pid == 0) {
//当前在子进程systemserver
// System server prcoess does not need data isolation so no need to
// know pkg_data_info_list.
//调用SpecializeCommon()函数进行一些特定操作,
//包括设置用户ID(uid)、组ID(gid)、附加组ID(gids)、
//运行时标志(runtime_flags)、资源限制(rlimits)、
//权限能力(permitted_capabilities、effective_capabilities)等。
//此外,还会进行一些其他操作,如挂载外部存储、初始化应用程序数据等。
SpecializeCommon(env, uid, gid, gids, runtime_flags, rlimits, permitted_capabilities,
effective_capabilities, MOUNT_EXTERNAL_DEFAULT, nullptr, nullptr, true,
false, nullptr, nullptr, /* is_top_app= */ false,
/* pkg_data_info_list */ nullptr,
/* allowlisted_data_info_list */ nullptr, false, false);
} else if (pid > 0) {
//当前在zygote进程中
// The zygote process checks whether the child process has died or not.
ALOGI("System server process %d has been created", pid);
gSystemServerPid = pid;
// There is a slight window that the system server process has crashed
// but it went unnoticed because we haven't published its pid yet. So
// we recheck here just to make sure that all is well.
int status;
//如果System server 进程挂掉 重启zygote
if (waitpid(pid, &status, WNOHANG) == pid) {
ALOGE("System server process %d has died. Restarting Zygote!", pid);
RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
}
if (UsePerAppMemcg()) {
// Assign system_server to the correct memory cgroup.
// Not all devices mount memcg so check if it is mounted first
// to avoid unnecessarily printing errors and denials in the logs.
//SetTaskProfiles()函数将系统服务进程分配到正确的内存控制组
//(Memory Cgroup)中,以实现对系统服务进程的内存管理。
if (!SetTaskProfiles(pid, std::vector<std::string>{"SystemMemoryProcess"})) {
ALOGE("couldn't add process %d into system memcg group", pid);
}
}
}
return pid;
}
1 通过zygotefork进程
2 fork成功 pid =0 子进程进行自身的设置例如用户id,用户组id等等
还会获取systemserver的类加载器
3 如果pid>0代表当前在父进程,如果systemServer挂掉,zygote进程重启
具体而言,在 Zygote 进程中调用 zygote::ForkCommon() 函数时,pid 的值会被设置为 SystemServer 进程的进程号。也就是说,对于 Zygote 进程来说,pid 将变为 SystemServer 进程的进程号。
而对于 SystemServer 进程来说,由于是通过 Fork 操作创建的新进程,pid 的值将会是 0。这是因为 fork() 系统调用在子进程中返回 0。
因此,在代码中可以使用 pid 的值来区分 Zygote 进程和 SystemServer 进程,从而执行不同的逻辑。在这种情况下,如果 pid 是 0,则说明当前代码在 SystemServer 进程中执行;如果 pid 不为 0,则说明当前代码在 Zygote 进程中执行。
4.2 systemServer进程处理handleSystemServerProcess
private static Runnable handleSystemServerProcess(ZygoteArguments parsedArgs) {
// set umask to 0077 so new files and directories will default to owner-only permissions.
Os.umask(S_IRWXG | S_IRWXO);
if (parsedArgs.mNiceName != null) {
Process.setArgV0(parsedArgs.mNiceName);
}
//首先,通过调用Os.getenv("SYSTEMSERVERCLASSPATH")
//来获取环境变量SYSTEMSERVERCLASSPATH的值。如果该环境变量存在,
//说明需要对系统服务进程进行性能分析配置。
final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
if (systemServerClasspath != null) {
// Capturing profiles is only supported for debug or eng builds since selinux normally
// prevents it.
if (shouldProfileSystemServer() && (Build.IS_USERDEBUG || Build.IS_ENG)) {
try {
Log.d(TAG, "Preparing system server profile");
prepareSystemServerProfile(systemServerClasspath);
} catch (Exception e) {
Log.wtf(TAG, "Failed to set up system server profile", e);
}
}
}
if (parsedArgs.mInvokeWith != null) {
String[] args = parsedArgs.mRemainingArgs;
// If we have a non-null system server class path, we'll have to duplicate the
// existing arguments and append the classpath to it. ART will handle the classpath
// correctly when we exec a new process.
if (systemServerClasspath != null) {
String[] amendedArgs = new String[args.length + 2];
amendedArgs[0] = "-cp";
amendedArgs[1] = systemServerClasspath;
System.arraycopy(args, 0, amendedArgs, 2, args.length);
args = amendedArgs;
}
WrapperInit.execApplication(parsedArgs.mInvokeWith,
parsedArgs.mNiceName, parsedArgs.mTargetSdkVersion,
VMRuntime.getCurrentInstructionSet(), null, args);
throw new IllegalStateException("Unexpected return from WrapperInit.execApplication");
} else {
ClassLoader cl = getOrCreateSystemServerClassLoader();
if (cl != null) {
Thread.currentThread().setContextClassLoader(cl);
}
/*
* Pass the remaining arguments to SystemServer.
*/
return ZygoteInit.zygoteInit(parsedArgs.mTargetSdkVersion,
parsedArgs.mDisabledCompatChanges,
parsedArgs.mRemainingArgs, cl);
}
/* should never reach here */
}
1 查看SYSTEMSERVERCLASSPATH环境变量是否存在
存在则准备启动systemserver进程性能分析
2 获取systemserver的classloader
执行zygoteInit方法
public static Runnable zygoteInit(int targetSdkVersion, long[] disabledCompatChanges,
String[] argv, ClassLoader classLoader) {
if (RuntimeInit.DEBUG) {
Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
}
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
RuntimeInit.redirectLogStreams();
RuntimeInit.commonInit();
ZygoteInit.nativeZygoteInit();
return RuntimeInit.applicationInit(targetSdkVersion, disabledCompatChanges, argv,
classLoader);
}
1 重定向log输入流
2 systemserver进程如何执行到systemServer的main方法
protected static Runnable applicationInit(int targetSdkVersion, long[] disabledCompatChanges,
String[] argv, ClassLoader classLoader) {
// If the application calls System.exit(), terminate the process
// immediately without running any shutdown hooks. It is not possible to
// shutdown an Android application gracefully. Among other things, the
// Android runtime shutdown hooks close the Binder driver, which can cause
// leftover running threads to crash before the process actually exits.
nativeSetExitWithoutCleanup(true);
VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
VMRuntime.getRuntime().setDisabledCompatChanges(disabledCompatChanges);
final Arguments args = new Arguments(argv);
// The end of of the RuntimeInit event (see #zygoteInit).
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
// Remaining arguments are passed to the start class's static main
//执行systemserver的main方法
return findStaticMain(args.startClass, args.startArgs, classLoader);
}
3 zygote进程后续如何与其他进程通信fork子进程
//无线循环等待处理消息
caller = zygoteServer.runSelectLoop(abiList);
Runnable runSelectLoop(String abiList) {
ArrayList<FileDescriptor> socketFDs = new ArrayList<>();
ArrayList<ZygoteConnection> peers = new ArrayList<>();
socketFDs.add(mZygoteSocket.getFileDescriptor());
peers.add(null);
mUsapPoolRefillTriggerTimestamp = INVALID_TIMESTAMP;
while (true) {
//USAP 池的策略属性
fetchUsapPoolPolicyPropsWithMinInterval();
mUsapPoolRefillAction = UsapPoolRefillAction.NONE;
//USAP 管道文件描述符的数组
int[] usapPipeFDs = null;
//存储需要监听的文件描述符的状态变化
StructPollfd[] pollFDs;
// Allocate enough space for the poll structs, taking into account
// the state of the USAP pool for this Zygote (could be a
// regular Zygote, a WebView Zygote, or an AppZygote).
//如果usap池开启
if (mUsapPoolEnabled) {
//epoll监听socketfd和usappipefd
usapPipeFDs = Zygote.getUsapPipeFDs();
pollFDs = new StructPollfd[socketFDs.size() + 1 + usapPipeFDs.length];
} else {
pollFDs = new StructPollfd[socketFDs.size()];
}
/*
* For reasons of correctness the USAP pool pipe and event FDs
* must be processed before the session and server sockets. This
* is to ensure that the USAP pool accounting information is
* accurate when handling other requests like API deny list
* exemptions.
*/
int pollIndex = 0;
//遍历socket fd 添加到epoll中
for (FileDescriptor socketFD : socketFDs) {
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = socketFD;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
}
final int usapPoolEventFDIndex = pollIndex;
//如果usap开启遍历pipe fd 添加到epoll
if (mUsapPoolEnabled) {
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = mUsapPoolEventFD;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
// The usapPipeFDs array will always be filled in if the USAP Pool is enabled.
assert usapPipeFDs != null;
for (int usapPipeFD : usapPipeFDs) {
FileDescriptor managedFd = new FileDescriptor();
managedFd.setInt$(usapPipeFD);
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = managedFd;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
}
}
int pollTimeoutMs;
//这里确定epoll唤醒时间
if (mUsapPoolRefillTriggerTimestamp == INVALID_TIMESTAMP) {
pollTimeoutMs = -1;
} else {
long elapsedTimeMs = System.currentTimeMillis() - mUsapPoolRefillTriggerTimestamp;
if (elapsedTimeMs >= mUsapPoolRefillDelayMs) {
// The refill delay has elapsed during the period between poll invocations.
// We will now check for any currently ready file descriptors before refilling
// the USAP pool.
pollTimeoutMs = 0;
mUsapPoolRefillTriggerTimestamp = INVALID_TIMESTAMP;
mUsapPoolRefillAction = UsapPoolRefillAction.DELAYED;
} else if (elapsedTimeMs <= 0) {
// This can occur if the clock used by currentTimeMillis is reset, which is
// possible because it is not guaranteed to be monotonic. Because we can't tell
// how far back the clock was set the best way to recover is to simply re-start
// the respawn delay countdown.
pollTimeoutMs = mUsapPoolRefillDelayMs;
} else {
pollTimeoutMs = (int) (mUsapPoolRefillDelayMs - elapsedTimeMs);
}
}
int pollReturnValue;
try {
//监听文件描述符的状态变化
pollReturnValue = Os.poll(pollFDs, pollTimeoutMs);
} catch (ErrnoException ex) {
throw new RuntimeException("poll failed", ex);
}
//0表示 poll() 方法返回时没有任何文件描述符就绪。
//这可能是因为超时时间已经到达,或者使用了非阻塞模式进行的轮询,
//但没有任何文件描述符准备好。在这种情况下,需要触发 USAP 池的重填
if (pollReturnValue == 0) {
// The poll returned zero results either when the timeout value has been exceeded
// or when a non-blocking poll is issued and no FDs are ready. In either case it
// is time to refill the pool. This will result in a duplicate assignment when
// the non-blocking poll returns zero results, but it avoids an additional
// conditional in the else branch.
mUsapPoolRefillTriggerTimestamp = INVALID_TIMESTAMP;
mUsapPoolRefillAction = UsapPoolRefillAction.DELAYED;
} else {
//epoll监听的fd都已准备就绪
boolean usapPoolFDRead = false;
//遍历监听的fd确认是否有可读事件
while (--pollIndex >= 0) {
if ((pollFDs[pollIndex].revents & POLLIN) == 0) {
continue;
}
if (pollIndex == 0) {
//处理 Zygote 服务器接收到的客户端连接请求,并准备好下一次轮询事件
//如果监听的fd都没有可读事件
// Zygote server socket
//接受客户端socket连接
ZygoteConnection newPeer = acceptCommandPeer(abiList);
peers.add(newPeer);
//添加到socketfd集合中
//等待下次重新加载socket fd 再来监听
socketFDs.add(newPeer.getFileDescriptor());
} else if (pollIndex < usapPoolEventFDIndex) {
// Session socket accepted from the Zygote server socket
//这个条件当前epoll监听的fd不包含usapPoolEventFD
try {
//通过pollIndex确定是哪一个client与zygote的连接
ZygoteConnection connection = peers.get(pollIndex);
//是否支持多fork
boolean multipleForksOK = !isUsapPoolEnabled()
&& ZygoteHooks.isIndefiniteThreadSuspensionSafe();
//这里来执行command 生成runnable对象
//这个runnable对象是留给fork的子进程执行的
final Runnable command =
connection.processCommand(this, multipleForksOK);
// TODO (chriswailes): Is this extra check necessary?
if (mIsForkChild) {
// We're in the child. We should always have a command to run at
// this stage if processCommand hasn't called "exec".
if (command == null) {
throw new IllegalStateException("command == null");
}
return command;
} else {
//这里是一些异常情况
//zygote服务端根本不需要有command
// We're in the server - we should never have any commands to run.
if (command != null) {
throw new IllegalStateException("command != null");
}
// We don't know whether the remote side of the socket was closed or
// not until we attempt to read from it from processCommand. This
// shows up as a regular POLLIN event in our regular processing
// loop.
//考虑是否关闭当前这个socket连接
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(pollIndex);
socketFDs.remove(pollIndex);
}
}
} catch (Exception e) {
//异常处理
if (!mIsForkChild) {
//不是fork子进程 移除指定socketfd
// We're in the server so any exception here is one that has taken
// place pre-fork while processing commands or reading / writing
// from the control socket. Make a loud noise about any such
// exceptions so that we know exactly what failed and why.
Slog.e(TAG, "Exception executing zygote command: ", e);
// Make sure the socket is closed so that the other end knows
// immediately that something has gone wrong and doesn't time out
// waiting for a response.
ZygoteConnection conn = peers.remove(pollIndex);
conn.closeSocket();
socketFDs.remove(pollIndex);
} else {
// We're in the child so any exception caught here has happened post
// fork and before we execute ActivityThread.main (or any other
// main() method). Log the details of the exception and bring down
// the process.
//fork子进程情况 抛异常
Log.e(TAG, "Caught post-fork exception in child process.", e);
throw e;
}
} finally {
// Reset the child flag, in the event that the child process is a child-
// zygote. The flag will not be consulted this loop pass after the
// Runnable is returned.
mIsForkChild = false;
}
}
1 将当前zygote server socket加入集合中
2 判断是否将usapPoolfd 加入集合
3 将相关fd给到epoll监测
4 遍历监听的fd
5 如果没有可读事件 检查zygote服务端是否存在client连接
重新在扫描将fd注册给epoll
6 有可读事件执行processcommand
进行子进程fork操作 生成command runnable函数
7 子进程执行command runnable函数
5 zygote时序图
6 整体时许图
问题1
zygote进程fork的子进程 死掉后 zygote为什么可以重新拉起他
在 Android 系统中,Zygote 进程是一个非常特殊的进程。它是 Android 系统中所有应用程序进程的"父进程",也是所有 Java 进程的"父进程"。当 Zygote 进程 fork 出一个子进程时,该子进程会通过一定方式(例如 socket 通信)与 Zygote 进程建立起联系,并将自己注册到 Zygote 进程维护的进程列表中。此后,如果该子进程异常退出,Zygote 进程会检测到该事件,并根据相应的策略重新拉起该进程。
具体来说,Zygote 进程会周期性地扫描其维护的进程列表,并检查每个进程的状态。如果发现某个进程异常退出,它会根据该进程的启动参数及其对应的 APK 文件信息,重新启动该进程。这个过程中会重新执行 Zygote 启动的流程,创建新的进程并让其载入 APK 文件,最终完成进程的重启。
需要注意的是,Zygote 进程的重启并不是简单地调用 fork 或 exec 等系统调用来创建新的进程,而是要重新执行 Zygote 的启动流程,包括环境变量的设置、文件描述符的继承、内存映射的设置等。这样做的目的是确保重启的进程和之前异常退出的进程在启动环境上保持一致,从而避免出现不可预知的问题。
问题2 zygote进程一般和init进程通信做些什么除了zygote死亡外
在 Android 系统中,Zygote 进程和 init 进程之间通常会进行一些重要的通信,除了处理 Zygote 进程死亡的情况外,还有以下一些常见的通信操作:
进程启动:Zygote 进程负责 fork 出新的应用程序进程,并在这个过程中需要与 init 进程进行通信。例如,当一个新应用程序进程被创建时,Zygote 进程会通过 socket 或 binder 机制向 init 进程发送请求,请求 init 进程帮助完成新进程的初始化工作。
进程终止:当应用程序进程终止时,Zygote 进程也会向 init 进程发送通知,以便让 init 进程做一些清理工作或者记录相关信息。
状态汇报:Zygote 进程可能会定期向 init 进程发送状态信息,包括自身的运行状态、已经创建的应用程序进程数量等等。
系统服务交互:在 Android 系统中,init 进程负责启动和管理系统服务。Zygote 进程可能需要与 init 进程通信以获取系统服务的状态或者向 init 进程注册自己所提供的系统服务。
总的来说,Zygote 进程和 init 进程之间的通信主要是为了完成 Android 系统的初始化工作、应用程序进程的创建和管理等功能。这种通信机制是 Android 系统启动和运行的重要基础,确保了系统各个部分之间的协同工作。