Android Crash Internals and Optimization

I. Java Crash Handling

1. The Thread class declares an interface named UncaughtExceptionHandler.

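According to its Javadoc, when a thread is about to terminate because of an uncaught exception, the JVM queries the thread's UncaughtExceptionHandler through getUncaughtExceptionHandler and calls its uncaughtException method. If no UncaughtExceptionHandler has been set on the thread, its ThreadGroup acts as the handler.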

/**
 * Interface for handlers invoked when a <tt>Thread</tt> abruptly
 * terminates due to an uncaught exception.
 * <p>When a thread is about to terminate due to an uncaught exception
 * the Java Virtual Machine will query the thread for its
 * <tt>UncaughtExceptionHandler</tt> using
 * {@link #getUncaughtExceptionHandler} and will invoke the handler's
 * <tt>uncaughtException</tt> method, passing the thread and the
 * exception as arguments.
 * If a thread has not had its <tt>UncaughtExceptionHandler</tt>
 * explicitly set, then its <tt>ThreadGroup</tt> object acts as its
 * <tt>UncaughtExceptionHandler</tt>. If the <tt>ThreadGroup</tt> object
 * has no
 * special requirements for dealing with the exception, it can forward
 * the invocation to the {@linkplain #getDefaultUncaughtExceptionHandler
 * default uncaught exception handler}.
 */
@FunctionalInterface
public interface UncaughtExceptionHandler {
    /**
     * Method invoked when the given thread terminates due to the
     * given uncaught exception.
     * <p>Any exception thrown by this method will be ignored by the
     * Java Virtual Machine.
     * @param t the thread
     * @param e the exception
     */
    void uncaughtException(Thread t, Throwable e);
}
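As a minimal sketch of the contract described in that Javadoc, the plain-Java example below sets a handler on a single thread and captures what the handler receives. The class name PerThreadHandlerDemo and the thread name worker-1 are made up for illustration and do not come from the Android source.

```java
import java.util.concurrent.atomic.AtomicReference;

class PerThreadHandlerDemo {
    /** Runs a thread that throws and returns what its handler observed. */
    static String runAndCapture() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker-1");
        // The handler is set on this thread only; it receives the thread and the throwable.
        worker.setUncaughtExceptionHandler((t, e) ->
                seen.set(t.getName() + ": " + e.getMessage()));
        worker.start();
        try {
            worker.join(); // the handler runs on the dying thread before it terminates
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndCapture()); // prints "worker-1: boom"
    }
}
```

The handler runs on the thread that is terminating, not on the thread that created it, which is why the result is read only after join() returns.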

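Now look at ThreadGroup.uncaughtException. It first delegates to the parent group if there is one; otherwise it looks up the default handler through Thread.getDefaultUncaughtExceptionHandler. If no default handler is installed, it only prints the stack trace to System.err and does not exit the process. So some other code must have installed the UncaughtExceptionHandler that makes the app exit.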

/**
 * Called by the Java Virtual Machine when a thread in this
 * thread group stops because of an uncaught exception, and the thread
 * does not have a specific {@link Thread.UncaughtExceptionHandler}
 * installed.
 * <p>
 * The <code>uncaughtException</code> method of
 * <code>ThreadGroup</code> does the following:
 * <ul>
 * <li>If this thread group has a parent thread group, the
 *     <code>uncaughtException</code> method of that parent is called
 *     with the same two arguments.
 * <li>Otherwise, this method checks to see if there is a
 *     {@linkplain Thread#getDefaultUncaughtExceptionHandler default
 *     uncaught exception handler} installed, and if so, its
 *     <code>uncaughtException</code> method is called with the same
 *     two arguments.
 * <li>Otherwise, this method determines if the <code>Throwable</code>
 *     argument is an instance of {@link ThreadDeath}. If so, nothing
 *     special is done. Otherwise, a message containing the
 *     thread's name, as returned from the thread's {@link
 *     Thread#getName getName} method, and a stack backtrace,
 *     using the <code>Throwable</code>'s {@link
 *     Throwable#printStackTrace printStackTrace} method, is
 *     printed to the {@linkplain System#err standard error stream}.
 * </ul>
 * <p>
 * Applications can override this method in subclasses of
 * <code>ThreadGroup</code> to provide alternative handling of
 * uncaught exceptions.
 *
 * @param   t   the thread that is about to exit.
 * @param   e   the uncaught exception.
 * @since   JDK1.0
 */
public void uncaughtException(Thread t, Throwable e) {
    if (parent != null) {
        parent.uncaughtException(t, e);
    } else {
        Thread.UncaughtExceptionHandler ueh =
            Thread.getDefaultUncaughtExceptionHandler();
        if (ueh != null) {
            ueh.uncaughtException(t, e);
        } else if (!(e instanceof ThreadDeath)) {
            System.err.print("Exception in thread \""
                             + t.getName() + "\" ");
            e.printStackTrace(System.err);
        }
    }
}
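The lookup order in the code above can be checked with a small plain-Java experiment. It is only a sketch: the class DispatchOrderDemo and the thread names are invented for illustration, and it assumes the threads run in an ordinary ThreadGroup that does not override uncaughtException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DispatchOrderDemo {
    /** Crashes two threads and records which handler each crash reached. */
    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> log.add("default:" + t.getName()));
        try {
            // This thread has its own handler, so the default handler is never consulted.
            Thread own = new Thread(() -> { throw new RuntimeException("a"); }, "with-handler");
            own.setUncaughtExceptionHandler((t, e) -> log.add("own:" + t.getName()));
            // This thread has none, so its ThreadGroup forwards to the default handler.
            Thread none = new Thread(() -> { throw new RuntimeException("b"); }, "no-handler");
            startAndJoin(own);
            startAndJoin(none);
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous); // restore global state
        }
        return new ArrayList<>(log);
    }

    private static void startAndJoin(Thread t) {
        t.start();
        try {
            t.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [own:with-handler, default:no-handler]
    }
}
```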
2. When is the Thread's UncaughtExceptionHandler set?

From the article "AMS: Activity Launch Flow" we know that launching an app goes through roughly the following steps:

(Figure: Android boot flow)

In RuntimeInit.commonInit(), the call Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler)) installs the default exception handler.

protected static final void commonInit() {
    if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");
    /*
     * set handlers; these apply to all threads in the VM. Apps can replace
     * the default handler, but not the pre handler.
     */
    LoggingHandler loggingHandler = new LoggingHandler();
    RuntimeHooks.setUncaughtExceptionPreHandler(loggingHandler);
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
    /*
     * Install a time zone supplier that uses the Android persistent time zone system property.
     */
    RuntimeHooks.setTimeZoneIdSupplier(() -> SystemProperties.get("persist.sys.timezone"));
    /*
     * Sets handler for java.util.logging to use Android log facilities.
     * The odd "new instance-and-then-throw-away" is a mirror of how
     * the "java.util.logging.config.class" system property works. We
     * can't use the system property here since the logger has almost
     * certainly already been initialized.
     */
    LogManager.getLogManager().reset();
    new AndroidConfig();
    /*
     * Sets the default HTTP User-Agent used by HttpURLConnection.
     */
    String userAgent = getDefaultUserAgent();
    System.setProperty("http.agent", userAgent);
    /*
     * Wire socket tagging to traffic stats.
     */
    NetworkManagementSocketTagger.install();
    initialized = true;
}
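The comment in commonInit() notes that apps can replace the default handler. A common way to do that without losing the system behavior is to wrap the previously installed handler and delegate to it. The sketch below shows the pattern in plain Java; ChainingHandler and its sink list are hypothetical names, and the list stands in for whatever log file or upload queue a real app would use.

```java
import java.util.ArrayList;
import java.util.List;

class ChainingHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler next; // previously installed handler, may be null
    private final List<String> sink;                     // hypothetical stand-in for a crash log

    ChainingHandler(Thread.UncaughtExceptionHandler next, List<String> sink) {
        this.next = next;
        this.sink = sink;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            sink.add("recorded:" + e.getMessage()); // do our own work first
        } finally {
            // Always delegate, even if recording failed, so the original behavior is kept.
            if (next != null) {
                next.uncaughtException(t, e);
            }
        }
    }

    /** Calls the handler directly with a fake "system" handler and returns the event order. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        Thread.UncaughtExceptionHandler system = (t, e) -> events.add("system:" + e.getMessage());
        new ChainingHandler(system, events)
                .uncaughtException(Thread.currentThread(), new RuntimeException("oops"));
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [recorded:oops, system:oops]
    }
}
```

In an app, such a wrapper would typically be installed with Thread.setDefaultUncaughtExceptionHandler(new ChainingHandler(Thread.getDefaultUncaughtExceptionHandler(), sink)), so that the handler set by commonInit() still runs last.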
3. The source of the crash: KillApplicationHandler

The source shows that KillApplicationHandler kills the process itself in its finally block.

private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
    public void uncaughtException(Thread t, Throwable e) {
        try {
            ensureLogging(t, e);
            if (mCrashing) return;
            mCrashing = true;
            if (ActivityThread.currentActivityThread() != null) {
                ActivityThread.currentActivityThread().stopProfiling();
            }
            ActivityManager.getService().handleApplicationCrash(
                    mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
        } catch (Throwable t2) {
            ...
        } finally {
            // Try everything to make sure this process goes away.
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
    }
}
4. Other work done in KillApplicationHandler

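uncaughtException also hands the crash to AMS.handleApplicationCrash() for further processing. There, addErrorToDropBox() writes a log entry into the system; it can record Java crashes, native crashes, ANRs and so on. The log directory is /data/system/dropbox.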

public void handleApplicationCrash(IBinder app,
        ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
    ProcessRecord r = findAppProcess(app, "Crash");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);
    handleApplicationCrashInner("crash", r, processName, crashInfo);
}

void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
        ApplicationErrorReport.CrashInfo crashInfo) {
    ...
    addErrorToDropBox(
            eventType, r, processName, null, null, null, null, null, null, crashInfo,
            new Float(loadingProgress), incrementalMetrics, null);
    mAppErrors.crashApplication(r, crashInfo);
}
5. Call flow of Android's Java crash handling
Uncaught exception -> JVM invokes ->
KillApplicationHandler.uncaughtException {
    try {
        ActivityManager.getService().handleApplicationCrash();  // hand over to AMS
    } finally { // exit the app process
        Process.killProcess(Process.myPid());
        System.exit(10);
    }
}
    -> AMS.handleApplicationCrash
    -> AMS.handleApplicationCrashInner {
        addErrorToDropBox(); // the system records the crash log
        mAppErrors.crashApplication();
    }
        -> AppErrors.crashApplication
        -> AppErrors.crashApplicationInner {
            // handle the crash
            if (!makeAppCrashingLocked()){
                return;
            }

            // show the crash dialog
            final Message msg = Message.obtain();
            msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;
            mService.mUiHandler.sendMessage(msg);

            // handle the dialog result: restart, quit, etc.
            int res = result.get(); // blocks
            switch (res) {}
        }

II. Native Crash Handling

(Figure: native crash handling flow)
1. Java-layer listener

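From the article "Binder (5): Service Registration Flow - Sending the Registration Request" we know:
After boot, the system_server process is started and SystemServer's main method is called. main starts AMS through startBootstrapServices, and startOtherServices then calls AMS's systemReady. In the systemReady callback, mActivityManagerService.startObservingNativeCrashes() registers the native crash listener.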

NativeCrashListener's run method opens a socket and listens on it.

public void startObservingNativeCrashes() {
    final NativeCrashListener ncl = new NativeCrashListener(this);
    ncl.start();
}

final class NativeCrashListener extends Thread {
    public void run() {
        final byte[] ackSignal = new byte[1];
        ...
        try {
            FileDescriptor serverFd = Os.socket(AF_UNIX, SOCK_STREAM, 0);
            final UnixSocketAddress sockAddr = UnixSocketAddress.createFileSystem(DEBUGGERD_SOCKET_PATH);
            Os.bind(serverFd, sockAddr);
            Os.listen(serverFd, 1);
            Os.chmod(DEBUGGERD_SOCKET_PATH, 0777);
            while (true) {
                FileDescriptor peerFd = null;
                try {
                    peerFd = Os.accept(serverFd, null /* peerAddress */);
                    if (peerFd != null) {
                        consumeNativeCrashData(peerFd);
                    }
                } catch (Exception e) {
                    ...
                } finally {
                    ...
                }
            }
        } catch (Exception e) {
            ...
        }
    }
}
2. Native-side reporting

Native programs are dynamically linked and need a linker to run. linker is Android's dynamic linker; see linker_main.cpp. The call chain __linker_init -> __linker_init_post_relocation -> debuggerd_init leads into the debuggerd_init function in debuggerd_handler.cpp.

/* This is the entry point for the linker, called from begin.S. This
 * method is responsible for fixing the linker's own relocations, and
 * then calling __linker_init_post_relocation().
 */
extern "C" ElfW(Addr) __linker_init(void* raw_args) {
    ...
    ElfW(Addr) start_address = __linker_init_post_relocation(args);
    return start_address;
}

static ElfW(Addr) __linker_init_post_relocation(KernelArgumentBlock& args) {
#ifdef __ANDROID__
    debuggerd_callbacks_t callbacks = {
        .get_abort_message = []() {
        return g_abort_message;
        },
        .post_dump = &notify_gdb_of_libraries,
    };
    debuggerd_init(&callbacks);
#endif
}

debuggerd_init registers debuggerd_signal_handler as the handler for crash signals.

void debuggerd_init(debuggerd_callbacks_t* callbacks) {
    ...
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    sigfillset(&action.sa_mask);
    action.sa_sigaction = debuggerd_signal_handler;
    action.sa_flags = SA_RESTART | SA_SIGINFO;

    // Use the alternate signal stack if available so we can catch stack overflows.
    action.sa_flags |= SA_ONSTACK;
    debuggerd_register_handlers(&action);
}

// /system/core/debuggerd/include/debuggerd/handler.h
static void __attribute__((__unused__)) debuggerd_register_handlers(struct sigaction* action) {
    sigaction(SIGABRT, action, nullptr);
    sigaction(SIGBUS, action, nullptr);
    sigaction(SIGFPE, action, nullptr);
    sigaction(SIGILL, action, nullptr);
    sigaction(SIGSEGV, action, nullptr);
#if defined(SIGSTKFLT)
    sigaction(SIGSTKFLT, action, nullptr);
#endif
    sigaction(SIGSYS, action, nullptr);
    sigaction(SIGTRAP, action, nullptr);
    sigaction(DEBUGGER_SIGNAL, action, nullptr);
}

debuggerd_signal_handler uses clone to create a child thread that launches crash_dump to record the crash log. Once the child thread has finished, resend_signal kills the current process.

static void debuggerd_signal_handler(int signal_number, siginfo_t* info, void* context) {
  ...
  // clone a child thread that launches crash_dump
  pid_t child_pid =
    clone(debuggerd_dispatch_pseudothread, pseudothread_stack,
          CLONE_THREAD | CLONE_SIGHAND | CLONE_VM | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID,
          &thread_info, nullptr, nullptr, &thread_info.pseudothread_tid);
  if (child_pid == -1) {
    fatal_errno("failed to spawn debuggerd dispatch thread");
  }

  // wait for the child thread to start
  futex_wait(&thread_info.pseudothread_tid, -1);

  // wait for the child thread to finish
  futex_wait(&thread_info.pseudothread_tid, child_pid);

  ...
  if (info->si_signo == DEBUGGER_SIGNAL) {
    ...
  } else {
    // resend the signal
    resend_signal(info);
  }
}

static void resend_signal(siginfo_t* info) {
  // Signals can either be fatal or nonfatal.
  // For fatal signals, crash_dump will send us the signal we crashed with
  // before resuming us, so that processes using waitpid on us will see that we
  // exited with the correct exit status (e.g. so that sh will report
  // "Segmentation fault" instead of "Killed"). For this to work, we need
  // to deregister our signal handler for that signal before continuing.
  if (info->si_signo != DEBUGGER_SIGNAL) {
    signal(info->si_signo, SIG_DFL); // restore the default disposition, which kills the current process
    int rc = syscall(SYS_rt_tgsigqueueinfo, __getpid(), __gettid(), info->si_signo, info);
    if (rc != 0) {
      fatal_errno("failed to resend signal during crash");
    }
  }
}

In crash_dump's main function, a child process is forked to talk to tombstoned and record the crash log, and AMS is notified of the native crash.

// /system/core/debuggerd/crash_dump.cpp
int main(int argc, char** argv) {
  ...
  // fork a child process
  pid_t forkpid = fork();
  if (forkpid == -1) {
    PLOG(FATAL) << "fork failed";
  } else if (forkpid == 0) {
    fork_exit_read.reset();
  } else {
    // wait for the child process to finish
    fork_exit_write.reset();
    char buf;
    TEMP_FAILURE_RETRY(read(fork_exit_read.get(), &buf, sizeof(buf)));
    _exit(0);
  }
  
  ...
  // connect to tombstoned and write the log
  {
    ATRACE_NAME("tombstoned_connect");
    LOG(INFO) << "obtaining output fd from tombstoned, type: " << dump_type;
    g_tombstoned_connected =
        tombstoned_connect(g_target_thread, &g_tombstoned_socket, &g_output_fd, dump_type);
  }

  if (g_tombstoned_connected) {
    if (TEMP_FAILURE_RETRY(dup2(g_output_fd.get(), STDOUT_FILENO)) == -1) {
      PLOG(ERROR) << "failed to dup2 output fd (" << g_output_fd.get() << ") to STDOUT_FILENO";
    }
  } else {
    unique_fd devnull(TEMP_FAILURE_RETRY(open("/dev/null", O_RDWR)));
    TEMP_FAILURE_RETRY(dup2(devnull.get(), STDOUT_FILENO));
    g_output_fd = std::move(devnull);
  }

  ...
  // notify AMS
  if (fatal_signal) {
    // Don't try to notify ActivityManager if it just crashed, or we might hang until timeout.
    if (thread_info[target_process].thread_name != "system_server") {
      activity_manager_notify(target_process, signo, amfd_data);
    }
  }

  ...
  // tell tombstoned that processing is complete
  if (g_tombstoned_connected && !tombstoned_notify_completion(g_tombstoned_socket.get())) {
    LOG(ERROR) << "failed to notify tombstoned of completion";
  }

  return 0;
}

III. Crash Optimization (Java layer)

1. Record log information:

Record device information, memory information, the crash log, a screenshot and so on.
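As a rough illustration of what such a record could contain, the plain-Java sketch below assembles a text report from the crashing thread, heap usage and the stack trace. CrashReport and its field names are hypothetical. On Android the device fields would more likely come from android.os.Build, and a screenshot would need Android APIs, so both are left out here.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class CrashReport {
    /** Builds a plain-text report for the given thread and throwable. */
    static String build(Thread thread, Throwable error) {
        // Capture the full stack trace, including causes, as a string.
        StringWriter trace = new StringWriter();
        error.printStackTrace(new PrintWriter(trace, true));

        // Heap figures as seen by the runtime at the moment of the crash.
        Runtime rt = Runtime.getRuntime();
        long usedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;
        long maxKb = rt.maxMemory() / 1024;

        return "thread=" + thread.getName() + "\n"
                + "usedHeapKb=" + usedKb + "\n"
                + "maxHeapKb=" + maxKb + "\n"
                + "os=" + System.getProperty("os.name") + "\n"
                + "stack=\n" + trace;
    }

    public static void main(String[] args) {
        System.out.println(build(Thread.currentThread(), new IllegalStateException("demo")));
    }
}
```

Building the report as a single string keeps the work inside the handler short; writing it to disk or uploading it can then happen in one step before the process exits.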

2. Make crashes friendlier:

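By default a crash closes the app abruptly. A custom handler can restart the app's UI instead, so that the user is thrown out of the app less often.
Note that when restarting the app, the original process must be exited to avoid other problems.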

Intent intent = new Intent(BaseApplication.this, MainActivity.class);
intent.addFlags(Intent.FLAG_ACTIVITY_NEW_TASK
        | Intent.FLAG_ACTIVITY_CLEAR_TASK |
        Intent.FLAG_ACTIVITY_RESET_TASK_IF_NEEDED);
if (intent.getComponent() != null) {
    // simulate a launch from the Launcher
    intent.setAction(Intent.ACTION_MAIN);
    intent.addCategory(Intent.CATEGORY_LAUNCHER);
}
BaseApplication.this.startActivity(intent);
android.os.Process.killProcess(android.os.Process.myPid());
System.exit(10);
3. Don't crash:

During a crash, restart the looper on the main thread to keep the app from exiting.

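How it works: an uncaught exception propagates up the call stack. The main thread runs a Looper loop, so the exception makes that loop exit, and the JVM eventually calls uncaughtException(). Calling Looper.loop() again on the main thread at that point restarts the loop, and the app can go on handling events.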

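Note: if the crash happens while an Activity is being shown, the screen goes black. One workaround is to hook and replace ActivityThread.mH.mCallback, wrap the Activity lifecycle calls in try/catch, and close the Activity that was about to be shown if an exception occurs.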

public class CrashHandler implements Thread.UncaughtExceptionHandler {
    @Override
    public void uncaughtException(@NonNull Thread thread, @NonNull Throwable ex) {
        handleExceptionReocrd(ex); // record the log automatically
        try { // let the user record the log
            if (listener != null) listener.recordException(ex);
        } catch (Throwable e) {
            e.printStackTrace();
        }

        try { // whether to restart the app; restarting requires killing the process
            if (listener != null && listener.restartApp()) return;
        } catch (Exception e) {
            Log.d(TAG, "uncaughtException->handleByUser:" + Log.getStackTraceString(e));
        }

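        // not restarted; check whether safe mode is enabled
        if (safeModelEnable) {
            enterSafeModel(thread);
        } else if (mDefaultHandler != null) {
            // hand over to the system handler
            Log.d(TAG, "uncaughtException: handing over to the system handler");
            mDefaultHandler.uncaughtException(thread, ex);
        } else {
            // no system handler; exit the process directly
            Log.w(TAG, "uncaughtException: exiting the process");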
            android.os.Process.killProcess(android.os.Process.myPid());
            System.exit(10);
        }
    }

    public void enterSafeModel(Thread thread) {
        Log.w(CrashHandler.TAG, "setSafe--- thread-----" + thread.getName());
        if (thread == Looper.getMainLooper().getThread()) {
            while (true) { // keep re-entering the loop
                try {
                    Log.e(TAG, "safeMode: abnormal exit detected, restarting the looper");
                    Looper.loop();
                } catch (Throwable e) {
                    Log.e(TAG, "safeMode: abnormal exit detected: " + Log.getStackTraceString(e));
                }
            }
        }
    }
}
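The idea behind enterSafeModel can be tried outside Android with a toy message loop. The sketch below is a simulation under stated assumptions, not the Android implementation: SafeLoopDemo, loop and runSafely are invented names, and loop() is a simplified stand-in for Looper.loop() that returns once its queue is empty, whereas the real Looper.loop() keeps waiting for new messages.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class SafeLoopDemo {
    /** Stand-in for Looper.loop(): runs queued tasks; a throwing task escapes the loop. */
    static void loop(Queue<Runnable> queue) {
        Runnable task;
        while ((task = queue.poll()) != null) {
            task.run();
        }
    }

    /** Re-enters the loop after every failure, as enterSafeModel does with Looper.loop(). */
    static void runSafely(Queue<Runnable> queue, List<String> log) {
        while (true) {
            try {
                loop(queue);
                return; // queue drained without further errors
            } catch (RuntimeException e) {
                // The failing task is already removed, so the next pass continues after it.
                log.add("caught:" + e.getMessage());
            }
        }
    }

    /** Queues a good task, a crashing task and another good task, then runs them safely. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Queue<Runnable> queue = new ArrayDeque<>();
        queue.add(() -> log.add("a"));
        queue.add(() -> { throw new IllegalStateException("bad"); });
        queue.add(() -> log.add("b"));
        runSafely(queue, log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, caught:bad, b]
    }
}
```

The task that threw is lost, and whatever state it left half-updated stays that way, which is the same trade-off the black-screen note above describes for Activities.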