ANR是如何产生的

大家在在使用一个Android应用的过程中应该都会遇到一件让人讨厌的事情,那就是用着用着弹出来像下面一个对话框,让你选择等待还是确定,选择了确定就会强制结束你正在使用的应用,而选择等待之后回到应用也很有可能还是无法操作应用,这就是常说的ANR(Application Not Responding),ANR一旦发生非常影响用户的体验。

ANR

那么ANR是怎么产生的呢?既然应用已经无响应了,那一定不是应用自己弹出这个对话框,一定是系统监控到应用一段时间无响应后弹出次对话框告知用户并等待用户的操作。那么我们就在Android源码里面搜索ANR和Dialog关键字grep -ri ANR ./ | grep -i dialog,然后得到了这样的结果./base/services/core/java/com/android/server/am/AppNotRespondingDialog.java: app.anrDialog = null;,搜索结果太多,这里省略了其它结果,但是可以确定类AppNotRespondingDialog.java就是我们看到的ANR对话框。

现在我们只要找到这个对话框在什么位置展示出来就可以找到系统捕获ANR的逻辑,尝试搜索创建AppNotRespondingDialog.java的位置,找到了这样一个方法:

        // services/core/java/com/android/server/am/ProcessRecord.java
        void showAnrDialogs(AppNotRespondingDialog.Data data) {
            List<Context> contexts = getDisplayContexts(isSilentAnr() /* lastUsedOnly */);
            mAnrDialogs = new ArrayList<>();
            for (int i = contexts.size() - 1; i >= 0; i--) {
                final Context c = contexts.get(i);
                mAnrDialogs.add(new AppNotRespondingDialog(mService, c, data));
            }
            mService.mUiHandler.post(() -> {
                List<AppNotRespondingDialog> dialogs;
                synchronized (mService) {
                    dialogs = mAnrDialogs;
                }
                if (dialogs != null) {
                    forAllDialogs(dialogs, Dialog::show);
                }
            });
        }

这就是我们要找的代码,接着找哪里调用这个方法,用方法名继续搜索

    // services/core/java/com/android/server/am/AppErrors.java
    void handleShowAnrUi(Message msg) {
        ...
            if (mService.mAtmInternal.canShowErrorDialogs() || showBackground) {
                proc.getDialogController().showAnrDialogs(data);
            } else {
                MetricsLogger.action(mContext, MetricsProto.MetricsEvent.ACTION_APP_ANR,
                        AppNotRespondingDialog.CANT_SHOW);
                // Just kill the app if there is no dialog to be shown.
                mService.killAppAtUsersRequest(proc);
            }
        ...
    }

接着往下找,发现是通过发送Handler消息到AMS中出发对话框弹出

    // services/core/java/com/android/server/am/ActivityManagerService.java
    final class UiHandler extends Handler {
        ...

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
                    ...
                    case SHOW_NOT_RESPONDING_UI_MSG: {
                      mAppErrors.handleShowAnrUi(msg);
                      ensureBootCompleted();
                   } break;
              }
              ...
        }
    }

接下来需要找到哪里发送了SHOW_NOT_RESPONDING_UI_MSG消息,感觉已经接近捕获ANR的逻辑了,发送SHOW_NOT_RESPONDING_UI_MSG的地方在类ProcessRecord.javavoid appNotResponding(...)方法中,这个方法很长做了一些dump栈信息的操作,最后发送了对话框消息,在实际发送消息之前还做了一个判断isSilentAnr(),含义是是否是沉默的ANR,如果是沉默的ANR将直接杀死应用进程,void appNotResponding(...)方法节选代码如下:

    void appNotResponding(String activityShortComponentName, ApplicationInfo aInfo,
            String parentShortComponentName, WindowProcessController parentProcess,
            boolean aboveSystem, String annotation, boolean onlyDumpSelf) {
        ....
        synchronized (mService) {
            // mBatteryStatsService can be null if the AMS is constructed with injector only. This
            // will only happen in tests.
            if (mService.mBatteryStatsService != null) {
                mService.mBatteryStatsService.noteProcessAnr(processName, uid);
            }

            if (isSilentAnr() && !isDebugging()) {
                kill("bg anr", ApplicationExitInfo.REASON_ANR, true);
                return;
            }

            // Set the app's notResponding state, and look up the errorReportReceiver
            makeAppNotRespondingLocked(activityShortComponentName,
                    annotation != null ? "ANR " + annotation : "ANR", info.toString());

            // mUiHandler can be null if the AMS is constructed with injector only. This will only
            // happen in tests.
            if (mService.mUiHandler != null) {
                // Bring up the infamous App Not Responding dialog
                Message msg = Message.obtain();
                msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
                msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);

                mService.mUiHandler.sendMessage(msg);
            }
        }

找到这里还是没有发现任何关于什么情况下会触发ANR的逻辑,接着看哪里调用了void appNotResponding(...),这一次搜索下来发现调用的地方太多了,说明我们已经找到触发ANR的逻辑,需要查看每一个调用的地方,刨去间接引用之后有下面几个地方的触发逻辑:

  • ContentProvider
    ContentResolver暴露了一个接口appNotRespondingViaProvider,这个方法的实现没有找到在哪里,但是根据方法名称可以判断是触发ANR的,这个方法的调用是在ContentProviderClient.java
    private class NotRespondingRunnable implements Runnable {
        @Override
        public void run() {
            Log.w(TAG, "Detected provider not responding: " + mContentProvider);
            mContentResolver.appNotRespondingViaProvider(mContentProvider);
        }
    }

    private void beforeRemote() {
        if (mAnrRunnable != null) {
            sAnrHandler.postDelayed(mAnrRunnable, mAnrTimeout);
        }
    }

    /** See {@link ContentProvider#query ContentProvider.query} */
    @Override
    public @Nullable Cursor query(@NonNull Uri uri, @Nullable String[] projection,
            Bundle queryArgs, @Nullable CancellationSignal cancellationSignal)
                    throws RemoteException {
        Objects.requireNonNull(uri, "url");

        beforeRemote();
        ....
    }

    /** See {@link ContentProvider#getType ContentProvider.getType} */
    @Override
    public @Nullable String getType(@NonNull Uri url) throws RemoteException {
        Objects.requireNonNull(url, "url");

        beforeRemote();
        ...
    }
    ...

可以看到在调用ContentProvider的query、getType等方法时会首先发起一个延时任务,任务等内容就是触发ANR。

  • Input Dispatching Timed Out
    在AMS当中有这样一个方法
    /**
     * Handle input dispatching timeouts.
     * @return whether input dispatching should be aborted or not.
     */
    boolean inputDispatchingTimedOut(ProcessRecord proc, String activityShortComponentName,
            ApplicationInfo aInfo, String parentShortComponentName,
            WindowProcessController parentProcess, boolean aboveSystem, String reason) {
        ....
        if (proc != null) {
            synchronized (this) {
                if (proc.isDebugging()) {
                    return false;
                }

                if (proc.getActiveInstrumentation() != null) {
                    Bundle info = new Bundle();
                    info.putString("shortMsg", "keyDispatchingTimedOut");
                    info.putString("longMsg", annotation);
                    finishInstrumentationLocked(proc, Activity.RESULT_CANCELED, info);
                    return true;
                }
            }
            mAnrHelper.appNotResponding(proc, activityShortComponentName, aInfo,
                    parentShortComponentName, parentProcess, aboveSystem, annotation);
        }

        return true;
    }

一路搜索下去最终找到下面这样一段代码,其中onAnrLocked(mAwaitedFocusedApplication);调用的就是我们刚才找到的上面的方法:

// Default input dispatching timeout if there is no focused application or paused window
// from which to determine an appropriate dispatching timeout.
constexpr std::chrono::nanoseconds DEFAULT_INPUT_DISPATCHING_TIMEOUT = 5s;

/**
 * Check if any of the connections' wait queues have events that are too old.
 * If we waited for events to be ack'ed for more than the window timeout, raise an ANR.
 * Return the time at which we should wake up next.
 */
nsecs_t InputDispatcher::processAnrsLocked() {
    const nsecs_t currentTime = now();
    nsecs_t nextAnrCheck = LONG_LONG_MAX;
    // Check if we are waiting for a focused window to appear. Raise ANR if waited too long
    if (mNoFocusedWindowTimeoutTime.has_value() && mAwaitedFocusedApplication != nullptr) {
        if (currentTime >= *mNoFocusedWindowTimeoutTime) {
            onAnrLocked(mAwaitedFocusedApplication);
            mAwaitedFocusedApplication.clear();
            return LONG_LONG_MIN;
        } else {
            // Keep waiting
            const nsecs_t millisRemaining = ns2ms(*mNoFocusedWindowTimeoutTime - currentTime);
            ALOGW("Still no focused window. Will drop the event in %" PRId64 "ms", millisRemaining);
            nextAnrCheck = *mNoFocusedWindowTimeoutTime;
        }
    }

    // Check if any connection ANRs are due
    nextAnrCheck = std::min(nextAnrCheck, mAnrTracker.firstTimeout());
    if (currentTime < nextAnrCheck) { // most likely scenario
        return nextAnrCheck;          // everything is normal. Let's check again at nextAnrCheck
    }

    // If we reached here, we have an unresponsive connection.
    sp<Connection> connection = getConnectionLocked(mAnrTracker.firstToken());
    if (connection == nullptr) {
        ALOGE("Could not find connection for entry %" PRId64, mAnrTracker.firstTimeout());
        return nextAnrCheck;
    }
    connection->responsive = false;
    // Stop waking up for this unresponsive connection
    mAnrTracker.eraseToken(connection->inputChannel->getConnectionToken());
    onAnrLocked(connection);
    return LONG_LONG_MIN;
}

也就是说Dispatch一个按键时间后如果5s中没有回应就触发相应应用进程的ANR。

  • BroadCast Timed Out
    在类BroadcastQueue.java中维护这广播队列以及对广播的分发处理,检查广播的处理状态,下面的方法就是不断的检查当前处理中的广播的状态:
    final void broadcastTimeoutLocked(boolean fromMsg) {
        ...
        BroadcastRecord r = mDispatcher.getActiveBroadcastLocked();
        if (fromMsg) {
            if (!mService.mProcessesReady) {
                // Only process broadcast timeouts if the system is ready; some early
                // broadcasts do heavy work setting up system facilities
                return;
            }

            // If the broadcast is generally exempt from timeout tracking, we're done
            if (r.timeoutExempt) {
                if (DEBUG_BROADCAST) {
                    Slog.i(TAG_BROADCAST, "Broadcast timeout but it's exempt: "
                            + r.intent.getAction());
                }
                return;
            }

            long timeoutTime = r.receiverTime + mConstants.TIMEOUT;
            if (timeoutTime > now) {
                // We can observe premature timeouts because we do not cancel and reset the
                // broadcast timeout message after each receiver finishes.  Instead, we set up
                // an initial timeout then kick it down the road a little further as needed
                // when it expires.
                if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
                        "Premature timeout ["
                        + mQueueName + "] @ " + now + ": resetting BROADCAST_TIMEOUT_MSG for "
                        + timeoutTime);
                setBroadcastTimeoutLocked(timeoutTime);
                return;
            }
        }
        ...
        if (!debugging && anrMessage != null) {
            mService.mAnrHelper.appNotResponding(app, anrMessage);
        }
  }

把一个广发交给接收者时调用这个方法检查是否处理超时,如果超时则触发接收应用进程ANR,超时时间是10秒钟。

  • Service Time Out
    // How long we wait for a service to finish executing.
    static final int SERVICE_TIMEOUT = 20*1000;

    // How long we wait for a service to finish executing.
    static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
    ...
    void serviceTimeout(ProcessRecord proc) {
        String anrMessage = null;
        synchronized(mAm) {
            if (proc.isDebugging()) {
                // The app's being debugged, ignore timeout.
                return;
            }
            if (proc.executingServices.size() == 0 || proc.thread == null) {
                return;
            }
            final long now = SystemClock.uptimeMillis();
            final long maxTime =  now -
                    (proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
            ServiceRecord timeout = null;
            long nextTime = 0;
            for (int i=proc.executingServices.size()-1; i>=0; i--) {
                ServiceRecord sr = proc.executingServices.valueAt(i);
                if (sr.executingStart < maxTime) {
                    timeout = sr;
                    break;
                }
                if (sr.executingStart > nextTime) {
                    nextTime = sr.executingStart;
                }
            }
            if (timeout != null && mAm.mProcessList.mLruProcesses.contains(proc)) {
                Slog.w(TAG, "Timeout executing service: " + timeout);
                StringWriter sw = new StringWriter();
                PrintWriter pw = new FastPrintWriter(sw, false, 1024);
                pw.println(timeout);
                timeout.dump(pw, "    ");
                pw.close();
                mLastAnrDump = sw.toString();
                mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
                mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
                anrMessage = "executing service " + timeout.shortInstanceName;
            } else {
                Message msg = mAm.mHandler.obtainMessage(
                        ActivityManagerService.SERVICE_TIMEOUT_MSG);
                msg.obj = proc;
                mAm.mHandler.sendMessageAtTime(msg, proc.execServicesFg
                        ? (nextTime+SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT));
            }
        }

        if (anrMessage != null) {
            mAm.mAnrHelper.appNotResponding(proc, anrMessage);
        }
    }

这个方法的大致意思是比较Service生命周期函数开始执行时间与当前时间的差是否超时,对于前台服务是20s,后台服务则是20s的十倍,也就是200s,如果超过这个时间则触发ANR。

根据对源码的搜索情况看触发ANR的逻辑有这四种情况:

  • ContentProvider
  • Input Dispatching Timed Out
  • BroadCast Timed Out
  • Service Time Out
    所以我们写代码时在上面四种执行逻辑里面一定不要执行耗时操作。
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 216,324评论 6 498
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 92,356评论 3 392
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 162,328评论 0 353
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 58,147评论 1 292
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 67,160评论 6 388
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 51,115评论 1 296
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 40,025评论 3 417
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 38,867评论 0 274
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 45,307评论 1 310
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 37,528评论 2 332
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,688评论 1 348
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 35,409评论 5 343
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 41,001评论 3 325
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,657评论 0 22
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,811评论 1 268
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 47,685评论 2 368
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 44,573评论 2 353

推荐阅读更多精彩内容