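I ran into a bizarre situation. An Android app we build jointly with a third party crashes with very low probability. The crash log makes it clear that the exception is thrown in our process and brings it down, but that exception originates on the server side and reaches us through a binder call. The third party responsible for the server refuses to accept this and insists that it is our process that died. To make matters worse, the exception delivered to the Client side no longer carries the server's call stack, so no amount of analysis ended the finger-pointing.
Rather than keep arguing, it seemed more direct to find a way to make the failure surface in the service process itself.
First, look at how binder propagates exceptions across processes, based on the code generated from AIDL:
Client side: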
private static class Proxy implements personal.jayhou.mydemos.aidl.ICallback {
    private android.os.IBinder mRemote;
    Proxy(android.os.IBinder remote) {
        mRemote = remote;
    }
    @Override
    public android.os.IBinder asBinder() {
        return mRemote;
    }
    public java.lang.String getInterfaceDescriptor() {
        return DESCRIPTOR;
    }
    @Override
    public void callback() throws android.os.RemoteException {
        android.os.Parcel _data = android.os.Parcel.obtain();
        android.os.Parcel _reply = android.os.Parcel.obtain();
        try {
            _data.writeInterfaceToken(DESCRIPTOR);
            mRemote.transact(Stub.TRANSACTION_callback, _data, _reply, 0);
            _reply.readException();
        } finally {
            _reply.recycle();
            _data.recycle();
        }
    }
}
Service side:
@Override
public boolean onTransact(int code, android.os.Parcel data, android.os.Parcel reply, int flags) throws android.os.RemoteException {
    java.lang.String descriptor = DESCRIPTOR;
    switch (code) {
        case INTERFACE_TRANSACTION: {
            reply.writeString(descriptor);
            return true;
        }
        case TRANSACTION_callback: {
            data.enforceInterface(descriptor);
            this.callback();
            reply.writeNoException();
            return true;
        }
        default: {
            return super.onTransact(code, data, reply, flags);
        }
    }
}
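As shown, after the Client calls mRemote.transact, binder invokes onTransact on the Service side. If the server's business logic (here this.callback()) throws, the exception propagates up to: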
android/os/Binder.java
// Entry point from android_util_Binder.cpp's onTransact
private boolean execTransact(int code, long dataObj, long replyObj,
        int flags) {
    Parcel data = Parcel.obtain(dataObj);
    Parcel reply = Parcel.obtain(replyObj);
    // theoretically, we should call transact, which will call onTransact,
    // but all that does is rewind it, and we just got these from an IPC,
    // so we'll just call it directly.
    boolean res;
    // Log any exceptions as warnings, don't silently suppress them.
    // If the call was FLAG_ONEWAY then these exceptions disappear into the ether.
    final boolean tracingEnabled = Binder.isTracingEnabled();
    try {
        if (tracingEnabled) {
            Trace.traceBegin(Trace.TRACE_TAG_ALWAYS, getClass().getName() + ":" + code);
        }
        res = onTransact(code, data, reply, flags);
    } catch (RemoteException|RuntimeException e) {
        if (LOG_RUNTIME_EXCEPTION) {
            Log.w(TAG, "Caught a RuntimeException from the binder stub implementation.", e);
        }
        if ((flags & FLAG_ONEWAY) != 0) {
            if (e instanceof RemoteException) {
                Log.w(TAG, "Binder call failed.", e);
            } else {
                Log.w(TAG, "Caught a RuntimeException from the binder stub implementation.", e);
            }
        } else {
            reply.setDataPosition(0);
            reply.writeException(e);
        }
        res = true;
    } finally {
        if (tracingEnabled) {
            Trace.traceEnd(Trace.TRACE_TAG_ALWAYS);
        }
    }
    checkParcel(this, code, reply, "Unreasonably large binder reply buffer");
    reply.recycle();
    data.recycle();
    // Just in case -- we are done with the IPC, so there should be no more strict
    // mode violations that have gathered for this thread. Either they have been
    // parceled and are now in transport off to the caller, or we are returning back
    // to the main transaction loop to wait for another incoming transaction. Either
    // way, strict mode begone!
    StrictMode.clearGatheredViolations();
    return res;
}
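Here both RemoteException and RuntimeException are caught. For a call that is not oneway, the exception is written into reply and sent back to the Client side, which then executes _reply.readException():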
/**
 * Special function for reading an exception result from the header of
 * a parcel, to be used after receiving the result of a transaction. This
 * will throw the exception for you if it had been written to the Parcel,
 * otherwise return and let you read the normal result data from the Parcel.
 *
 * @see #writeException
 * @see #writeNoException
 */
public final void readException() {
    int code = readExceptionCode();
    if (code != 0) {
        String msg = readString();
        readException(code, msg);
    }
}
/**
 * Throw an exception with the given message. Not intended for use
 * outside the Parcel class.
 *
 * @param code Used to determine which exception class to throw.
 * @param msg The exception message.
 */
public final void readException(int code, String msg) {
    switch (code) {
        case EX_PARCELABLE:
            if (readInt() > 0) {
                SneakyThrow.sneakyThrow(
                        (Exception) readParcelable(Parcelable.class.getClassLoader()));
            } else {
                throw new RuntimeException(msg + " [missing Parcelable]");
            }
        case EX_SECURITY:
            throw new SecurityException(msg);
        case EX_BAD_PARCELABLE:
            throw new BadParcelableException(msg);
        case EX_ILLEGAL_ARGUMENT:
            throw new IllegalArgumentException(msg);
        case EX_NULL_POINTER:
            throw new NullPointerException(msg);
        case EX_ILLEGAL_STATE:
            throw new IllegalStateException(msg);
        case EX_NETWORK_MAIN_THREAD:
            throw new NetworkOnMainThreadException();
        case EX_UNSUPPORTED_OPERATION:
            throw new UnsupportedOperationException(msg);
        case EX_SERVICE_SPECIFIC:
            throw new ServiceSpecificException(readInt(), msg);
    }
    throw new RuntimeException("Unknown exception code: " + code
            + " msg " + msg);
}
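The exception is therefore thrown in the Client process, constructed from nothing more than a code and a message, which is awkward when the other party refuses to listen to reason.
Notice this part: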
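To see why the Client-side exception has no server call stack, here is a minimal plain-Java sketch of the code-plus-message round trip. It is a toy model, not the Android implementation: FakeReply, CODE_ILLEGAL_STATE, serverBusinessLogic and the other names are hypothetical stand-ins, and only one exception type is handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ExceptionMarshalDemo {
    // Hypothetical codes for this sketch; not the real Parcel constants.
    static final int CODE_NONE = 0;
    static final int CODE_ILLEGAL_STATE = 1;

    /** Toy stand-in for the reply Parcel: it only carries a code and a message. */
    static final class FakeReply {
        private final Deque<Object> fields = new ArrayDeque<>();

        void writeNoException() {
            fields.add(CODE_NONE);
        }

        // Only the code and the message are written; the stack trace is dropped.
        void writeException(RuntimeException e) {
            fields.clear();
            fields.add(CODE_ILLEGAL_STATE);
            fields.add(String.valueOf(e.getMessage()));
        }

        // A brand-new exception is constructed on the reading side.
        void readException() {
            int code = (Integer) fields.poll();
            if (code != CODE_NONE) {
                String msg = (String) fields.poll();
                throw new IllegalStateException(msg);
            }
        }
    }

    /** Hypothetical server-side business method that fails. */
    static void serverBusinessLogic() {
        throw new IllegalStateException("server-side failure");
    }

    /** Toy version of the server's catch-and-write step. */
    static FakeReply serverExecTransact() {
        FakeReply reply = new FakeReply();
        try {
            serverBusinessLogic();
            reply.writeNoException();
        } catch (RuntimeException e) {
            reply.writeException(e);
        }
        return reply;
    }

    /** Toy client proxy: returns the exception the client would see, or null. */
    static RuntimeException clientCall() {
        FakeReply reply = serverExecTransact();
        try {
            reply.readException();
            return null;
        } catch (RuntimeException e) {
            return e;
        }
    }

    /** True if any frame of t's stack trace belongs to the named method. */
    static boolean stackMentions(Throwable t, String methodName) {
        for (StackTraceElement el : t.getStackTrace()) {
            if (el.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RuntimeException e = clientCall();
        System.out.println("message: " + e.getMessage());
        System.out.println("stack mentions serverBusinessLogic: "
                + stackMentions(e, "serverBusinessLogic"));
        System.out.println("stack mentions readException: "
                + stackMentions(e, "readException"));
    }
}
```

In this sketch the rethrown exception keeps the server's message, but its frames all come from the reading side, which matches the crash log symptom described earlier.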
android/os/Binder.java
if ((flags & FLAG_ONEWAY) != 0) {
    if (e instanceof RemoteException) {
        Log.w(TAG, "Binder call failed.", e);
    } else {
        Log.w(TAG, "Caught a RuntimeException from the binder stub implementation.", e);
    }
} else {
    reply.setDataPosition(0);
    reply.writeException(e);
}
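If the call is oneway, the exception thrown by the server is caught but is not written into reply. In other words, when the server hits an exception, the server process does not die, but the Client is not dragged down either, and the exception is logged in the Service process, so there is nowhere left to shift the blame. So I told the other side to reproduce the issue again after the SDK update, got them to change the call to oneway, and am now quietly waiting for the problem to show up again... 😄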
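The two branches can be compared in a small plain-Java sketch. It is a simplification under stated assumptions: OnewayDispatchDemo, serverLog, and clientCrashes are hypothetical names, the reply is reduced to a single message string, and a real oneway call does not wait for a reply at all. FLAG_ONEWAY uses the same value as android.os.IBinder.FLAG_ONEWAY; in AIDL, marking a method or interface with the oneway keyword makes the generated proxy pass that flag to transact.

```java
import java.util.ArrayList;
import java.util.List;

public class OnewayDispatchDemo {
    // Same value as android.os.IBinder.FLAG_ONEWAY.
    static final int FLAG_ONEWAY = 0x00000001;

    // Stand-in for the service process's log output.
    static final List<String> serverLog = new ArrayList<>();

    /**
     * Toy version of the catch block in execTransact.
     * Returns the message written into the reply, or null if nothing was written.
     */
    static String execTransact(Runnable stubImpl, int flags) {
        try {
            stubImpl.run();
            return null;
        } catch (RuntimeException e) {
            if ((flags & FLAG_ONEWAY) != 0) {
                // oneway: log on the server side, write nothing to the reply
                serverLog.add("Caught a RuntimeException from the binder stub implementation. " + e);
                return null;
            }
            // two-way: the exception travels back to the caller
            return e.getMessage();
        }
    }

    /** Toy client proxy: rethrows whatever the reply carries. */
    static void clientCall(Runnable stubImpl, int flags) {
        String replyError = execTransact(stubImpl, flags);
        if (replyError != null) {
            throw new RuntimeException(replyError);
        }
    }

    /** Returns true if the call ends with an exception in the "client". */
    static boolean clientCrashes(Runnable stubImpl, int flags) {
        try {
            clientCall(stubImpl, flags);
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Runnable failing = () -> {
            throw new IllegalStateException("server-side failure");
        };
        System.out.println("two-way call throws in client: " + clientCrashes(failing, 0));
        System.out.println("oneway call throws in client: " + clientCrashes(failing, FLAG_ONEWAY));
        System.out.println("server log entries: " + serverLog.size());
    }
}
```

The trade-off is that a oneway caller gets no result and no error signal, so this only suits calls such as callback() that return void and whose failure the Client does not need to handle.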