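AddressSanitizer (ASAN for short) is a convenient tool for detecting and diagnosing C/C++ memory errors. The WebRTC project integrates ASAN, and a single GN option, is_asan, turns it on or off for the whole build. is_asan defaults to false. Adding the line is_asan = true to args.gn enables ASAN for the whole project; adding the line is_asan = false, or leaving is_asan unset, disables it.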
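The Linux debug build of the OpenRTCClient project has ASAN enabled. When everything is configured properly and a C/C++ program hits a memory error, ASAN invokes a symbolizer to convert the addresses in the relevant stacks (for example, the allocation stack and the free stack) into symbol names, file names, and line numbers. We can point the ASAN_SYMBOLIZER_PATH environment variable at the llvm symbolizer of our choice, for example export ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-11, to tell ASAN which tool to use for symbolization. When ASAN_SYMBOLIZER_PATH is not set, ASAN searches each directory listed in the PATH environment variable for an executable named llvm-symbolizer. If ASAN_SYMBOLIZER_PATH does not point at a suitable llvm symbolizer and no executable named llvm-symbolizer can be found in PATH either, ASAN can only print the raw addresses.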
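The lookup order just described can be sketched in a few lines of C++. This is only an illustration of the described behavior, not the runtime's own code: the names FindInPath and ResolveSymbolizer are hypothetical, and the sketch assumes a colon-separated PATH and C++17.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns the first "<dir>/<name>" in a colon-separated
// PATH-style list that names an existing regular file.
std::optional<std::string> FindInPath(const std::string &path_list,
                                      const std::string &name) {
  std::istringstream dirs(path_list);
  std::string dir;
  while (std::getline(dirs, dir, ':')) {
    if (dir.empty()) continue;
    std::error_code ec;
    fs::path candidate = fs::path(dir) / name;
    if (fs::is_regular_file(candidate, ec)) return candidate.string();
  }
  return std::nullopt;
}

// Hypothetical helper mirroring the order described in the text: an explicit
// ASAN_SYMBOLIZER_PATH value wins, otherwise PATH is searched for
// llvm-symbolizer. An empty result means only raw addresses can be printed.
std::optional<std::string> ResolveSymbolizer(const char *env_symbolizer,
                                             const char *env_path) {
  if (env_symbolizer && *env_symbolizer) return std::string(env_symbolizer);
  if (!env_path) return std::nullopt;
  return FindInPath(env_path, "llvm-symbolizer");
}
```

A caller would pass std::getenv("ASAN_SYMBOLIZER_PATH") and std::getenv("PATH"). Note that the explicit value is returned without checking that the file exists; in this model an invalid path only surfaces later, when the symbolizer is started.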
A failed address symbolization
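After building loop_connect, a sample application in the OpenRTCClient project, we set the ASAN_SYMBOLIZER_PATH environment variable before running it. When a memory error occurred during the run, the addresses still were not symbolized. ASAN printed the following: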
=================================================================
==51148==ERROR: AddressSanitizer: heap-use-after-free on address 0x61200014eb40 at pc 0x5639128a0a85 bp 0x7ffcfdbb6b30 sp 0x7ffcfdbb6b28
READ of size 8 at 0x61200014eb40 thread T0
==51148==WARNING: invalid path to external symbolizer!
==51148==WARNING: Failed to use and restart external symbolizer!
#0 0x5639128a0a84 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x32fda84) (BuildId: 542ad276a9f6ad54)
#1 0x563915cdc29d (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x673929d) (BuildId: 542ad276a9f6ad54)
#2 0x563910cd2bc1 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172fbc1) (BuildId: 542ad276a9f6ad54)
#3 0x563910cd2c08 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172fc08) (BuildId: 542ad276a9f6ad54)
#4 0x563910cd52f6 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17322f6) (BuildId: 542ad276a9f6ad54)
#5 0x563910cd3b40 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1730b40) (BuildId: 542ad276a9f6ad54)
#6 0x563910ccf40d (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172c40d) (BuildId: 542ad276a9f6ad54)
#7 0x563910ccbad9 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1728ad9) (BuildId: 542ad276a9f6ad54)
#8 0x7efd969cc0b2 (/lib/x86_64-linux-gnu/libc.so.6+0x240b2) (BuildId: 9fdb74e7b217d06c93172a8243f8547f947ee6d1)
0x61200014eb40 is located 0 bytes inside of 320-byte region [0x61200014eb40,0x61200014ec80)
freed by thread T0 here:
#0 0x563910ca3887 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1700887) (BuildId: 542ad276a9f6ad54)
#1 0x5639122c1791 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x2d1e791) (BuildId: 542ad276a9f6ad54)
#2 0x563910cbbc76 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1718c76) (BuildId: 542ad276a9f6ad54)
#3 0x563910cbbb1f (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1718b1f) (BuildId: 542ad276a9f6ad54)
#4 0x563910cbdbfa (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x171abfa) (BuildId: 542ad276a9f6ad54)
#5 0x563910cb74c0 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17144c0) (BuildId: 542ad276a9f6ad54)
#6 0x563910cb1384 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x170e384) (BuildId: 542ad276a9f6ad54)
#7 0x563910ccd4c4 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a4c4) (BuildId: 542ad276a9f6ad54)
#8 0x563910ccd42c (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a42c) (BuildId: 542ad276a9f6ad54)
#9 0x563910ccd105 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a105) (BuildId: 542ad276a9f6ad54)
#10 0x563910cbc8ee (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17198ee) (BuildId: 542ad276a9f6ad54)
#11 0x563910cbc6e5 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x17196e5) (BuildId: 542ad276a9f6ad54)
#12 0x563910ccd858 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x172a858) (BuildId: 542ad276a9f6ad54)
#13 0x563910ccbc84 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1728c84) (BuildId: 542ad276a9f6ad54)
#14 0x563910ccad26 (~/OpenRTCClient/build/linux/x64/debug/loop_connect+0x1727d26) (BuildId: 542ad276a9f6ad54)
ASAN reports that the path to the llvm symbolizer it was given is invalid, so the addresses could not be symbolized.
How ASAN is implemented
AddressSanitizer is part of compiler-rt, a subproject of LLVM. After downloading the llvm-project source from GitHub, the compiler-rt code is found under llvm-project/compiler-rt. Generally, building compiler-rt requires an LLVM/Clang build. compiler-rt can be built together with llvm and clang, or it can be built separately.
To build compiler-rt together with llvm and clang, add compiler-rt to the -DLLVM_ENABLE_RUNTIMES= option passed to cmake.
To build it separately, first build LLVM on its own to obtain the llvm-config binary, then run the following commands:
$ cd llvm-project
$ git checkout -t origin/release/14.x
$ mkdir build-compiler-rt
$ cd build-compiler-rt
$ cmake ../compiler-rt -DLLVM_CONFIG_PATH=/path/to/llvm-config
$ make
(The llvm in the WebRTC code base that OpenRTCClient is based on has already been updated to llvm-14, so the llvm-14 release branch is checked out here as well.)
The resulting libraries are mainly placed under llvm-project/build-compiler-rt/lib/linux/, for example:
llvm-project/build-compiler-rt$ ls lib/linux/
clang_rt.crtbegin-x86_64.o libclang_rt.hwasan_aliases-x86_64.so libclang_rt.scudo-x86_64.a
clang_rt.crtend-x86_64.o libclang_rt.hwasan_cxx-x86_64.a libclang_rt.scudo-x86_64.so
libclang_rt.asan_cxx-x86_64.a libclang_rt.hwasan_cxx-x86_64.a.syms libclang_rt.tsan_cxx-x86_64.a
libclang_rt.asan_cxx-x86_64.a.syms libclang_rt.hwasan-x86_64.a libclang_rt.tsan_cxx-x86_64.a.syms
libclang_rt.asan-preinit-x86_64.a libclang_rt.hwasan-x86_64.a.syms libclang_rt.tsan-x86_64.a
libclang_rt.asan_static-x86_64.a libclang_rt.hwasan-x86_64.so libclang_rt.tsan-x86_64.a.syms
libclang_rt.asan-x86_64.a libclang_rt.lsan-x86_64.a libclang_rt.tsan-x86_64.so
libclang_rt.asan-x86_64.a.syms libclang_rt.msan_cxx-x86_64.a libclang_rt.ubsan_minimal-x86_64.a
libclang_rt.asan-x86_64.so libclang_rt.msan_cxx-x86_64.a.syms libclang_rt.ubsan_minimal-x86_64.a.syms
libclang_rt.builtins-x86_64.a libclang_rt.msan-x86_64.a libclang_rt.ubsan_minimal-x86_64.so
libclang_rt.cfi_diag-x86_64.a libclang_rt.msan-x86_64.a.syms libclang_rt.ubsan_standalone_cxx-x86_64.a
libclang_rt.cfi-x86_64.a libclang_rt.orc-x86_64.a libclang_rt.ubsan_standalone_cxx-x86_64.a.syms
libclang_rt.dd-x86_64.a libclang_rt.profile-x86_64.a libclang_rt.ubsan_standalone-x86_64.a
libclang_rt.dfsan-x86_64.a libclang_rt.safestack-x86_64.a libclang_rt.ubsan_standalone-x86_64.a.syms
libclang_rt.dfsan-x86_64.a.syms libclang_rt.scudo_cxx_minimal-x86_64.a libclang_rt.ubsan_standalone-x86_64.so
libclang_rt.dyndd-x86_64.so libclang_rt.scudo_cxx-x86_64.a libclang_rt.xray-basic-x86_64.a
libclang_rt.gwp_asan-x86_64.a libclang_rt.scudo_minimal-x86_64.a libclang_rt.xray-fdr-x86_64.a
libclang_rt.hwasan_aliases_cxx-x86_64.a libclang_rt.scudo_minimal-x86_64.so libclang_rt.xray-profiling-x86_64.a
libclang_rt.hwasan_aliases_cxx-x86_64.a.syms libclang_rt.scudo_standalone_cxx-x86_64.a libclang_rt.xray-x86_64.a
libclang_rt.hwasan_aliases-x86_64.a libclang_rt.scudo_standalone-x86_64.a
libclang_rt.hwasan_aliases-x86_64.a.syms libclang_rt.scudo_standalone-x86_64.so
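At the compiler and linker level, enabling AddressSanitizer means passing the special flag -fsanitize=address to both the compile and the link step. For example, the command actually executed to link loop_connect, the OpenRTCClient sample application, is: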
python3 "../../../../webrtc/build/toolchain/gcc_link_wrapper.py" --output="./loop_connect" -- ../../../../build_system/llvm-build/linux/linux/Release+Asserts/bin/clang++ -fuse-ld=lld -Wl,--fatal-warnings -Wl,--build-id -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--color-diagnostics -Wl,--no-call-graph-profile-sort -m64 -no-canonical-prefixes -Wl,--gdb-index -rdynamic --sysroot=../../../../build_system/sysroot/linux/debian_sid_amd64-sysroot -fsanitize=address -pie -Wl,--disable-new-dtags -Wl,-u_sanitizer_options_link_helper -fsanitize=address -o "./loop_connect" -Wl,--start-group @"./loop_connect.rsp" -Wl,--end-group -lX11 -lXcomposite -lXext -lXrender -latomic -ldl -lpthread -lrt -lgmodule-2.0 -lgthread-2.0 -lgtk-3 -lgdk-3 -lpangocairo-1.0 -lpango-1.0 -lharfbuzz -latk-1.0 -lcairo-gobject -lcairo -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -lm -lz
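When -fsanitize=address is present on the link command line, clang++ selects, based on the target architecture, one of the libclang_rt.asan* libraries of the kind built from compiler-rt above and links it in. For loop_connect, the link command also passes --sysroot, so the system libraries and headers needed for compiling and linking are looked up under the path given by --sysroot. The ASAN runtime itself comes from the toolchain directory: loop_connect is linked against the libclang_rt.asan* library for the target architecture found in OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/linux or OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu.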
To be able to debug AddressSanitizer, the link step has to pick up the compiler-rt libraries we built ourselves. To do that, rename OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/linux and OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu to something else, then create a symbolic link named linux under OpenRTCClient/build_system/llvm-build/linux/linux/Release+Asserts/lib/clang/14.0.0/lib/ that points to our compiler-rt build output directory, llvm-project/build-compiler-rt/lib/linux. After that, whenever we modify the compiler-rt code, rebuild compiler-rt, and relink loop_connect, our modified compiler-rt code is linked in.
Analyzing why AddressSanitizer cannot find the symbolizer
Following the hint in AddressSanitizer's output, searching the compiler-rt code for the string constant "WARNING: invalid path to external symbolizer!" leads to llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp. The relevant code is:
bool SymbolizerProcess::StartSymbolizerSubprocess() {
  if (!FileExists(path_)) {
    if (!reported_invalid_path_) {
      Report("WARNING: invalid path to external symbolizer!\n");
      reported_invalid_path_ = true;
    }
    return false;
  }
  const char *argv[kArgVMax];
  GetArgV(path_, argv);
  pid_t pid;
We can modify this code to print the symbolizer path, path_, that AddressSanitizer sees here. It turns out to be /media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer. This value appears to have nothing to do with the path we configured through the ASAN_SYMBOLIZER_PATH environment variable.
The value of path_ is passed in through the constructor of the SymbolizerProcess class:
SymbolizerProcess::SymbolizerProcess(const char *path, bool use_posix_spawn)
    : path_(path),
      input_fd_(kInvalidFd),
      output_fd_(kInvalidFd),
      times_restarted_(0),
      failed_to_start_(false),
      reported_invalid_path_(false),
      use_posix_spawn_(use_posix_spawn) {
  CHECK(path_);
  CHECK_NE(path_[0], '\0');
}
Running the executable under GDB with a breakpoint on the SymbolizerProcess constructor gives the following call stack:
#0 __sanitizer::SymbolizerProcess::SymbolizerProcess(char const*, bool)
(use_posix_spawn=false, path=0x7ffff3403000 "/media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer", this=0x7ffff7fab000) at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:456
#1 __sanitizer::LLVMSymbolizerProcess::LLVMSymbolizerProcess(char const*)
(path=0x7ffff3403000 "/media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer", this=0x7ffff7fab000) at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:240
#2 __sanitizer::LLVMSymbolizer::LLVMSymbolizer(char const*, __sanitizer::LowLevelAllocator*)
(this=0x7ffff7fb4000, path=0x7ffff3403000 "/media/data/multimedia/OpenRTCClient/build/linux/x64/debug//../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer", allocator=<optimized out>) at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:292
#3 0x0000555556c45532 in __sanitizer::ChooseExternalSymbolizer (allocator=<optimized out>)
at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common.h:1075
#4 __sanitizer::ChooseSymbolizerTools (allocator=<optimized out>, list=<synthetic pointer>)
at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp:487
#5 __sanitizer::Symbolizer::PlatformInit() ()
at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp:500
#6 0x0000555556c42455 in __sanitizer::Symbolizer::GetOrInit() ()
at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp:24
#7 0x0000555556c457ad in __sanitizer::Symbolizer::LateInitialize() ()
at ~llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp:505
#8 0x0000555556c199fd in __asan::AsanInitInternal() () at ~llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:495
#9 0x00007ffff7fe0ce6 in () at /lib64/ld-linux-x86-64.so.2
#10 0x00007ffff7fd013a in () at /lib64/ld-linux-x86-64.so.2
#11 0x0000000000000001 in ()
The __sanitizer::ChooseExternalSymbolizer() function, defined in llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp, shows where the path_ of the SymbolizerProcess object comes from:
static SymbolizerTool *ChooseExternalSymbolizer(LowLevelAllocator *allocator) {
  const char *path = common_flags()->external_symbolizer_path;
  if (path && internal_strchr(path, '%')) {
    char *new_path = (char *)InternalAlloc(kMaxPathLength);
    SubstituteForFlagValue(path, new_path, kMaxPathLength);
    path = new_path;
  }
  const char *binary_name = path ? StripModuleName(path) : "";
  static const char kLLVMSymbolizerPrefix[] = "llvm-symbolizer";
  if (path && path[0] == '\0') {
    VReport(2, "External symbolizer is explicitly disabled.\n");
    return nullptr;
  } else if (!internal_strncmp(binary_name, kLLVMSymbolizerPrefix,
                               internal_strlen(kLLVMSymbolizerPrefix))) {
    VReport(2, "Using llvm-symbolizer at user-specified path: %s\n", path);
    return new(*allocator) LLVMSymbolizer(path, allocator);
  } else if (!internal_strcmp(binary_name, "atos")) {
#if SANITIZER_MAC
    VReport(2, "Using atos at user-specified path: %s\n", path);
    return new(*allocator) AtosSymbolizer(path, allocator);
#else  // SANITIZER_MAC
    Report("ERROR: Using `atos` is only supported on Darwin.\n");
    Die();
#endif  // SANITIZER_MAC
  } else if (!internal_strcmp(binary_name, "addr2line")) {
    VReport(2, "Using addr2line at user-specified path: %s\n", path);
    return new(*allocator) Addr2LinePool(path, allocator);
  } else if (path) {
    Report("ERROR: External symbolizer path is set to '%s' which isn't "
           "a known symbolizer. Please set the path to the llvm-symbolizer "
           "binary or other known tool.\n", path);
    Die();
  }
  // Otherwise symbolizer program is unknown, let's search $PATH
  CHECK(path == nullptr);
#if SANITIZER_MAC
  if (const char *found_path = FindPathToBinary("atos")) {
    VReport(2, "Using atos found at: %s\n", found_path);
    return new(*allocator) AtosSymbolizer(found_path, allocator);
  }
#endif  // SANITIZER_MAC
  if (const char *found_path = FindPathToBinary("llvm-symbolizer")) {
    VReport(2, "Using llvm-symbolizer found at: %s\n", found_path);
    return new(*allocator) LLVMSymbolizer(found_path, allocator);
  }
  if (common_flags()->allow_addr2line) {
    if (const char *found_path = FindPathToBinary("addr2line")) {
      VReport(2, "Using addr2line found at: %s\n", found_path);
      return new(*allocator) Addr2LinePool(found_path, allocator);
    }
  }
  return nullptr;
}
In __sanitizer::ChooseExternalSymbolizer(), AddressSanitizer tries to determine the path of the symbolizer program from common_flags()->external_symbolizer_path and related values. Here the actual value of common_flags()->external_symbolizer_path is %d/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer, and the path_ of the SymbolizerProcess object seen above is computed from it.
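A rough model of that computation, assuming only what the observed values show, namely that %d is replaced by the directory of the running executable including its trailing slash. ExpandDirPlaceholder is a hypothetical name; the real SubstituteForFlagValue() writes into a fixed-size buffer and may handle other placeholders that are not modeled here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the %d handling in SubstituteForFlagValue():
// every "%d" in the flag value is replaced by the executable's directory.
// Any other character sequence is copied through unchanged in this sketch.
std::string ExpandDirPlaceholder(const std::string &value,
                                 const std::string &exe_dir) {
  std::string out;
  for (std::size_t i = 0; i < value.size(); ++i) {
    if (value[i] == '%' && i + 1 < value.size() && value[i + 1] == 'd') {
      out += exe_dir;
      ++i;  // skip the 'd' that was just consumed
    } else {
      out += value[i];
    }
  }
  return out;
}
```

Called with the flag value above and /media/data/multimedia/OpenRTCClient/build/linux/x64/debug/ as exe_dir, the sketch yields the path with the doubled slash that GDB shows for path_.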
In llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_flags.h, common_flags() is defined as:
// Functions to get/set global CommonFlags shared by all sanitizer runtimes:
extern CommonFlags common_flags_dont_use;
inline const CommonFlags *common_flags() {
  return &common_flags_dont_use;
}

inline void SetCommonFlagsDefaults() {
  common_flags_dont_use.SetDefaults();
}

// This function can only be used to setup tool-specific overrides for
// CommonFlags defaults. Generally, it should only be used right after
// SetCommonFlagsDefaults(), but before ParseCommonFlagsFromString(), and
// only during the flags initialization (i.e. before they are used for
// the first time).
inline void OverrideCommonFlags(const CommonFlags &cf) {
  common_flags_dont_use.CopyFrom(cf);
}
So common_flags() returns a global object. Its value is mainly updated by the InitializeFlags() function in llvm-project/compiler-rt/lib/asan/asan_flags.cpp, which is defined as follows:
void InitializeFlags() {
  // Set the default values and prepare for parsing ASan and common flags.
  SetCommonFlagsDefaults();
  {
    CommonFlags cf;
    cf.CopyFrom(*common_flags());
    cf.detect_leaks = cf.detect_leaks && CAN_SANITIZE_LEAKS;
    cf.external_symbolizer_path = GetEnv("ASAN_SYMBOLIZER_PATH");
    cf.malloc_context_size = kDefaultMallocContextSize;
    cf.intercept_tls_get_addr = true;
    cf.exitcode = 1;
    OverrideCommonFlags(cf);
  }
  Flags *f = flags();
  f->SetDefaults();

  FlagParser asan_parser;
  RegisterAsanFlags(&asan_parser, f);
  RegisterCommonFlags(&asan_parser);

  // Set the default values and prepare for parsing LSan and UBSan flags
  // (which can also overwrite common flags).
#if CAN_SANITIZE_LEAKS
  __lsan::Flags *lf = __lsan::flags();
  lf->SetDefaults();

  FlagParser lsan_parser;
  __lsan::RegisterLsanFlags(&lsan_parser, lf);
  RegisterCommonFlags(&lsan_parser);
#endif

#if CAN_SANITIZE_UB
  __ubsan::Flags *uf = __ubsan::flags();
  uf->SetDefaults();

  FlagParser ubsan_parser;
  __ubsan::RegisterUbsanFlags(&ubsan_parser, uf);
  RegisterCommonFlags(&ubsan_parser);
#endif

  if (SANITIZER_MAC) {
    // Support macOS MallocScribble and MallocPreScribble:
    // <https://developer.apple.com/library/content/documentation/Performance/
    // Conceptual/ManagingMemory/Articles/MallocDebug.html>
    if (GetEnv("MallocScribble")) {
      f->max_free_fill_size = 0x1000;
    }
    if (GetEnv("MallocPreScribble")) {
      f->malloc_fill_byte = 0xaa;
    }
  }

  // Override from ASan compile definition.
  const char *asan_compile_def = MaybeUseAsanDefaultOptionsCompileDefinition();
  asan_parser.ParseString(asan_compile_def);

  // Override from user-specified string.
  const char *asan_default_options = __asan_default_options();
  asan_parser.ParseString(asan_default_options);
#if CAN_SANITIZE_UB
  const char *ubsan_default_options = __ubsan_default_options();
  ubsan_parser.ParseString(ubsan_default_options);
#endif
#if CAN_SANITIZE_LEAKS
  const char *lsan_default_options = __lsan_default_options();
  lsan_parser.ParseString(lsan_default_options);
#endif

  // Override from command line.
  asan_parser.ParseStringFromEnv("ASAN_OPTIONS");
#if CAN_SANITIZE_LEAKS
  lsan_parser.ParseStringFromEnv("LSAN_OPTIONS");
#endif
#if CAN_SANITIZE_UB
  ubsan_parser.ParseStringFromEnv("UBSAN_OPTIONS");
#endif

  InitializeCommonFlags();

  // TODO(eugenis): dump all flags at verbosity>=2?
  if (Verbosity()) ReportUnrecognizedFlags();

  if (common_flags()->help) {
    // TODO(samsonov): print all of the flags (ASan, LSan, common).
    asan_parser.PrintFlagDescriptions();
  }

  // Flag validation:
  if (!CAN_SANITIZE_LEAKS && common_flags()->detect_leaks) {
    Report("%s: detect_leaks is not supported on this platform.\n",
           SanitizerToolName);
    Die();
  }
  // Ensure that redzone is at least ASAN_SHADOW_GRANULARITY.
  if (f->redzone < (int)ASAN_SHADOW_GRANULARITY)
    f->redzone = ASAN_SHADOW_GRANULARITY;
  // Make "strict_init_order" imply "check_initialization_order".
  // TODO(samsonov): Use a single runtime flag for an init-order checker.
  if (f->strict_init_order) {
    f->check_initialization_order = true;
  }
  CHECK_LE((uptr)common_flags()->malloc_context_size, kStackTraceMax);
  CHECK_LE(f->min_uar_stack_size_log, f->max_uar_stack_size_log);
  CHECK_GE(f->redzone, 16);
  CHECK_GE(f->max_redzone, f->redzone);
  CHECK_LE(f->max_redzone, 2048);
  CHECK(IsPowerOfTwo(f->redzone));
  CHECK(IsPowerOfTwo(f->max_redzone));

  // quarantine_size is deprecated but we still honor it.
  // quarantine_size can not be used together with quarantine_size_mb.
  if (f->quarantine_size >= 0 && f->quarantine_size_mb >= 0) {
    Report("%s: please use either 'quarantine_size' (deprecated) or "
           "quarantine_size_mb, but not both\n", SanitizerToolName);
    Die();
  }
  if (f->quarantine_size >= 0)
    f->quarantine_size_mb = f->quarantine_size >> 20;
  if (f->quarantine_size_mb < 0) {
    const int kDefaultQuarantineSizeMb =
        (ASAN_LOW_MEMORY) ? 1UL << 4 : 1UL << 8;
    f->quarantine_size_mb = kDefaultQuarantineSizeMb;
  }
  if (f->thread_local_quarantine_size_kb < 0) {
    const u32 kDefaultThreadLocalQuarantineSizeKb =
        // It is not advised to go lower than 64Kb, otherwise quarantine batches
        // pushed from thread local quarantine to global one will create too
        // much overhead. One quarantine batch size is 8Kb and it holds up to
        // 1021 chunk, which amounts to 1/8 memory overhead per batch when
        // thread local quarantine is set to 64Kb.
        (ASAN_LOW_MEMORY) ? 1 << 6 : FIRST_32_SECOND_64(1 << 8, 1 << 10);
    f->thread_local_quarantine_size_kb = kDefaultThreadLocalQuarantineSizeKb;
  }
  if (f->thread_local_quarantine_size_kb == 0 && f->quarantine_size_mb > 0) {
    Report("%s: thread_local_quarantine_size_kb can be set to 0 only when "
           "quarantine_size_mb is set to 0\n", SanitizerToolName);
    Die();
  }
  if (!f->replace_str && common_flags()->intercept_strlen) {
    Report("WARNING: strlen interceptor is enabled even though replace_str=0. "
           "Use intercept_strlen=0 to disable it.");
  }
  if (!f->replace_str && common_flags()->intercept_strchr) {
    Report("WARNING: strchr* interceptors are enabled even though "
           "replace_str=0. Use intercept_strchr=0 to disable them.");
  }
  if (!f->replace_str && common_flags()->intercept_strndup) {
    Report("WARNING: strndup* interceptors are enabled even though "
           "replace_str=0. Use intercept_strndup=0 to disable them.");
  }
}
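InitializeFlags() first sets the default values of CommonFlags common_flags_dont_use, then updates some of them from the environment, which is where the ASAN_SYMBOLIZER_PATH variable we configured is read. After that it takes options from MaybeUseAsanDefaultOptionsCompileDefinition(), from __asan_default_options(), and from environment variables such as ASAN_OPTIONS, in that order, each overriding the earlier settings.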
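The layering in InitializeFlags() can be modeled with a small map in which each later source overwrites keys set by an earlier one. This is a simplified sketch, not the FlagParser implementation: ParseInto and ResolveFlags are hypothetical names, and only whitespace-separated key=value tokens are handled.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

using FlagMap = std::map<std::string, std::string>;

// Parses a whitespace-separated "key=value" string into flags. A key that is
// already present is overwritten, so later calls win over earlier ones.
void ParseInto(FlagMap &flags, const std::string &options) {
  std::istringstream in(options);
  std::string token;
  while (in >> token) {
    std::size_t eq = token.find('=');
    if (eq == std::string::npos) continue;  // ignore malformed tokens
    flags[token.substr(0, eq)] = token.substr(eq + 1);
  }
}

// Hypothetical model of the order in InitializeFlags(): the
// ASAN_SYMBOLIZER_PATH environment variable first, then the compile
// definition, then __asan_default_options(), and finally ASAN_OPTIONS.
FlagMap ResolveFlags(const std::string &env_symbolizer_path,
                     const std::string &compile_definition,
                     const std::string &default_options,
                     const std::string &env_asan_options) {
  FlagMap flags;
  if (!env_symbolizer_path.empty())
    flags["external_symbolizer_path"] = env_symbolizer_path;
  ParseInto(flags, compile_definition);
  ParseInto(flags, default_options);
  ParseInto(flags, env_asan_options);
  return flags;
}
```

In this model, any later source that sets external_symbolizer_path replaces the value taken from ASAN_SYMBOLIZER_PATH, and a value given in ASAN_OPTIONS wins over all of them because it is parsed last.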
Printing the value read from the ASAN_SYMBOLIZER_PATH environment variable at this point shows that it is exactly what we configured, /usr/bin/llvm-symbolizer-11. MaybeUseAsanDefaultOptionsCompileDefinition() in llvm-project/compiler-rt/lib/asan/asan_flags.cpp is defined as:
static const char *MaybeUseAsanDefaultOptionsCompileDefinition() {
#ifdef ASAN_DEFAULT_OPTIONS
  return SANITIZER_STRINGIFY(ASAN_DEFAULT_OPTIONS);
#else
  return "";
#endif
}
__asan_default_options() in llvm-project/compiler-rt/lib/asan/asan_flags.cpp is defined as:
SANITIZER_INTERFACE_WEAK_DEF(const char*, __asan_default_options, void) {
  return "";
}
At first glance, the options returned by these two functions would not change common_flags()->external_symbolizer_path. In practice, however, once the return value of __asan_default_options() has been parsed, common_flags()->external_symbolizer_path becomes %d/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer. Moreover, the string actually returned by __asan_default_options() is not the empty string from the definition above but the following:
check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ fast_unwind_on_fatal=1 detect_stack_use_after_return=1 symbolize=1 detect_leaks=0 allow_user_segv_handler=1 external_symbolizer_path=%d/../../third_party/llvm-build/Release+Asserts/bin/llvm-symbolizer
We run the executable under GDB and set a breakpoint on __asan_default_options(). Surprisingly, the breakpoint is placed not in llvm-project/compiler-rt/lib/asan/asan_flags.cpp but in the WebRTC code, in webrtc/build/sanitizers/sanitizer_options.cc:
(gdb) break __asan_default_options
warning: Could not find DWO CU obj/build/config/sanitizers/options_sources/sanitizer_options.dwo(0x4d51bcd290d078c0) referenced by CU at offset 0x43f087 [in module /media/data/multimedia/OpenRTCClient/build/linux/x64/debug/loop_connect]
Breakpoint 1 at 0x81c6e34: file ../../../../webrtc/build/sanitizers/sanitizer_options.cc, line 75.
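Looking again at the declaration and definition of __asan_default_options(), we find that llvm defines it as a weak symbol. WebRTC's webrtc/build/sanitizers/sanitizer_options.cc contains its own definition of __asan_default_options(), shown below, which therefore takes precedence at link time: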
#if defined(ADDRESS_SANITIZER)
// Default options for AddressSanitizer in various configurations:
//   check_printf=1 - check the memory accesses to printf (and other formatted
//     output routines) arguments.
//   use_sigaltstack=1 - handle signals on an alternate signal stack. Useful
//     for stack overflow detection.
//   strip_path_prefix=/../../ - prefixes up to and including this
//     substring will be stripped from source file paths in symbolized reports
//   fast_unwind_on_fatal=1 - use the fast (frame-pointer-based) stack unwinder
//     to print error reports. V8 doesn't generate debug info for the JIT code,
//     so the slow unwinder may not work properly.
//   detect_stack_use_after_return=1 - use fake stack to delay the reuse of
//     stack allocations and detect stack-use-after-return errors.
//   symbolize=1 - enable in-process symbolization.
//   external_symbolizer_path=... - provides the path to llvm-symbolizer
//     relative to the main executable
#if defined(OS_LINUX) || defined(OS_CHROMEOS)
const char kAsanDefaultOptions[] =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 "
    "symbolize=1 detect_leaks=0 allow_user_segv_handler=1 "
    "external_symbolizer_path=%d/../../third_party/llvm-build/Release+Asserts/"
    "bin/llvm-symbolizer";
#elif defined(OS_APPLE)
const char* kAsanDefaultOptions =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 ";
#elif defined(OS_WIN)
const char* kAsanDefaultOptions =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=\\..\\..\\ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 "
    "symbolize=1 external_symbolizer_path=%d/../../third_party/"
    "llvm-build/Release+Asserts/bin/llvm-symbolizer.exe";
#endif  // defined(OS_LINUX) || defined(OS_CHROMEOS)

#if defined(OS_LINUX) || defined(OS_CHROMEOS) || defined(OS_APPLE) || \
    defined(OS_WIN)
// Allow NaCl to override the default asan options.
extern const char* kAsanDefaultOptionsNaCl;
__attribute__((weak)) const char* kAsanDefaultOptionsNaCl = nullptr;

SANITIZER_HOOK_ATTRIBUTE const char *__asan_default_options() {
  if (kAsanDefaultOptionsNaCl)
    return kAsanDefaultOptionsNaCl;
  return kAsanDefaultOptions;
}

extern char kASanDefaultSuppressions[];

SANITIZER_HOOK_ATTRIBUTE const char *__asan_default_suppressions() {
  return kASanDefaultSuppressions;
}
#endif  // defined(OS_LINUX) || defined(OS_CHROMEOS) || defined(OS_APPLE) ||
        // defined(OS_WIN)
#endif  // ADDRESS_SANITIZER
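Stripped of the platform conditionals, the mechanism at work is simply a program supplying its own definition of the hook. A minimal sketch follows; the option string is an arbitrary example, and the attributes that SANITIZER_HOOK_ATTRIBUTE adds in the WebRTC code are omitted here.

```cpp
#include <cassert>
#include <cstring>

// An application-provided definition of the ASAN hook. When the program is
// linked with -fsanitize=address, this ordinary (strong) definition is
// chosen over the weak default in compiler-rt, and the runtime parses the
// returned string during startup. The option below is only an example.
extern "C" const char *__asan_default_options() {
  return "detect_leaks=0";
}
```

Because the override is resolved at link time, nothing in the compiler-rt source shows that the returned string has changed, which is why reading asan_flags.cpp alone is misleading here.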
This confirms that the symbolizer we configured through the ASAN_SYMBOLIZER_PATH environment variable is overridden by the default options in the WebRTC code.
The relevant change in WebRTC was introduced by this commit: https://chromium.googlesource.com/chromium/src/build/+/919d061c2f455cc07b687a48322785b3b61f1455%5E%21/sanitizers/sanitizer_options.cc
The fix follows directly: remove the part of webrtc/build/sanitizers/sanitizer_options.cc that configures the AddressSanitizer symbolizer.
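One way the change could look for the Linux branch, shown as a sketch rather than the exact patch applied to OpenRTCClient: the external_symbolizer_path entry is dropped and every other option is copied unchanged from the original string.

```cpp
#include <cassert>
#include <cstring>

// Sketch of the Linux default options without external_symbolizer_path.
// With this entry gone, the value read from ASAN_SYMBOLIZER_PATH (or the
// PATH search) is no longer replaced by a build-tree-relative path.
const char kAsanDefaultOptions[] =
    "check_printf=1 use_sigaltstack=1 strip_path_prefix=/../../ "
    "fast_unwind_on_fatal=1 detect_stack_use_after_return=1 "
    "symbolize=1 detect_leaks=0 allow_user_segv_handler=1";
```

Judging from the parsing order in InitializeFlags(), where ASAN_OPTIONS is parsed after __asan_default_options(), setting external_symbolizer_path through the ASAN_OPTIONS environment variable should also take effect without modifying the source.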
References