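Overview
libFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine that is part of the LLVM project. libFuzzer is linked with the library under test and feeds test cases to it through a fuzzing entry point (the target function). The fuzzer tracks which code regions have already been reached and mutates the inputs in the corpus to maximize code coverage. The coverage information is provided by LLVM's SanitizerCoverage instrumentation.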
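As a rough illustration of the "fuzzing entry point" described above, here is a minimal sketch of a fuzz target. `LLVMFuzzerTestOneInput` is the entry point libFuzzer looks for; `CountZeroBytes` is a hypothetical stand-in for the library under test and is not part of libFuzzer or the workshop.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for "the library under test".
std::size_t CountZeroBytes(const std::uint8_t* data, std::size_t size) {
  std::size_t zeros = 0;
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] == 0) ++zeros;
  }
  return zeros;
}

// libFuzzer calls this function once for every input it generates.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
  CountZeroBytes(data, size);  // hand the generated bytes to the code under test
  return 0;                    // 0 means the input was processed normally
}
```

The target function has no `main`; when linked against libFuzzer.a, the engine supplies it and drives the loop.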
Installation
git clone https://github.com/Dor1s/libfuzzer-workshop.git
sudo ln -s /usr/include/asm-generic /usr/include/asm
apt-get install gcc-multilib
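Then enter libfuzzer-workshop/ and run checkout_build_install_llvm.sh to install LLVM. Next, enter libfuzzer-workshop/libFuzzer/Fuzzer/ and run build.sh to build libFuzzer. If the build succeeds, it produces libfuzzer-workshop/libFuzzer/Fuzzer/libFuzzer.a.
If the LLVM build fails with an internal compiler error, the machine may have run out of memory; increasing the available RAM or adding a swap partition can resolve this.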
Lessons 01-04
- Modern_Fuzzing_of_C_C++_projects_slides_1-23
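A brief introduction to unit testing, fuzzing, and modern fuzzing.
Using radamsa to randomly mutate files from a seed corpus, which gives a simple fuzzer for pdfium.
An introduction to libFuzzer, code coverage, and common memory tools (AddressSanitizer, MemorySanitizer, UndefinedBehaviorSanitizer).
A brief look at fuzzing several vulnerable functions.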
VulnerableFunction1
bool VulnerableFunction1(const uint8_t* data, size_t size) {
  bool result = false;
  if (size >= 3) {
    result = data[0] == 'F' &&
             data[1] == 'U' &&
             data[2] == 'Z' &&
             data[3] == 'Z';
  }
  return result;
}
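data is the input buffer and size is its length. The check only guarantees size >= 3, so when size is exactly 3 and the first three bytes are "FUZ", reading data[3] is an out-of-bounds access.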
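For contrast, a minimal sketch of what a corrected version might look like: the length check has to cover every index that is read. The name `FixedFunction1` is hypothetical and does not appear in the workshop code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical corrected variant: require four bytes before reading data[3].
bool FixedFunction1(const std::uint8_t* data, std::size_t size) {
  if (size < 4) return false;
  return data[0] == 'F' && data[1] == 'U' && data[2] == 'Z' && data[3] == 'Z';
}
```

With this check a three-byte input such as "FUZ" is rejected before any byte is read.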
Compile the fuzzer in the following way:
clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
first_fuzzer.cc ../../libFuzzer/libFuzzer.a \
-o first_fuzzer
Adjust the path to libFuzzer.a to match your setup.
Create an empty directory for corpus and run the fuzzer:
mkdir corpus1
./first_fuzzer corpus1
VulnerableFunction2
template<class T>
typename T::value_type DummyHash(const T& buffer) {
  typename T::value_type hash = 0;
  for (auto value : buffer)
    hash ^= value;
  return hash;
}
constexpr auto kMagicHeader = "ZN_2016";
constexpr std::size_t kMaxPacketLen = 1024;
constexpr std::size_t kMaxBodyLength = 1024 - sizeof(kMagicHeader);
bool VulnerableFunction2(const uint8_t* data, size_t size, bool verify_hash) {
  if (size < sizeof(kMagicHeader))
    return false;
  std::string header(reinterpret_cast<const char*>(data), sizeof(kMagicHeader));
  std::array<uint8_t, kMaxBodyLength> body;  // array of 1024 - sizeof(kMagicHeader) bytes
  if (strcmp(kMagicHeader, header.c_str()))  // check that the input starts with "ZN_2016"
    return false;
  auto target_hash = data[--size];
  if (size > kMaxPacketLen)
    return false;
  if (!verify_hash)
    return true;
  std::copy(data, data + size, body.data());  // size may exceed body's length, so this can overflow
  auto real_hash = DummyHash(body);
  return real_hash == target_hash;
}
With this vulnerable function in mind, the fuzz target looks like this:
// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include <stdint.h>
#include <stddef.h>
#include "vulnerable_functions.h"
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  bool flag[2] = {false, true};
  for (auto f : flag)  // try both values; passing only false never reaches the crash
    VulnerableFunction2(data, size, f);
  return 0;
}
With a suitable setting (here, the maximum input length), the crash is found much faster:
Address 0x7ffedfe60ca8 is located in stack of thread T0 at offset 1128 in frame
#0 0x4f801f in VulnerableFunction2(unsigned char const*, unsigned long, bool) /home/nevv/libfuzzer-workshop/lessons/04/./vulnerable_functions.h:42
This frame has 3 object(s):
[32, 64) 'header' (line 46)
[96, 97) 'ref.tmp' (line 46)
[112, 1128) 'body' (line 48) <== Memory access at offset 1128 overflows this variable
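The report shows that body is the object that overflows. If we do not set a maximum input length up front, the fuzzer may run for a long time without crashing: it keeps exercising the same few paths and never finds the one that triggers the crash.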
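The frame layout in the report places body at [112, 1128), which is 1016 bytes. That is consistent with `sizeof(kMagicHeader)` being the size of a pointer (8 on a typical 64-bit machine), because `constexpr auto` deduces `const char*` from the string literal. The sketch below checks that reading; `kMagicArray` and `CopyFits` are hypothetical names added only for comparison.

```cpp
#include <cstddef>
#include <type_traits>

constexpr auto kMagicHeader = "ZN_2016";   // deduced as a pointer, not an array
constexpr char kMagicArray[] = "ZN_2016";  // hypothetical array form, for comparison
constexpr std::size_t kMaxPacketLen = 1024;
constexpr std::size_t kMaxBodyLength = 1024 - sizeof(kMagicHeader);

static_assert(std::is_same<decltype(kMagicHeader), const char* const>::value,
              "auto deduces a pointer type here");
static_assert(sizeof(kMagicHeader) == sizeof(const char*),
              "sizeof yields the pointer size");
static_assert(sizeof(kMagicArray) == 8, "7 characters plus the terminating NUL");

// Hypothetical helper: true when `size` bytes fit into the body array.
constexpr bool CopyFits(std::size_t size) { return size <= kMaxBodyLength; }
```

Any size in the range (kMaxBodyLength, kMaxPacketLen] passes the length check in VulnerableFunction2 but does not fit into body, which is the window the fuzzer has to hit.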
libFuzzer run-time options
http://llvm.org/docs/LibFuzzer.html#running
The copy function
// first [IN]: address of the first element to copy
// last [IN]: address one past the last element to copy
// x [OUT]: address of the start of the destination
template<class InIt, class OutIt>
OutIt copy(InIt first, InIt last, OutIt x);
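std::copy does not know how large the destination is, so the caller has to enforce the bound. A minimal sketch of a clamped copy follows; the helper name `BoundedCopy` is hypothetical and is only one way to guard the call.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: copies at most dst.size() bytes from src and
// returns the number of bytes actually copied.
template <std::size_t N>
std::size_t BoundedCopy(const std::uint8_t* src, std::size_t size,
                        std::array<std::uint8_t, N>& dst) {
  const std::size_t n = std::min(size, dst.size());
  std::copy(src, src + n, dst.begin());  // [src, src + n) fits in dst by construction
  return n;
}
```

Whether to truncate, as here, or to reject oversized input outright depends on the protocol; rejecting is often the safer choice for a parser.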
VulnerableFunction3
constexpr std::size_t kZn2016VerifyHashFlag = 0x0001000;
bool VulnerableFunction3(const uint8_t* data, size_t size, std::size_t flags) {
  bool verify_hash = flags & kZn2016VerifyHashFlag;
  return VulnerableFunction2(data, size, verify_hash);
}
As before, we simply supply the extra argument ourselves, here deriving the flags from a hash of the input:
// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include <stdint.h>
#include <stddef.h>
#include "vulnerable_functions.h"
#include <functional>
#include <string>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  std::string data_string(reinterpret_cast<const char*>(data), size);
  auto data_hash = std::hash<std::string>()(data_string);
  std::size_t flags = static_cast<size_t>(data_hash);
  VulnerableFunction3(data, size, flags);
  return 0;
}
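The vulnerable copy is only reached when the verify-hash bit is set in flags, so the harness needs flag values that sometimes have that bit set and sometimes not. The sketch below isolates that bit test; `WantsHashVerification` is a hypothetical helper that mirrors the expression in VulnerableFunction3.

```cpp
#include <cstddef>

constexpr std::size_t kZn2016VerifyHashFlag = 0x0001000;

// Hypothetical helper mirroring the test in VulnerableFunction3:
// true when bit 12 (0x1000) is set in flags.
constexpr bool WantsHashVerification(std::size_t flags) {
  return (flags & kZn2016VerifyHashFlag) != 0;
}
```

Because the harness derives flags from a hash of the input, this bit changes from one input to the next, so both branches get exercised over time.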
05 The OpenSSL Heartbleed vulnerability
Vulnerability overview
Look at ssl/d1_both.c; the patch for the vulnerability starts at the following statements:
int
dtls1_process_heartbeat(SSL *s)
{
    unsigned char *p = &s->s3->rrec.data[0], *pl;
    unsigned short hbtype;
    unsigned int payload;
    unsigned int padding = 16; /* Use minimum padding */
Right away we get a pointer to the data inside an SSLv3 record. The SSL3_RECORD struct is defined as follows:
typedef struct ssl3_record_st
{
    int type;                  /* type of record */
    unsigned int length;       /* How many bytes available */
    unsigned int off;          /* read/write offset into 'buf' */
    unsigned char *data;       /* pointer to the record data */
    unsigned char *input;      /* where the decode bytes are */
    unsigned char *comp;       /* only used with decompression - malloc()ed */
    unsigned long epoch;       /* epoch number, needed by DTLS1 */
    unsigned char seq_num[8];  /* sequence number, needed by DTLS1 */
} SSL3_RECORD;
Each SSLv3 record contains a type field (type), a length field (length), and a pointer to the record data (data). Back to dtls1_process_heartbeat:
    /* Read type and payload length first */
    hbtype = *p++;
    n2s(p, payload);
    pl = p;
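The first byte of the SSLv3 record indicates the heartbeat type. The n2s macro takes two bytes from the array that p points to and stores them in the variable payload; this is the length field of the heartbeat payload. Note that the program never checks this value against the actual length of the SSLv3 record. The variable pl then points to the heartbeat data supplied by the requester.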
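To make the n2s step concrete, here is a minimal sketch of the same operation written as a function: read a big-endian 16-bit value and advance the cursor by two bytes. `ReadU16BE` is a hypothetical name; OpenSSL itself implements this as a macro.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical function form of what n2s does: read two bytes as a
// big-endian 16-bit value and move the cursor past them.
unsigned int ReadU16BE(const std::uint8_t*& p) {
  const unsigned int value = (static_cast<unsigned int>(p[0]) << 8) |
                             static_cast<unsigned int>(p[1]);
  p += 2;
  return value;
}
```

A request whose two length bytes are 0xff 0xff therefore yields payload = 65535, regardless of how many bytes actually follow in the record.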
Later in the function, the following happens:
    unsigned char *buffer, *bp;
    int r;
    /* Allocate memory for the response, size is 1 byte
     * message type, plus 2 bytes payload length, plus
     * payload, plus padding
     */
    buffer = OPENSSL_malloc(1 + 2 + payload + padding);
    bp = buffer;
So the program allocates a memory region whose size is chosen by the requester, up to 65535 + 1 + 2 + 16 bytes. The variable bp is the pointer used to access this region.
    /* Enter response type, length and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);
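The s2n macro does the inverse of n2s: it takes a 16-bit value and stores it as two bytes, so it writes the same payload length that was requested into the response at bp. The program then copies payload bytes, starting at pl (which points to the heartbeat data supplied by the user), into the newly allocated bp buffer.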
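The s2n step can be sketched the same way, as the mirror image of n2s: write a 16-bit value as two big-endian bytes and advance the cursor. `WriteU16BE` is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical function form of what s2n does: store a 16-bit value as
// two big-endian bytes and move the cursor past them.
void WriteU16BE(unsigned int value, std::uint8_t*& p) {
  p[0] = static_cast<std::uint8_t>((value >> 8) & 0xff);
  p[1] = static_cast<std::uint8_t>(value & 0xff);
  p += 2;
}
```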
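In essence, when OpenSSL handles a heartbeat it never validates the user-controlled length field parsed from the packet. The subsequent memcpy can therefore read past the real payload and copy server-side memory into the response that is sent back to the user.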
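The missing validation amounts to comparing the claimed payload length with the real record length before copying. The following is a minimal sketch of such a check under the layout described above (1 type byte, 2 length bytes, payload, 16 bytes of padding); `HeartbeatFitsRecord` and `kHeartbeatPadding` are hypothetical names, not OpenSSL identifiers.

```cpp
#include <cstddef>

constexpr std::size_t kHeartbeatPadding = 16;  // same minimum padding as in the excerpt

// Hypothetical helper: true when a heartbeat claiming `payload` bytes
// (type byte + length field + payload + padding) fits in the record.
bool HeartbeatFitsRecord(std::size_t record_length, std::size_t payload) {
  return 1 + 2 + payload + kHeartbeatPadding <= record_length;
}
```

A handler that discards the message whenever this returns false never copies more bytes than the peer actually sent.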
The fuzzer
#include <openssl/ssl.h>
#include <openssl/err.h>
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#ifndef CERT_PATH
# define CERT_PATH
#endif
SSL_CTX *Init() {
  SSL_library_init();
  SSL_load_error_strings();
  ERR_load_BIO_strings();
  OpenSSL_add_all_algorithms();
  SSL_CTX *sctx;
  // Note: the asserts below have side effects, so do not build with -DNDEBUG.
  assert (sctx = SSL_CTX_new(TLSv1_method()));
  /* These two files were created with this command:
      openssl req -x509 -newkey rsa:512 -keyout server.key \
     -out server.pem -days 9999 -nodes -subj /CN=a/
  */
  assert(SSL_CTX_use_certificate_file(sctx, CERT_PATH "server.pem",
                                      SSL_FILETYPE_PEM));
  assert(SSL_CTX_use_PrivateKey_file(sctx, CERT_PATH "server.key",
                                     SSL_FILETYPE_PEM));
  return sctx;
}
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  static SSL_CTX *sctx = Init();  // initialize the context once, reuse it for every input
  SSL *server = SSL_new(sctx);
  BIO *sinbio = BIO_new(BIO_s_mem());
  BIO *soutbio = BIO_new(BIO_s_mem());
  SSL_set_bio(server, sinbio, soutbio);
  SSL_set_accept_state(server);
  BIO_write(sinbio, data, size);  // feed the fuzz input as the client's bytes
  SSL_do_handshake(server);
  SSL_free(server);
  return 0;
}
06 The c-ares vulnerability
// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include <stdint.h>
#include <stdlib.h>
#include <arpa/nameser.h>
#include <string>
#include <ares.h>
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  unsigned char *buf;
  int buflen;
  std::string s(reinterpret_cast<const char *>(data), size);
  ares_create_query(s.c_str(), ns_c_in, ns_t_a, 0x1234, 0, &buf, &buflen, 0);
  ares_free_string(buf);
  return 0;
}
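With libFuzzer, the work comes down to understanding the target's logic, passing the test data that libFuzzer generates to the target for processing, and compiling with a suitable sanitizer so that memory errors are detected at run time. When time permits, it is still worth reading the source code and the papers related to libFuzzer.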