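When you ask angr to execute a step, something has to actually perform that step. angr uses a series of engines (subclasses of the SimEngine class) to emulate the effect that a given section of code has on an input state. The execution core of angr simply tries all the available engines in sequence and takes the first one that is able to handle the step. The default list of engines, in order, is:
- The failure engine kicks in when the previous step took us to some uncontinuable state
- The syscall engine kicks in when the previous step ended in a syscall
- The hook engine kicks in when the current address is hooked
- The unicorn engine kicks in when the UNICORN state option is enabled and there is no symbolic data in the state
- The VEX engine kicks in as the final fallback.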
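The "first engine that can handle the step wins" rule can be sketched in plain Python. This is only a toy model of the selection order described above, not angr's implementation: `ToyState`, the `last_exit` labels, the `check` method, and `pick_engine` are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyState:
    """Hypothetical stand-in for a program state (not angr's SimState)."""
    addr: int
    last_exit: str = "normal"        # assumed labels: "normal", "fault", "syscall"
    unicorn_enabled: bool = False
    has_symbolic_data: bool = False


class ToyEngine:
    """Hypothetical engine: a name plus a predicate saying whether it applies."""

    def __init__(self, name, check):
        self.name = name
        self.check = check


def default_engines(hooked_addrs):
    """Build the engine list in the order given in the text."""
    return [
        ToyEngine("failure", lambda s: s.last_exit == "fault"),
        ToyEngine("syscall", lambda s: s.last_exit == "syscall"),
        ToyEngine("hook", lambda s: s.addr in hooked_addrs),
        ToyEngine("unicorn", lambda s: s.unicorn_enabled and not s.has_symbolic_data),
        ToyEngine("vex", lambda s: True),  # final fallback: always applies
    ]


def pick_engine(state, engines):
    """Try each engine in order and return the name of the first that applies."""
    for engine in engines:
        if engine.check(state):
            return engine.name
    raise RuntimeError("no engine can handle this state")


engines = default_engines(hooked_addrs={0x401000})
print(pick_engine(ToyState(addr=0x401000), engines))
print(pick_engine(ToyState(addr=0x402000, unicorn_enabled=True), engines))
print(pick_engine(ToyState(addr=0x402000, unicorn_enabled=True, has_symbolic_data=True), engines))
```

Because the list is ordered, a hooked address is handled by the hook engine even when unicorn is enabled, and the VEX entry never needs a real predicate since it sits last.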
SimSuccessors
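The code that actually tries all the engines in turn is project.factory.successors(state, **kwargs), which passes its arguments to each of the engines. This function is at the heart of state.step() and simulation_manager.step(). It returns a SimSuccessors object, which we discussed briefly before. The purpose of SimSuccessors is to perform a simple categorization of the successor states, which are stored in various list attributes. They are: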
Attribute | Guard Condition | Instruction Pointer | Description |
---|---|---|---|
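successors | True (can be symbolic, but constrained to True) | Can be symbolic (but 256 solutions or fewer; see unconstrained_successors). | A normal, satisfiable successor state to the state processed by the engine. The instruction pointer of this state may be symbolic (i.e., a computed jump based on user input), so the state might actually represent several potential continuations of execution. |
unsat_successors | False (can be symbolic, but constrained to False) | Can be symbolic. | Unsatisfiable successors. These are successors whose guard conditions can only be false (i.e., jumps that cannot be taken, or the default branch of jumps that must be taken). |
flat_successors | True (can be symbolic, but constrained to True) | Concrete value. | As noted above, states in the successors list can have symbolic instruction pointers. This is rather confusing, because elsewhere in the code (i.e., in SimEngineVEX.process, when that state is stepped forward) we assume that a single program state represents execution at a single point in the code. To alleviate this, when we encounter states in successors with symbolic instruction pointers, we compute all possible concrete solutions for them (up to an arbitrary threshold of 256) and make a copy of the state for each such solution. We call this process "flattening". These flat_successors are states, each of which has a different, concrete instruction pointer. For example, if the instruction pointer of a state in successors is X + 5, where X has the constraints X > 0x800000 and X <= 0x800010, we would flatten it into 16 different flat_successors states, with instruction pointers 0x800006, 0x800007, and so on up to 0x800015. |
unconstrained_successors | True (can be symbolic, but constrained to True) | Symbolic (with more than 256 solutions). | During the flattening procedure described above, if it turns out that there are more than 256 possible solutions for the instruction pointer, we assume that the instruction pointer has been overwritten with unconstrained data (i.e., a stack overflow with user data). This assumption is not sound in general. Such states are placed in unconstrained_successors and not in successors. |
all_successors | Anything | Can be symbolic. | This is successors + unsat_successors + unconstrained_successors. |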
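The flattening example in the table can be checked with a few lines of plain Python. This is a sketch under simplifying assumptions: it enumerates candidate instruction pointers by brute force instead of querying a constraint solver as angr does, and `FLATTEN_LIMIT` and `classify_by_ip` are hypothetical names.

```python
FLATTEN_LIMIT = 256  # the arbitrary threshold mentioned in the table


def classify_by_ip(ip_solutions):
    """Return ("flat", sorted concrete IPs) or ("unconstrained", []).

    ip_solutions is any iterable of concrete instruction-pointer values
    that the symbolic instruction pointer could take.
    """
    solutions = sorted(set(ip_solutions))
    if len(solutions) > FLATTEN_LIMIT:
        # Too many targets: treat the instruction pointer as unconstrained.
        return "unconstrained", []
    # Otherwise there would be one flattened state per concrete target.
    return "flat", solutions


# X > 0x800000 and X <= 0x800010, instruction pointer = X + 5
xs = range(0x800001, 0x800010 + 1)
kind, ips = classify_by_ip(x + 5 for x in xs)
print(kind, len(ips), hex(ips[0]), hex(ips[-1]))

# An instruction pointer with 257 possible values exceeds the threshold.
kind_big, ips_big = classify_by_ip(range(0x1000, 0x1000 + 257))
print(kind_big, len(ips_big))
```

The first case yields 16 concrete targets from 0x800006 to 0x800015, matching the table; the second is classified as unconstrained because it has one solution more than the limit.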
Breakpoints
TODO: rewrite this to fix the narrative
Like any decent execution engine, angr supports breakpoints. Here is how to set one:
>>> import angr
>>> b = angr.Project('examples/fauxware/fauxware')
# get our state
>>> s = b.factory.entry_state()
# add a breakpoint. This breakpoint will drop into ipdb right before a memory write happens.
>>> s.inspect.b('mem_write')
# on the other hand, we can have a breakpoint trigger right *after* a memory write happens.
# we can also have a callback function run instead of opening ipdb.
>>> def debug_func(state):
...     print("State %s is about to do a memory write!" % state)
...
>>> s.inspect.b('mem_write', when=angr.BP_AFTER, action=debug_func)
# or, you can have it drop you in an embedded IPython!
>>> s.inspect.b('mem_write', when=angr.BP_AFTER, action=angr.BP_IPYTHON)
There are many other places to break besides a memory write. Here is the list. For each of these events, you can break at either BP_BEFORE or BP_AFTER.
Event type | Event meaning |
---|---|
mem_read | Memory is being read. |
mem_write | Memory is being written. |
address_concretization | A symbolic memory access is being resolved. |
reg_read | A register is being read. |
reg_write | A register is being written. |
tmp_read | A temp is being read. |
tmp_write | A temp is being written. |
expr | An expression is being created (i.e., a result of an arithmetic operation or a constant in the IR). |
statement | An IR statement is being translated. |
instruction | A new (native) instruction is being translated. |
irsb | A new basic block is being translated. |
constraints | New constraints are being added to the state. |
exit | A successor is being generated from execution. |
fork | A symbolic execution state has forked into multiple states. |
symbolic_variable | A new symbolic variable is being created. |
call | A call instruction is hit. |
return | A ret instruction is hit. |
simprocedure | A simprocedure (or syscall) is executed. |
dirty | A dirty IR callback is executed. |
syscall | A syscall is executed (called in addition to the simprocedure event). |
engine_process | A SimEngine is about to process some code. |
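These events expose different attributes:
See the original documentation for the list of attributes.
These attributes can be accessed as members of state.inspect during the appropriate breakpoint callback to read the corresponding values. You can even modify these values to change how they are used afterwards!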
>>> def track_reads(state):
... print('Read', state.inspect.mem_read_expr, 'from', state.inspect.mem_read_address)
...
>>> s.inspect.b('mem_read', when=angr.BP_AFTER, action=track_reads)
Additionally, each of these attributes can be used as a keyword argument to inspect.b to make the breakpoint conditional:
# This will break before a memory write if 0x1000 is a possible value of its target expression
>>> s.inspect.b('mem_write', mem_write_address=0x1000)
# This will break before a memory write if 0x1000 is the *only* value of its target expression
>>> s.inspect.b('mem_write', mem_write_address=0x1000, mem_write_address_unique=True)
# This will break after instruction 0x8000, but only if 0x1000 is a possible value of the last expression that was read from memory
>>> s.inspect.b('instruction', when=angr.BP_AFTER, instruction=0x8000, mem_read_expr=0x1000)
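The difference between "a possible value" and "the only value" in the conditions above can be illustrated with a small model. This is a sketch, not angr's code: angr decides this with its constraint solver, whereas here the set of values an expression may take is given explicitly, and `address_matches` is a hypothetical helper.

```python
def address_matches(possible_values, wanted, unique=False):
    """Decide whether a conditional breakpoint would fire.

    possible_values: concrete values the (possibly symbolic) expression may take.
    wanted:          the value given in the breakpoint condition.
    unique:          if True, require that wanted is the only possible value.
    """
    possible = set(possible_values)
    if unique:
        return possible == {wanted}
    return wanted in possible


# A symbolic write target that may be either 0x1000 or 0x2000.
symbolic_target = {0x1000, 0x2000}
print(address_matches(symbolic_target, 0x1000))               # possible value
print(address_matches(symbolic_target, 0x1000, unique=True))  # not the only value

# A fully concrete write target.
print(address_matches({0x1000}, 0x1000, unique=True))
```

In this model the non-unique form fires for any write that could land on 0x1000, while the unique form fires only when the target cannot be anything else.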
In fact, we can even specify a function as a condition:
# this is a complex condition that could do anything! In this case, it makes sure that RAX is 0x41414141 and
# that the basic block starting at 0x8004 was executed sometime in this path's history
>>> def cond(state):
...     return state.solver.eval(state.regs.rax) == 0x41414141 and 0x8004 in state.history.bbl_addrs
>>> s.inspect.b('mem_write', condition=cond)
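That is some cool stuff!
Caution about the mem_read breakpoint
The mem_read breakpoint is triggered whenever memory is read, whether by the executing program or by the binary analysis. If you have a breakpoint on mem_read and also use state.mem to load data from memory addresses, be aware that the breakpoint will fire, because you are technically reading memory.
So if you want to load data from memory without triggering any mem_read breakpoint you have set up, use state.memory.load with the keyword arguments disable_actions=True and inspect=False.
The same is true for state.memory.find: you can use the same keyword arguments to prevent mem_read breakpoints from firing.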
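The caution above can be pictured with a toy memory object whose reads notify registered callbacks unless the caller opts out. This is only an illustration of the idea, not angr's memory plugin: `ToyMemory`, `on_read`, and its `load` signature are hypothetical, and the `disable_actions` argument is not modeled here.

```python
class ToyMemory:
    """Hypothetical memory model; reads fire callbacks unless inspect=False."""

    def __init__(self, data):
        self._data = dict(data)
        self._read_hooks = []

    def on_read(self, callback):
        """Register a callback that plays the role of a mem_read breakpoint."""
        self._read_hooks.append(callback)

    def load(self, addr, inspect=True):
        """Return the value at addr, notifying read hooks when inspect is True."""
        if inspect:
            for hook in self._read_hooks:
                hook(addr)
        return self._data.get(addr, 0)


seen = []  # addresses reported to the read hook
mem = ToyMemory({0x1000: 0x41})
mem.on_read(seen.append)

loud = mem.load(0x1000)                  # analysis read: the hook fires
quiet = mem.load(0x1000, inspect=False)  # same read with the hook suppressed
print(hex(loud), hex(quiet), [hex(a) for a in seen])
```

Both loads return the same value, but only the first one is recorded by the hook, which is the behaviour you want when your own analysis code reads memory and should not trip the breakpoints meant for the program under test.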