This page has been machine-translated from the original page.
This article was written for CTF Advent Calendar 2023.
The previous article was Edwow Math (N30Z30N)'s "Good CTFs, Bad CTFs, Ordinary CTFs (for CTF Advent Calendar 2023) - Learning cyber security by playing and enjoying CTFs."
As for the next article… unfortunately, it looks like nobody is scheduled to write one at the moment, so I will skip that part.
This time, I will try using angr to hook arbitrary symbol functions so that I can partially modify their behavior or override them entirely.
Using the hook_symbol Method in angr
In angr, you can hook specific symbol functions by using the hook_symbol method.
The basic approach is to create a class that inherits from angr.SimProcedure, implement the replacement behavior in its run method, and pass an instance of that class to hook_symbol together with the symbol name of the function you want to hook.
The run method receives the arguments of the hooked function, so you can declare whichever parameters you need.
from angr import Project, SimProcedure

project = Project('examples/fauxware/fauxware')

class BugFree(SimProcedure):
    def run(self, argc, argv):
        print('Program running with argc=%s and argv=%s' % (argc, argv))
        return 0

# this assumes we have symbols for the binary
project.hook_symbol('main', BugFree())

# Run a quick execution!
simgr = project.factory.simulation_manager()
simgr.run()

Reference: Hooks and SimProcedures - angr documentation
By using SimProcedures and hook_symbol, you can, for example, override the behavior of arbitrary functions or library functions and recover the Flag.
Overriding the Return Value of a Symbol Function in angr to Recover the Flag
First, I will use angr to analyze a binary built from the following C code and recover the Flag.
The program compiled from the code below takes a string from standard input and prints Success only if the input is Flag{angr} and the value generated by the rand() function is exactly 0x12345.
// gcc sample1.c -o chal.bin
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
    char flag[16];
    scanf("%15s", flag);
    srand(time(NULL));
    if ((strcmp(flag, "Flag{angr}") == 0) && (rand() == 0x12345)) {
        printf("Success\n");
    } else {
        printf("Failed\n");
    }
    return 0;
}

If you run it normally, the return value of rand() will almost never be exactly 0x12345, so the output is practically always Failed.
In other words, unless the constraint “the return value of rand() is exactly 0x12345” is satisfied, angr cannot identify the correct Flag.
In a case like this, you can recover the Flag by using a script like the following to hook rand and override its return value to 0x12345.
In the example below, after setting the bitvector defined by the symbolic variable flag as standard input, I override the rand function with hook_symbol, which makes it easy to determine that the correct Flag is Flag{angr}.
import angr
import claripy
from logging import getLogger, WARN

getLogger("angr").setLevel(WARN + 1)

class OverrideFunction(angr.SimProcedure):
    # rand() takes no arguments, so run() declares none
    def run(self):
        # The returned value is placed in the return register as-is
        return claripy.BVV(0x12345, 32)

def correct(state):
    return b"Success" in state.posix.dumps(1)

def failed(state):
    return b"Failed" in state.posix.dumps(1)

flag = claripy.BVS('flag', 16*8, explicit_name=True)
proj = angr.Project("./chal.bin", load_options={"auto_load_libs": False})

# Install the hook before exploring; a hook added afterwards has no effect on the search
proj.hook_symbol("rand", OverrideFunction())

state = proj.factory.entry_state(stdin=flag)
simgr = proj.factory.simulation_manager(state)
simgr.explore(find=correct, avoid=failed)

try:
    found = simgr.found[0]
    # print(found.posix.dumps(0))
    print(found.solver.eval(flag, cast_to=bytes))
except IndexError:
    print("Not Found")

Reference: Trying Various Things with angr [yoshi-camp notes] - Let’s Do CTF

Here, the rand function is replaced with the behavior defined in the run method of the OverrideFunction class.
The value returned from run is written to the return register as a value, so no byte-order conversion is needed: returning a 32-bit bitvector that holds 0x12345 is enough. (Converting the integer to little-endian bytes and reading them back as big-endian would produce 0x45230100 instead, which would no longer satisfy the comparison.)

return claripy.BVV(0x12345, 32)

When you run this, the symbolic variable flag at the moment Success is ultimately printed is dumped, and you can identify the correct Flag.
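When deciding which integer a hooked function should return, it helps to see what a little-endian to big-endian round trip does to a value. The sketch below is plain Python and does not use angr; the helper name swap32 is made up for illustration.

```python
def swap32(value: int) -> int:
    """Reverse the byte order of a 32-bit unsigned integer."""
    little = value.to_bytes(4, byteorder="little")
    return int.from_bytes(little, byteorder="big")


original = 0x12345
swapped = swap32(original)

# The round trip changes the numeric value ...
print(hex(original), "->", hex(swapped))
# ... and applying it twice restores the original value.
print(hex(swap32(swapped)))
```

In other words, the two integers only describe the same bytes when they are read with opposite byte orders; as plain numbers they are different.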
Replacing the Behavior of Statically Linked Library Functions
It seems that angr bundles replacement implementations of common library functions such as those in libc.
According to the angr documentation, angr uses these replacements so that even dynamically linked programs can be analyzed correctly.
Reference: Extending the Environment Model - angr documentation
The blog post referenced below notes that when library functions are statically linked into a binary, angr analyzes those library functions too, which slows analysis down.
In such cases, it seems possible to speed up analysis by binding library-function symbols to the replacement implementations bundled with angr, using the hook_symbol method as shown below.
p.hook_symbol("__libc_start_main", angr.SIM_PROCEDURES["glibc"]["__libc_start_main"]())
p.hook_symbol("printf", angr.procedures.libc.printf.printf())
p.hook_symbol("__isoc99_scanf", angr.procedures.libc.scanf.scanf())
p.hook_symbol("strcmp", angr.procedures.libc.strcmp.strcmp())
p.hook_symbol("puts", angr.procedures.libc.puts.puts())

Reference: Trying Various Things with angr [yoshi-camp notes] - Let’s Do CTF
Using the hook Method in angr
When using the SimProcedure described above, it seems that the entire function is hooked during analysis.
On the other hand, if you use a user hook, you can hook a specific location in the code and do things such as rewriting registers.
Reference: Hooks and SimProcedures - angr documentation
Rewriting a Specific Register with the hook Method
Now I will actually try angr’s hook method on a binary compiled from the following code.
// gcc sample2.c -o chal.bin
#include <stdio.h>
#include <string.h>
#include <time.h>

int func()
{
    return 0;
}

int main(void)
{
    int v = func();
    char flag[16];
    scanf("%15s", flag);
    if ((strcmp(flag, "Flag{angr}") == 0) && (v == 1)) {
        printf("Success\n");
    } else {
        printf("Failed\n");
    }
    return 0;
}

The program above checks whether the input string is Flag{angr}, but if the value of v is anything other than 1, Success will not be printed even when the correct Flag is entered.
And the value of v is always 0 because of the func function.
If you want to recover the Flag from a binary like this with angr, you could of course do it the same way as before by overriding func with the hook_symbol method.
This time, however, I deliberately do not want to override the entire function, so I will try to recover the Flag by using the hook method to rewrite only the return-value register eax.
First, use the objdump command or a similar tool in advance to check the address of the code that stores the return value of func into the variable v.

11d8: e8 cc ff ff ff    call 11a9 <func>
11dd: 89 45 dc          mov %eax,-0x24(%rbp)

Next, recover the Flag with the following script by using the hook method to rewrite only the return-value register eax.
The code below is similar to the previous example, but it overrides the 5 bytes of processing starting at address 0x4011d8, where func is called, sets the value of the eax register to 1, and then resumes execution from 0x4011dd onward. (These addresses are the objdump offsets plus 0x400000, the base address at which angr loads a position-independent executable by default.)
import angr
import claripy
from logging import getLogger, WARN

getLogger("angr").setLevel(WARN + 1)

def correct(state):
    if b"Success" in state.posix.dumps(1):
        return True
    return False

def failed(state):
    if b"Failed" in state.posix.dumps(1):
        return True
    return False

flag = claripy.BVS('flag', 16*8, explicit_name=True)
proj = angr.Project("./chal.bin", load_options={"auto_load_libs": False})

@proj.hook(0x4011d8, length=5)
def set_eax(state):
    state.regs.eax = 1

state = proj.factory.entry_state(stdin=flag)
simgr = proj.factory.simulation_manager(state)
simgr.explore(find=correct, avoid=failed)

try:
    found = simgr.found[0]
    # print(found.posix.dumps(0))
    print(found.solver.eval(flag, cast_to=bytes))
except IndexError:
    print("Not Found")

By running this script, you can get past the v == 1 check, so angr is able to recover the correct Flag.
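The hook address and length used in the script can be derived mechanically from the two objdump lines. The following is a minimal sketch in plain Python, assuming that angr maps the position-independent binary at base address 0x400000; the names ASSUMED_BASE, LINE_RE, and parse are made up for this illustration.

```python
import re

# objdump lines copied from the article (offsets are relative to the image base)
OBJDUMP_LINES = [
    "11d8: e8 cc ff ff ff call 11a9 <func>",
    "11dd: 89 45 dc mov %eax,-0x24(%rbp)",
]

# Assumption: angr loads the position-independent binary at this base address
ASSUMED_BASE = 0x400000

# "<offset>: <hex byte> <hex byte> ... <mnemonic>"
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s)+)")


def parse(line):
    """Return (offset, instruction length in bytes) for one objdump line."""
    match = LINE_RE.match(line)
    offset = int(match.group(1), 16)
    length = len(match.group(2).split())
    return offset, length


call_offset, call_length = parse(OBJDUMP_LINES[0])
next_offset, _ = parse(OBJDUMP_LINES[1])

hook_addr = ASSUMED_BASE + call_offset   # where the call instruction starts
hook_len = call_length                   # bytes skipped by the hook
resume_addr = hook_addr + hook_len       # where execution continues afterwards

print(hex(hook_addr), hook_len, hex(resume_addr))
```

The length of the call instruction is exactly the distance to the next instruction, which is why skipping 5 bytes resumes execution at the mov that stores eax into v.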
Solving a CTF Challenge with hook_symbol
Finally, I will solve an actual CTF challenge using angr’s hook_symbol method.
The challenge I will solve with angr this time is SOP from Glacier CTF 2023.
Reference: Gracier CTF 2023 Writeup SOP(Rev)
The challenge itself is themed around SOP (probably Signal Oriented Programming or Sigreturn Oriented Programming). It begins by raising an exception at the end of the main function and then repeatedly calls handler functions defined with signal through the raise function.
Because the program’s actual behavior (checking whether the input is the correct Flag) starts from the exception raised after main ends, normal dynamic analysis is not possible even with gdb, so this was a challenge where you needed to identify the Flag using a Frida hook or static analysis.
This time, as an alternative solution, I will show how to recover the correct Flag by using angr’s hook_symbol to override the behavior of signal and raise, ignoring the binary’s actual behavior.
How to Recover the SOP Flag with angr
This binary is a program that checks whether the input string matches the correct Flag.
For that reason, it is the kind of challenge that would normally be relatively easy to solve with angr’s SimulationManager.
However, the Flag-verification logic in this binary starts from an exception after main finishes and is implemented by following the handler functions defined with signal, so you cannot solve it just by running SimulationManager as-is.
Therefore, after analyzing the binary with Ghidra to identify the relationship between the signal numbers passed as arguments to raise and the handler functions defined in the binary, I use the hook_symbol method to override the program’s behavior so that SimulationManager can identify the Flag.
This eliminates the need for angr to actually analyze the behavior of signal and raise, making it easy to identify the correct Flag.
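The idea of replacing signal and raise with a lookup table can be modeled in a few lines of plain Python. This is only a sketch of the dispatch logic, not angr code; fake_signal, fake_raise, and the two handler functions are hypothetical names used for illustration.

```python
trace = []      # order in which the handlers ran
handlers = {}   # signal number -> handler, like the table kept by the solver


def fake_signal(signum, handler):
    """Stand-in for signal(): remember the handler instead of installing it."""
    handlers[signum] = handler
    return 0


def fake_raise(signum):
    """Stand-in for raise(): call the registered handler directly."""
    handlers[signum](signum)


def second_segv_handler(signum):
    trace.append(("second", signum))


def first_segv_handler(signum):
    trace.append(("first", signum))
    # The handler re-registers SIGSEGV before raising it again,
    # which is why the table has to be updated at run time.
    fake_signal(11, second_segv_handler)
    fake_raise(11)


fake_signal(11, first_segv_handler)
fake_raise(11)
print(trace)
```

Because every raise becomes an ordinary call through the table, the chain of handlers can be followed without modeling real signal delivery.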
Analyzing the Binary
To create a solver with angr, I first use Ghidra to identify the information needed to recover the Flag.
First, in the main function, you can see that the read function takes 0x44 bytes of input from standard input, and the function returns the result of comparing DAT_001061c8 != 0x44.
bool main(void)
{
    size_t sVar1;
    ssize_t sVar2;

    sVar1 = strlen(&DAT_001060f0);
    DAT_001061c8 = (int)sVar1;
    if (DAT_001061c8 == 0) {
        sVar2 = read(0,&DAT_001060f0,0x44);
        DAT_001061c8 = (int)sVar2;
    }
    return DAT_001061c8 != 0x44;
}

This main function is called from __libc_start_main, and the return value of main called by __libc_start_main is passed directly to the exit function.
Reference: __libc_start_main
In other words, when the result of the comparison DAT_001061c8 != 0x44 becomes False, the program is regarded as having terminated abnormally, and the initial exception is triggered.
Next, identify the relationship between signal numbers and handler functions.
Looking through the decompilation results of the functions in Ghidra, you can confirm that the following signal definitions are present.
For some reason there are multiple places that set a handler function for SIGSEGV (0xb), which suggests that the handler function is changed dynamically here.
signal(0xb,FUN_001011e0);
signal(0xb,FUN_00101bc0);
signal(0xb,FUN_00102e60);
signal(0xe,FUN_00101bc0);
signal(0x10,FUN_00101350);
signal(0x11,FUN_001014a0);
signal(0x12,FUN_00102fb0);
signal(0x15,FUN_00101550);
signal(0x16,FUN_00102520);

Next, identify the result when the correct Flag or an incorrect Flag is entered.
This is easy to determine: if you run the program with arbitrary input, it prints the string FAIL.
In short, the output is SUCCESS when the Flag is correct and FAIL when the Flag is incorrect.
Now let’s look more closely at the code where the Flag is verified.
if (0x40 < DAT_001061c8) {
    DAT_001061c8 = DAT_001061c8 - 0x40;
    DAT_00106210 = DAT_00106210 + 0x40;
    DAT_001061c0 = DAT_001061c0 + 0x40;
    raise(0xb);
    return;
}
if (DAT_001061c8 < 0x40) {
    for (DAT_001061cc = 0; DAT_001061cc < DAT_001061c8; DAT_001061cc = DAT_001061cc + 1) {
        *(undefined *)(DAT_00106138 + (ulong)DAT_001061cc) = DAT_00106210[DAT_001061cc];
    }
}
DAT_001060d0 = DAT_001061a4;
DAT_001060d4 = DAT_001061ac;
memcpy(local_58,&DAT_00104050,0x44);
memset(local_168,0,0x110);
local_16c = 0;
do {
    do {
        if (local_16c == 0x44) {
            printf("SUCCESS\n");
            /* WARNING: Subroutine does not return */
            exit(0);
        }
        local_170 = 0;
        getrandom(&local_170,4,2);
        local_170 = local_170 % 0x44;
    } while (local_168[local_170] == 1);
    local_168[local_170] = 1;
    local_16c = local_16c + 1;
} while ((&DAT_00106220)[local_170] == local_58[local_170]);
printf("FAIL\n");

Ignore the first half and focus on the part below.
Here, after taking the random value obtained by the getrandom function modulo 0x44, the result is used as an index to check whether some transformed value derived from the input matches a hard-coded byte sequence used for verification.
do {
    do {
        local_170 = 0;
        getrandom(&local_170,4,2);
        local_170 = local_170 % 0x44;
    } while (local_168[local_170] == 1);
    local_168[local_170] = 1;
    local_16c = local_16c + 1;
} while ((&DAT_00106220)[local_170] == local_58[local_170]);

This is probably part of the dynamic-analysis countermeasures, and it also becomes a factor that interferes with analysis in angr.
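The loop compares every index exactly once, only in a random order, so the verdict does not depend on the order. The sketch below mirrors the decompiled loop in plain Python under that reading; the function names and the sample byte strings are invented for illustration and are not taken from the binary.

```python
import random

N = 0x44  # number of bytes compared by the binary


def check(transformed, expected, next_value):
    """Mirror the decompiled loop: compare each index once, in the order
    produced by next_value(), and stop at the first mismatch."""
    visited = [False] * N
    checked = 0
    while True:
        while True:
            if checked == N:
                return True           # every index matched: SUCCESS
            index = next_value() % N
            if not visited[index]:
                break                 # an index that was not compared yet
        visited[index] = True
        checked += 1
        if transformed[index] != expected[index]:
            return False              # FAIL


def sequential():
    """Index source that yields 0, 1, 2, ... instead of random values."""
    counter = iter(range(N))
    return lambda: next(counter)


expected = bytes(range(N))            # made-up reference bytes
good = bytes(range(N))
bad = bytes([0xFF]) + good[1:]        # differs at index 0 only

rng = random.Random(0)
random_source = lambda: rng.getrandbits(32)

results = [
    check(good, expected, random_source),
    check(good, expected, sequential()),
    check(bad, expected, random_source),
    check(bad, expected, sequential()),
]
print(results)
```

Since both index sources give the same answer, replacing the random source with a simple counter keeps the check intact while removing the nondeterminism.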
Finally, use strace -e trace=signal to trace how the callback functions behave when you actually give the program a string of 0x44 bytes.
$ echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | strace -e trace=signal ./app
rt_sigaction(SIGSEGV, {sa_handler=0x55a8e91ed1e0, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGSTKFLT, {sa_handler=0x55a8e91ed350, sa_mask=[STKFLT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=0x55a8e91ed4a0, sa_mask=[CHLD], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGCONT, {sa_handler=0x55a8e91eefb0, sa_mask=[CONT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xe91ef160} ---
tgkill(29240, 29240, SIGSTKFLT) = 0
--- SIGSTKFLT {si_signo=SIGSTKFLT, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
tgkill(29240, 29240, SIGCHLD) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
tgkill(29240, 29240, SIGCONT) = 0
--- SIGCONT {si_signo=SIGCONT, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
rt_sigaction(SIGSEGV, {sa_handler=0x55a8e91eee60, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=0x55a8e91ed1e0, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, 8) = 0
tgkill(29240, 29240, SIGSEGV) = 0
rt_sigreturn({mask=[SEGV STKFLT CHLD]}) = 0
rt_sigreturn({mask=[SEGV STKFLT]}) = 0
rt_sigreturn({mask=[SEGV]}) = 0
rt_sigreturn({mask=[]}) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
rt_sigaction(SIGTTOU, {sa_handler=0x55a8e91ee520, sa_mask=[TTOU], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGALRM, {sa_handler=0x55a8e91edbc0, sa_mask=[ALRM], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGTTIN, {sa_handler=0x55a8e91ed550, sa_mask=[TTIN], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigaction(SIGSEGV, {sa_handler=0x55a8e91edbc0, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=0x55a8e91eee60, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, 8) = 0
tgkill(29240, 29240, SIGTTIN) = 0
--- SIGTTIN {si_signo=SIGTTIN, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
tgkill(29240, 29240, SIGTTOU) = 0
--- SIGTTOU {si_signo=SIGTTOU, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
tgkill(29240, 29240, SIGSEGV) = 0
rt_sigreturn({mask=[SEGV ALRM TTIN]}) = 0
rt_sigreturn({mask=[SEGV ALRM]}) = 0
rt_sigreturn({mask=[SEGV]}) = -1 EINTR (Interrupted system call)
rt_sigreturn({mask=[]}) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
rt_sigaction(SIGSEGV, {sa_handler=0x55a8e91edbc0, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, {sa_handler=0x55a8e91edbc0, sa_mask=[SEGV], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f047a881520}, 8) = 0
tgkill(29240, 29240, SIGTTIN) = 0
--- SIGTTIN {si_signo=SIGTTIN, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
tgkill(29240, 29240, SIGTTOU) = 0
--- SIGTTOU {si_signo=SIGTTOU, si_code=SI_TKILL, si_pid=29240, si_uid=1000} ---
FAIL
+++ exited with 1 +++

From this result, you can see that SIGSEGV (0xb) is raised first when the main function finishes, and that the RVA of the handler invoked at that point is 0x11e0.
After that, processing continues through the handlers for SIGSTKFLT (0x10), SIGCHLD (0x11), and SIGCONT (0x12).
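The handler addresses printed by strace are run-time addresses, so they have to be converted before they can be compared with Ghidra or used in angr. The following plain-Python sketch does that conversion; the run-time base is inferred from the first handler and its RVA 0x11e0, the Ghidra base 0x100000 is inferred from the name FUN_001011e0, and the angr base 0x400000 is an assumption about angr's default for position-independent binaries.

```python
# sa_handler values copied from the strace output
RUNTIME_HANDLERS = {
    "SIGSEGV": 0x55A8E91ED1E0,    # first SIGSEGV handler
    "SIGSTKFLT": 0x55A8E91ED350,
    "SIGCHLD": 0x55A8E91ED4A0,
    "SIGCONT": 0x55A8E91EEFB0,
}

# Inferred: the first SIGSEGV handler has RVA 0x11e0
RUNTIME_BASE = RUNTIME_HANDLERS["SIGSEGV"] - 0x11E0
GHIDRA_BASE = 0x100000   # inferred from FUN_001011e0
ANGR_BASE = 0x400000     # assumption: angr's default base for PIE binaries

table = {}
for name, address in RUNTIME_HANDLERS.items():
    rva = address - RUNTIME_BASE
    table[name] = (rva, GHIDRA_BASE + rva, ANGR_BASE + rva)

for name, (rva, ghidra, angr_addr) in table.items():
    print(f"{name:10s} rva={rva:#x} ghidra={ghidra:#x} angr={angr_addr:#x}")
```

The Ghidra column lines up with the FUN_... names in the signal calls, and the angr column gives the addresses that a handler table for the solver would need.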
The signals raised throughout the whole process were the following.
They match the signals for which handler functions were registered in the signal calls checked earlier. (Only the handler function for SIGSEGV is reconfigured dynamically.)
SIGSEGV(0xb)
SIGALRM(0xe)
SIGSTKFLT(0x10)
SIGCHLD(0x11)
SIGCONT(0x12)
SIGTTIN(0x15)
SIGTTOU(0x16)

One thing worth noticing here is that SIGALRM, which is not triggered by raise, is still delivered.
It is a slightly tricky implementation, but placing sleep(2) immediately after alarm(1) appears to force SIGALRM to be delivered while the program is sleeping.
signal(0x16,FUN_00102520);
signal(0xe,FUN_00101bc0);
signal(0x15,FUN_00101550);
alarm(1);
sleep(2);

In analysis based on symbolic execution, as done by angr, it may be difficult to track time-based state changes such as alarm and sleep accurately.
For that reason, on the solver side we need to override the sleep function and replace it with processing that forcibly calls the SIGALRM callback function.
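To see why replacing sleep with a direct call to the SIGALRM handler is a reasonable model, the alarm(1) / sleep(2) pattern can be reproduced with Python's standard signal module. This is a Unix-only sketch that has to run in the main thread and takes about two seconds; on_alarm and delivered are names invented for the example.

```python
import signal
import time

delivered = []  # signal numbers seen by the handler


def on_alarm(signum, frame):
    delivered.append(signum)


previous = signal.signal(signal.SIGALRM, on_alarm)
try:
    signal.alarm(1)   # schedule SIGALRM one second from now
    time.sleep(2)     # the alarm expires while the process is sleeping
finally:
    signal.alarm(0)                           # cancel any pending alarm
    signal.signal(signal.SIGALRM, previous)   # restore the old handler

print(delivered == [signal.SIGALRM])
```

From the program's point of view, the handler always runs at some point during the sleep call, which is the behavior the solver imitates by calling the handler from the sleep replacement.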
Building a Solver with hook_symbol
Now that we have gathered all the necessary information, it is finally time to build a solver to obtain the Flag.
This challenge can be solved with the following solver.
import angr
import claripy
from logging import getLogger, WARN

getLogger("angr").setLevel(WARN + 1)

proj = angr.Project("app")
flag = claripy.BVS("flag", 0x44*8)
state = proj.factory.entry_state(stdin=flag)
for i in range(0x44):
    state.solver.add(flag.get_byte(i) >= 0x21)
    state.solver.add(flag.get_byte(i) <= 0x7f)

def correct(state):
    if b"SUCCESS" in state.posix.dumps(1):
        return True
    return False

def failed(state):
    if b"FAIL" in state.posix.dumps(1):
        return True
    return False

class OverrideSignal(angr.SimProcedure):
    def run(self, sigid, handler):
        sigid = sigid.concrete_value
        handler = handler.concrete_value
        self.state.globals["handlers"] = self.state.globals["handlers"].copy()
        self.state.globals["handlers"][sigid] = handler
        return 0

class OverrideRaise(angr.SimProcedure):
    def run(self, sigid):
        sigid = sigid.concrete_value
        self.call(self.state.globals["handlers"][sigid], (sigid,), 0xFFFFFFFF)

class OverrideSleep(angr.SimProcedure):
    def run(self, sigid):
        sigid = 14
        self.call(self.state.globals["handlers"][sigid], (sigid,), 0xFFFFFFFF)

class OverrideGetRandom(angr.SimProcedure):
    def run(self, val):
        res = self.state.globals["i"]
        if res == 0x44:
            print(self.state.posix.dumps(0))
            input("")
        self.state.globals["i"] += 1
        self.state.mem[val].int = res
        return 0

@proj.hook(0x4031F2)
def first_sigsegv(state):
    state.regs.rsp -= 8
    state.regs.rip = state.globals["handlers"][11]
    state.regs.rdi = 11

state.globals["handlers"] = {
    11: 0x4011E0,
    14: 0x401bc0,
    16: 0x401350,
    17: 0x4014A0,
    18: 0x402FB0,
    21: 0x401550,
    22: 0x402520
}
state.globals["i"] = 0

proj.hook_symbol("signal", OverrideSignal(), replace=True)
proj.hook_symbol("raise", OverrideRaise(), replace=True)
proj.hook_symbol("sleep", OverrideSleep(), replace=True)
proj.hook_symbol("getrandom", OverrideGetRandom(), replace=True)

simgr = proj.factory.simulation_manager(state)
while simgr.active:
    simgr.explore(find=correct, avoid=failed)
    if simgr.found:
        print(simgr.found[0].solver.eval(flag, cast_to=bytes))
        break

The implementation looks complicated at first glance, but it is made up of the same building blocks used up to this point.
First, in the opening section of code below, as usual I define the Project, declare the symbolic variable flag, and define the expected output results for the cases where the input is correct and incorrect.
import angr
import claripy
from logging import getLogger, WARN

getLogger("angr").setLevel(WARN + 1)

proj = angr.Project("app")
flag = claripy.BVS("flag", 0x44*8)
state = proj.factory.entry_state(stdin=flag)
for i in range(0x44):
    state.solver.add(flag.get_byte(i) >= 0x21)
    state.solver.add(flag.get_byte(i) <= 0x7f)

def correct(state):
    if b"SUCCESS" in state.posix.dumps(1):
        return True
    return False

def failed(state):
    if b"FAIL" in state.posix.dumps(1):
        return True
    return False

In the next part, I define classes for the four functions that will be overridden with SimProcedure.
This time, I override the signal, raise, sleep, and getrandom functions.
Overriding signal would be unnecessary if all handler functions were fixed, but because the callback function for SIGSEGV is implemented so that it changes dynamically, I override it as well and add the handler function to the dictionary called handlers, which defines the mapping between handler functions and signal numbers.
Also, the randomization of indices by getrandom, which was likely implemented as a dynamic-analysis countermeasure, is overridden so that it returns values incremented by 1 from the beginning, changing the comparison to check the correct Flag from the start in order.
class OverrideSignal(angr.SimProcedure):
    def run(self, sigid, handler):
        sigid = sigid.concrete_value
        handler = handler.concrete_value
        self.state.globals["handlers"] = self.state.globals["handlers"].copy()
        self.state.globals["handlers"][sigid] = handler
        return 0

class OverrideRaise(angr.SimProcedure):
    def run(self, sigid):
        sigid = sigid.concrete_value
        self.call(self.state.globals["handlers"][sigid], (sigid,), 0xFFFFFFFF)

class OverrideSleep(angr.SimProcedure):
    def run(self, sigid):
        sigid = 14
        self.call(self.state.globals["handlers"][sigid], (sigid,), 0xFFFFFFFF)

class OverrideGetRandom(angr.SimProcedure):
    def run(self, val):
        res = self.state.globals["i"]
        if res == 0x44:
            print(self.state.posix.dumps(0))
            input("")
        self.state.globals["i"] += 1
        self.state.mem[val].int = res
        return 0

The section below uses a user hook to override the ret of the main function so that SIGSEGV can be caught.
@proj.hook(0x4031F2)
def first_sigsegv(state):
    state.regs.rsp -= 8
    state.regs.rip = state.globals["handlers"][11]
    state.regs.rdi = 11

Here, besides modifying registers such as the stack pointer, I let execution continue while bypassing the exception by setting rip to the SIGSEGV handler function.
In the following code, I turn the initial values of the handler functions confirmed from the lines that use the signal function into a table.
state.globals["handlers"] = {
    11: 0x4011E0,
    14: 0x401bc0,
    16: 0x401350,
    17: 0x4014A0,
    18: 0x402FB0,
    21: 0x401550,
    22: 0x402520
}

Finally, I hook each symbol function and run analysis with SimulationManager.
state.globals["i"] = 0

proj.hook_symbol("signal", OverrideSignal(), replace=True)
proj.hook_symbol("raise", OverrideRaise(), replace=True)
proj.hook_symbol("sleep", OverrideSleep(), replace=True)
proj.hook_symbol("getrandom", OverrideGetRandom(), replace=True)

simgr = proj.factory.simulation_manager(state)
while simgr.active:
    simgr.explore(find=correct, avoid=failed)
    if simgr.found:
        print(simgr.found[0].solver.eval(flag, cast_to=bytes))
        break

When you run this, angr can identify the correct Flag without analyzing most of the implementation inside each handler function.
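One detail of the solver that is easy to overlook is that OverrideSignal replaces state.globals["handlers"] with a copy before changing it. A plausible reason, assuming that a forked state only receives a shallow copy of the globals mapping (an assumption about angr's behavior that the article does not confirm), is to keep one execution path from changing the handler table of another. The plain-Python sketch below models that situation with ordinary dicts; fork is a hypothetical stand-in for state forking.

```python
def fork(globals_backer):
    """Model of forking a state: the mapping itself is copied, but the
    objects stored in it are shared (a shallow copy)."""
    return dict(globals_backer)


# In-place update: the nested dict is shared, so the parent changes too.
parent = {"handlers": {11: 0x4011E0}}
child = fork(parent)
child["handlers"][11] = 0x402E60
shared = parent["handlers"][11]

# Copy first, then update: the parent keeps its own table.
parent = {"handlers": {11: 0x4011E0}}
child = fork(parent)
child["handlers"] = child["handlers"].copy()
child["handlers"][11] = 0x402E60
isolated = parent["handlers"][11]

print(hex(shared), hex(isolated))
```

With the copy in place, each path through the handlers carries its own view of which function is registered for SIGSEGV.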
Summary
About a year ago, I wrote an article called “We Don’t Know angr Yet,” but this experience made me keenly realize that I still hardly understand angr even now.
I think it can be an extremely powerful analysis tool if used well, so I want to keep studying it.