This page has been machine-translated from the original page.
I participated in ångstromCTF 2021. I joined with the goal of clearing all of the Reversing challenges again this time, but unfortunately I could solve only 3 of the 11 problems…
This time, among the problems I somehow managed to solve, I’m writing a writeup for Infinity Gauntlet, which taught me especially a lot.
About This Article
The content of this article is not intended to encourage actions that violate public order.
Please note in advance that attempting attacks against environments you do not own or are not authorized to use may violate the Act on the Prohibition of Unauthorized Computer Access (Unauthorized Access Prohibition Act).
Also, all opinions expressed here are my own and do not represent any organization I belong to.
About the CTF Writeup Series
I’m writing this series partly for my own study, and I’ll try to explain CTF challenges carefully enough that even beginners can follow along.
To be honest, I think CTFs are a pretty difficult genre for someone to jump into cold. When I first entered a competition, I couldn’t solve a single problem, and even after reading writeups by veteran CTF players, I still understood nothing.
So in this series, as part of my own learning, I aim to explain the steps to obtaining the flag as clearly and carefully as possible.
At the same time, I’m still relatively new to CTFs myself, so if you notice any mistakes, I’d really appreciate it if you pointed them out.
Challenge Overview
All clam needs to do is snap and finite will turn into infinit…
I don’t really understand what the prompt means, but when you run the downloaded executable, it displays the following challenge text and asks for input.
$./infinity_gauntlet
Welcome to the infinity gauntlet!
If you complete the gauntlet, you'll get the flag!
=== ROUND 1 ===
bar(?, 108, 377) = 102484
100
Wrong!There are seven possible patterns for the questions it asks.
// foo function
foo(?, %u) = %u
foo(%u, ?) = %u
foo(%u, %u) = ?
// bar function
bar(?, %u, %u) = %u
bar(%u, ?, %u) = %u
bar(%u, %u, ?) = %u
bar(%u, %u, %u) = ?When you answer one of these correctly, the ROUND value is updated and the next question is displayed.
What I Learned This Time
- How to automate the execution of an interactive program with Python
- How to read a little assembly
Solution
I’ll describe the overall solution first. I especially struggled with step 3, obtaining the FLAG.
- Statically analyze the provided executable to understand where the FLAG string is stored and how it is stored
- Use GDB to understand the details of the
fooandbarfunctions - Statically analyze the provided executable to understand how to obtain the FLAG
- Write a solver that automates answering the questions and retrieving the FLAG
1. Understand where the FLAG string is stored and how it is stored
First, if you try to run the provided executable locally, you get the following error.
$./infinity_gauntlet
Couldn't find a flag file.If you decompile it in Ghidra, you can see that it reads flag.txt from the same directory at runtime.
local_40 = *(long *)(in_FS_OFFSET + 0x28);
setvbuf(stdout,(char *)0x0,2,0);
__stream = fopen("flag.txt","r");
// If reading flag.txt fails
if (__stream == (FILE *)0x0) {
puts("Couldn\'t find a flag file.");
uVar6 = 1;
}When reading flag.txt succeeds, it seems to execute the following processing.
I renamed the variables arbitrarily.
__s = FLAG;
fgets((char *)__s,0x100,__stream);
fclose(__stream);
sVar4 = strcspn((char *)__s,"\n");
iVar1 = (int)sVar4;
FLAG[iVar1] = 0;
if (iVar1 != 0) {
bVar7 = 0;
do {
*__s = *__s ^ bVar7;
bVar7 = bVar7 + 0x11;
__s = __s + 1;
} while (bVar7 != (byte)((char)sVar4 * '\x11'));
...
}As you can tell from the decompiled code, it appears to do the following.
- Take the string obtained from
flag.txtone character at a time - Compute
flag character XOR (0x11 * zero-based character index) - Store the result in the buffer where the encrypted flag is kept
The address of this storage location will show up again later, so it’s worth remembering. I think it’s helpful to give it a label name in Ghidra.
2. Understand the details of the foo and bar functions
Next, to solve the questions it asks, we need to understand their nature.
I’m calling them functions for convenience, but the logic itself was written inside main().
Tracing both of them directly from the assembly source is a lot of work, so I analyzed them with GDB.
From the disassembly results, I found that the following code handles receiving the input and determining whether the answer is correct. So I set a breakpoint at this address and analyzed it in GDB.
0010125a e8 71 fe CALL __isoc99_scanf undefined __isoc99_scanf()
ff ff
0010125f 39 5c 24 0c CMP dword ptr [RSP + local_14c],EBX
00101263 0f 85 c7 JNZ LAB_00101430
01 00 00
00101269 83 c5 01 ADD ebp,0x1
0010126c 48 8d 3d LEA RDI,[s_Correct!_Maybe_round_%d_will_get_001021 = "Correct! Maybe round %d will
bd 0e 00 00
00101273 31 c0 XOR EAX,EAX
00101275 89 ee MOV ESI,ebp
00101277 e8 e4 fd CALL printf int printf(char * __format, ...)By running it several times while bypassing the correct/incorrect branch by rewriting the zero flag, I learned the following in combination with the assembly above.
- At
0x0010125f, it compares the input value with the value in theEBXregister to determine whether the answer is correct. - If the answer is correct,
1is added to the value in theEBPregister.
At this point, by reading the value of the EBX register in GDB when the program checks the answer, you can learn the correct answer and use that as a clue to discover the rules of each function.
(Unfortunately, the process of reverse-engineering each formula would make this article too long, so I’ll omit it.)
After some trial and error, I was able to figure out the logic foo and bar use to determine each value in the problem statement.
foo(A, B) = C
C = A ^ (B + 1) ^ 0x539bar(A, B, C) = D
D = B * (C + 1) + AEvery question the program gives is a fill-in-the-blank version of either the foo or bar formula, so using these formulas lets you answer all of them correctly.
I created an automation script and tried solving the questions.
=== ROUND 44 ===
bar(1160, ?, 58) = 124529
2091
Correct! Maybe round 45 will get you the flag ;)
=== ROUND 45 ===
foo(?, 355) = 38988
39953
Correct! Maybe round 46 will get you the flag ;)
=== ROUND 46 ===
foo(39, ?) = 41440
42237
Correct! Maybe round 47 will get you the flag ;)However, even after answering more than 10,000 questions correctly, I still couldn’t get the FLAG. So I had to continue analyzing the program to figure out how the FLAG can actually be obtained.
3. Understand how to obtain the FLAG
Now let’s think about how the FLAG can be obtained.
Earlier, we confirmed that the ebp register is incremented when you answer correctly, so I guessed that this was related and traced the code with that assumption in mind.
Then I found that it compares the value in ebp with 0x31, and jumps to 0x1504 when it is greater.
0x00001291 83fd31 cmp ebp, 0x31
0x00001294 0f8f6a020000 jg 0x1504So I looked at what happens after 0x1504.
It turns out that once you solve more than 50 questions, the answer to the challenge that would normally be generated randomly (the value in the EBX register) stops being random and is instead created by the following processing!
0x00001504 99 cdq
0x00001505 41f7fe idiv r14d
0x00001508 8d1c2a lea ebx, [rdx + rbp]
0x0000150b 4863d2 movsxd rdx, edx
0x0000150e 0fb6441410 movzx eax, byte [rsp + rdx + 0x10]
0x00001513 0fb6db movzx ebx, bl
0x00001516 c1e308 shl ebx, 8
0x00001519 09c3 or ebx, eax
0x0000151b e98bfdffff jmp 0x12ab ;start of the next problem setupThis is hard to read as-is, so let’s look at Ghidra’s decompiled output.
EBX = (FlagLength % iVar1 + current_correct_count & 0xff) << 8 | \
(uint)FLAGARR[FlagLength % iVar1];It seems that the program stores a value in EBX using the flag string it encrypted at the beginning.
More specifically, EBX stores the result of OR-ing:
- the low 8 bits of
current number of correct answers + flag character position, shifted left by 8 bits - the encrypted flag converted to an
int
In other words, if EBX is 0x9c3f and the current number of correct answers is 152, you can recover the flag character as follows.
EBX 0x9c3f
Correct answers 0x98(152)
1. `0x9c - 0x98 = 4`, so the low byte corresponds to the 4th character
2. The low byte is `0x3f`, so you know the encrypted 4th flag character is `0x3f`
3. The encryption is `flag character XOR (0x11 * zero-based flag index)`, so for the 4th character it is
`0x3f XOR 0x33`
4. Therefore, you can tell that the 4th character of the FLAG is `{`Once you get this far, the rest is easy: write a script that automatically solves the questions while keeping track of the current number of correct answers, and once the count exceeds 50, just run the decode process above to recover the FLAG string.
…that was a lie!! If you stop there, the 16th character onward gets garbled.
I got pretty stuck here at first because I had no idea why this was happening.
After thinking about it for a while, I realized that in this part where the flag characters are XOR-encrypted by multiples of 0x11, the register actually used for the encryption is the cl register.
0x00001190 300a xor byte [rdx], cl
0x00001192 83c111 add ecx, 0x11The cl register is the lower 8 bits of the ecx register, which means the value used in the XOR operation here is also just 1 byte.
When I calculated the multiples of 0x11, I found that the 15th multiple is exactly 0xFF, and from the 16th multiple onward the value no longer fits in 8 bits.
By the time I first ran the solver, I already knew the FLAG was 26 characters long, so I fixed the solver so that it would also XOR with 256, preventing the 16th through 26th characters from becoming garbled, and that finally gave me the FLAG.
4. Write a solver to automate answering the questions and retrieving the FLAG
From here on, this is completely extra as a writeup, but since this was my first time automating the interactive execution of an ELF with a Python script, I wanted to add it.
To run a program interactively from Python, use pexpect.
Usage is very simple: start a process by specifying the CLI command to run, and when a particular string appears in the output, you can provide any input you want at that timing.
Below is a summary of the tips I used this time.
Start the program
child = pexpect.spawn ('command to run', logfile=sys.stdout.buffer)
...
child.close()In the command to run field, enter nc shell.actf.co 21700 or ./executable_name to launch a process for automating interactive command handling.
At this time, setting logfile=sys.stdout.buffer sends the output to standard output, so you can automate the processing while keeping a feel close to running the program directly from the console.
Send input at an arbitrary timing
child.expect(r'string to wait for (regular expression)')
child.sendline('string to send')expect() waits until a string matching the regular expression you passed in appears.
sendline sends a string followed by a newline to the process.
When expect() finds a match, it gives you the following.
before: the string that had been written to standard output before the part that matched the regular expressionafter: the string that matched the regular expressionbuffer: the string that had been written to standard output after the part that matched the regular expression at the moment of the match
In the solver I wrote this time, I matched every line with \n, then branched based on whether the immediately preceding output (the challenge prompt) contained foo or bar, and handled it accordingly.
The Solver I Created
import io
import os
import sys
import time
import re
import pexpect
arr = ["-1" for i in range(50)]
x11 = [i * 17 for i in range(50)]
def revflag(ans, rounds):
pos = (ans >> 8) - rounds
pos = pos
flag = (ans & 0x00ff)
if pos > 15:
flag = chr(flag ^ x11[pos] ^ 256)
else:
flag = chr(flag ^ x11[pos])
print("ans {}".format(hex(ans)))
print("pos {}".format(pos))
print("flag {}".format(flag))
arr[pos] = flag
return
def getfoo(S):
# foo(?, 13) = 11231
reA = r"^foo\(([0-9]{1,9}|\?),"
reB = r",\s([0-9]{1,9}|\?)\)"
reC = r"=\s([0-9]{1,9}|\?)"
A = re.findall(reA, S)[0]
B = re.findall(reB, S)[0]
C = re.findall(reC, S)[0]
return A, B, C
def getbar(S):
# bar(?, 305, 449) = 138744
reA = r"^bar\(([0-9]{1,9}|\?),"
reB = r",\s([0-9]{1,9}|\?),"
reC = r",\s([0-9]{1,9}|\?)\)"
reD = r"=\s([0-9]{1,9}|\?)"
A = re.findall(reA, S)[0]
B = re.findall(reB, S)[0]
C = re.findall(reC, S)[0]
D = re.findall(reD, S)[0]
return A, B, C, D
def foo(A, B, C):
ans = 0
if A == '?':
cd = int(C) ^ 1337
ans = (int(B) + 1) ^ cd
if B == '?':
cd = int(C) ^ 1337
ans = (int(A) ^ cd) - 1
if C == '?':
ans = int(A) ^ (int(B) + 1) ^ 1337
return int(ans)
def bar(A, B, C, D):
ans = 0
if A == '?':
bd = int(B) * (int(C) + 1)
ans = int(D) - bd
if B == '?':
dd = int(D) - int(A)
ans = (dd // (int(C) + 1))
if C == '?':
dd = int(D) - int(A)
ans = (dd // int(B)) - 1
if D == '?':
ans = int(B) * (int(C)+1) + int(A)
return int(ans)
child = pexpect.spawn ('nc shell.actf.co 21700', logfile=sys.stdout.buffer)
counter = 1
while(True):
try:
child.expect(r'\n')
S = str(child.before)
# print(S[2:-3])
if counter < 50:
if "bar" in S:
counter += 1
A, B, C, D = getbar(S[2:-3])
# print(A, B, C, D)
ans = bar(A, B, C, D)
child.sendline(str(ans))
if "foo" in S:
counter += 1
A, B, C = getfoo(S[2:-3])
ans = foo(A, B, C)
child.sendline(str(ans))
else:
if "bar" in S:
A, B, C, D = getbar(S[2:-3])
# print(A, B, C, D)
ans = bar(A, B, C, D)
revflag(ans, counter)
print("Count : {} Ans : {}".format(hex(counter), hex(ans)))
child.sendline(str(ans))
counter += 1
if "foo" in S:
A, B, C = getfoo(S[2:-3])
ans = foo(A, B, C)
revflag(ans, counter)
print("Count : {} Ans : {}".format(hex(counter), hex(ans)))
child.sendline(str(ans))
counter += 1
if counter > 254:
break
except Exception as e:
print(e)
break
child.close()
print("".join(arr))Summary
So, from the Reversing challenges in ångstromCTF 2021, I wrote up my attempt at Infinity Gauntlet.
It took quite a while to solve completely (about 5 hours), but by taking the time to work through this problem carefully, I feel like I deepened my knowledge and understanding of assembly and registers quite a bit.
I plan to keep challenging myself with various Reversing problems in the future.
References & Books I Used While Solving the Challenge
Books
-
- When I’m reading assembly, this is the book I refer to as needed.
-
- It’s aimed at PE modules, but I always use it as a reference when tracing program flow.
- Since it’s a 2004 book, note that some of the content is a little outdated.
-
リバースエンジニアリングツールGhidra実践ガイド ~セキュリティコンテスト入門からマルウェア解析まで~
- This is almost the only book written in Japanese about Ghidra.
- It was written by members who are extremely strong at CTFs, and the content is also very easy to understand (not that I’m saying I fully understand it).