This page has been machine-translated from the original page.
Introduction
I was working on picoCTF 2021, which ran until 2021/3/31. I was aiming to solve all the Reversing challenges, but unfortunately fell short.
Among the picoCTF 2021 Reversing problems, the “ARMssembly” series was extremely educational, so I’ll write a writeup for it here.
What I Learned
- ARM assembly mnemonics
- Key points for manually decompiling assembly code
Approach for the ARMssembly Series
The “ARMssembly” series consisted of 5 problems in total, and the approach was the same for all of them, so I’ll describe it upfront.
The “ARMssembly” series could be solved with the following steps:
- Read through the provided assembly code
- Trace the flow from the main function
- Google things as needed
- Generate assembly from C source code on an ARM-based environment (a Raspberry Pi) and compare with the challenge code
- Solve
Let’s start with the first problem.
ARMssembly 0
Problem
This problem gives two variables and asks for the final output.
Description
What integer does this program print with arguments 4112417903 and 1169092511?
; Challenge code
.arch armv8-a
.file "chall.c"
.text
.align 2
.global func1
.type func1, %function
func1:
sub sp, sp, #16
str w0, [sp, 12]
str w1, [sp, 8]
ldr w1, [sp, 12]
ldr w0, [sp, 8]
cmp w1, w0
bls .L2
ldr w0, [sp, 12]
b .L3
.L2:
ldr w0, [sp, 8]
.L3:
add sp, sp, 16
ret
.size func1, .-func1
.section .rodata
.align 3
.LC0:
.string "Result: %ld\n"
.text
.align 2
.global main
.type main, %function
main:
stp x29, x30, [sp, -48]!
add x29, sp, 0
str x19, [sp, 16]
str w0, [x29, 44]
str x1, [x29, 32]
ldr x0, [x29, 32]
add x0, x0, 8
ldr x0, [x0]
bl atoi
mov w19, w0
ldr x0, [x29, 32]
add x0, x0, 16
ldr x0, [x0]
bl atoi
mov w1, w0
mov w0, w19
bl func1
mov w1, w0
adrp x0, .LC0
add x0, x0, :lo12:.LC0
bl printf
mov w0, 0
ldr x19, [sp, 16]
ldp x29, x30, [sp], 48
ret
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
Reading main
Let’s trace from main first. It’s fairly long, but I focused on these items:
- bl atoi
- bl atoi
- bl func1
- bl printf
bl stands for “Branch with Link” and can be understood as something like a CALL instruction.
For details, see Arm64(ARMv8) Assembly Programming (08) Branch Instructions.
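In short, bl saves the return address in the link register (x30) and jumps to the label written after it; the callee’s ret instruction then returns to that saved address.
The two atoi calls convert the command-line arguments to integers: the add x0, x0, 8 and add x0, x0, 16 before each call step from argv to argv[1] and argv[2]. The results are passed as arguments to func1, and its return value is displayed with printf.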
Verifying with C code
Having read this far, let me write the reverse-engineered C code for the main function and verify my understanding.
Here’s what I came up with:
#include <stdio.h>
#include <stdlib.h>
unsigned int func1(unsigned int n1, unsigned int n2)
{
return 0;
}
int main(char a[128], char b[128]) {
unsigned int n1 = atoi(a);
unsigned int n2 = atoi(b);
unsigned int ans = func1(n1, n2);
printf("%u", ans);
return 0;
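}
Compile this on a Raspberry Pi with GCC using gcc -S sample.c -o sample.lst to generate an assembly listing.
Extracting just the main function gives the following assembly. Its overall shape is close to the challenge code: two atoi calls, then func1, then printf. The challenge code additionally loads argv[1] and argv[2] before each atoi, whereas this simplified main takes the two strings directly.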
main:
.LFB7:
.cfi_startproc
stp x29, x30, [sp, -48]!
.cfi_def_cfa_offset 48
.cfi_offset 29, -48
.cfi_offset 30, -40
mov x29, sp
str x0, [sp, 24]
str x1, [sp, 16]
ldr x0, [sp, 24]
bl atoi
str w0, [sp, 36]
ldr x0, [sp, 16]
bl atoi
str w0, [sp, 40]
ldr w1, [sp, 40]
ldr w0, [sp, 36]
bl func1
str w0, [sp, 44]
ldr w1, [sp, 44]
adrp x0, .LC0
add x0, x0, :lo12:.LC0
bl printf
mov w0, 0
ldp x29, x30, [sp], 48
.cfi_restore 30
.cfi_restore 29
.cfi_def_cfa_offset 0
ret
.cfi_endproc
Reading func1
Next, let’s look at func1, which receives the two arguments. This is the relevant part of the challenge code:
func1:
sub sp, sp, #16
str w0, [sp, 12]
str w1, [sp, 8]
ldr w1, [sp, 12]
ldr w0, [sp, 8]
cmp w1, w0
bls .L2
ldr w0, [sp, 12]
b .L3
.L2:
ldr w0, [sp, 8]
.L3:
add sp, sp, 16
ret
.size func1, .-func1
.section .rodata
.align 3
First, let’s look at the str and ldr instructions.
Simply put, str is a store instruction that writes a register’s value to a specified address.
ldr is a load instruction that reads a value from a specified address into a register.
[sp, 12] is base-plus-offset addressing, one of the CPU’s memory addressing modes: the address is the value of the stack pointer plus the given offset, here sp + 12.
So func1 loads the received argument values, compares them, and branches based on the result.
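To make the str/ldr pairs concrete, here is a small C model of func1’s 16-byte stack frame. It is only an illustration under my own assumptions: the Frame type and the str_w, ldr_w and spill_and_reload_first_arg helpers are hypothetical names that mimic str wN, [sp, off] and ldr wN, [sp, off]; none of them come from the challenge.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the 16 bytes reserved by "sub sp, sp, #16". */
typedef struct {
    uint8_t bytes[16];
} Frame;

/* Models "str wN, [sp, off]": copy a 32-bit register value to sp + off. */
void str_w(Frame *sp, size_t off, uint32_t reg)
{
    memcpy(sp->bytes + off, &reg, sizeof reg);
}

/* Models "ldr wN, [sp, off]": read 32 bits from sp + off into a register. */
uint32_t ldr_w(const Frame *sp, size_t off)
{
    uint32_t reg;
    memcpy(&reg, sp->bytes + off, sizeof reg);
    return reg;
}

/* Replays the first four instructions of func1 and returns what ends up in w1. */
uint32_t spill_and_reload_first_arg(uint32_t w0, uint32_t w1)
{
    Frame frame = {{0}};
    str_w(&frame, 12, w0);   /* str w0, [sp, 12] */
    str_w(&frame, 8, w1);    /* str w1, [sp, 8]  */
    w1 = ldr_w(&frame, 12);  /* ldr w1, [sp, 12] */
    w0 = ldr_w(&frame, 8);   /* ldr w0, [sp, 8]  */
    (void)w0;                /* w0 now holds the second argument */
    return w1;
}
```

After the reload, w1 holds the value that arrived in w0, which is why the slot at [sp, 12] can be read as the first argument.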
Let’s look at the branch instruction bls.
ls stands for “lower or same (<=)“.
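The “lower or same” wording matters because it is an unsigned comparison. As a rough illustration (the function names below are mine, not from the challenge), the first argument 4112417903 has its top bit set, so the same 32 bits read as a signed int would be negative and a signed comparison would give the opposite answer.

```c
#include <stdint.h>

/* Condition tested by "bls": unsigned lower or same. */
int lower_or_same(uint32_t a, uint32_t b)
{
    return a <= b;
}

/* Signed counterpart (the condition "ble" would test), shown for contrast. */
int signed_less_or_equal(int32_t a, int32_t b)
{
    return a <= b;
}
```

With the challenge values, lower_or_same(4112417903u, 1169092511u) is 0, so the bls branch is not taken. By contrast, signed_less_or_equal(-182549393, 1169092511) is 1, where -182549393 is what the bit pattern of 4112417903 means as a 32-bit signed value.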
From this, func1 can be written in C as:
#include <stdio.h>
#include <stdlib.h>
unsigned int func1(unsigned int n1, unsigned int n2)
{
    if (n1 > n2)
    {
        return n1;
    }
    else
    {
        return n2;
    }
}
bls means “branch if lower or same (<=)”, and the jump happens when the C if condition is false, so the C condition is the opposite one: n1 > n2. In other words, func1 returns the larger of its two unsigned arguments.
On AArch64 the first two integer arguments arrive in w0 and w1, and func1 immediately stores them in its own stack frame. The value stored from w0 into [sp, 12] is therefore the first argument (n1), and [sp, 8] holds the second (n2).
Let’s generate the assembly from this code:
func1:
.LFB6:
.cfi_startproc
sub sp, sp, #16
.cfi_def_cfa_offset 16
str w0, [sp, 12]
str w1, [sp, 8]
ldr w1, [sp, 12]
ldr w0, [sp, 8]
cmp w1, w0
bls .L2
ldr w0, [sp, 12]
b .L3
.L2:
ldr w0, [sp, 8]
.L3:
add sp, sp, 16
.cfi_def_cfa_offset 0
ret
.cfi_endproc
Apart from the .cfi directives, the assembly matches the challenge code, confirming our understanding is correct.
Finally, running the compiled binary with the given arguments prints the number that forms the flag.
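If no ARM machine is at hand, the same result can be worked out on any platform. The sketch below assumes that the challenge’s func1 returns the larger of its two unsigned arguments, which is how the cmp/bls sequence reads. parse_u32 and solve are hypothetical helper names. I also swapped atoi for strtoul here, because 4112417903 does not fit in a signed int and the C standard does not define atoi’s behaviour for out-of-range input.

```c
#include <stdint.h>
#include <stdlib.h>

/* func1 as read from the challenge assembly: the larger of two unsigned values. */
uint32_t func1(uint32_t n1, uint32_t n2)
{
    return (n1 > n2) ? n1 : n2;
}

/* Hypothetical helper: parse a decimal string and keep the low 32 bits,
   the way a w register would. */
uint32_t parse_u32(const char *text)
{
    return (uint32_t)strtoul(text, NULL, 10);
}

/* Hypothetical helper: the value the challenge would print for two arguments. */
uint32_t solve(const char *arg1, const char *arg2)
{
    return func1(parse_u32(arg1), parse_u32(arg2));
}
```

For this problem, solve("4112417903", "1169092511") evaluates to 4112417903.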
ARMssembly 1
Problem
This problem asks for the argument that makes the program print “win”. The code is a bit longer than the first problem.
Description
For what argument does this program print win with variables 81, 0 and 3?
.arch armv8-a
.file "chall_1.c"
.text
.align 2
.global func
.type func, %function
func:
sub sp, sp, #32
str w0, [sp, 12]
mov w0, 81
str w0, [sp, 16]
str wzr, [sp, 20]
mov w0, 3
str w0, [sp, 24]
ldr w0, [sp, 20]
ldr w1, [sp, 16]
lsl w0, w1, w0
str w0, [sp, 28]
ldr w1, [sp, 28]
ldr w0, [sp, 24]
sdiv w0, w1, w0
str w0, [sp, 28]
ldr w1, [sp, 28]
ldr w0, [sp, 12]
sub w0, w1, w0
str w0, [sp, 28]
ldr w0, [sp, 28]
add sp, sp, 32
ret
.size func, .-func
.section .rodata
.align 3
.LC0:
.string "You win!"
.align 3
.LC1:
.string "You Lose :("
.text
.align 2
.global main
.type main, %function
main:
stp x29, x30, [sp, -48]!
add x29, sp, 0
str w0, [x29, 28]
str x1, [x29, 16]
ldr x0, [x29, 16]
add x0, x0, 8
ldr x0, [x0]
bl atoi
str w0, [x29, 44]
ldr w0, [x29, 44]
bl func
cmp w0, 0
bne .L4
adrp x0, .LC0
add x0, x0, :lo12:.LC0
bl puts
b .L6
.L4:
adrp x0, .LC1
add x0, x0, :lo12:.LC1
bl puts
.L6:
nop
ldp x29, x30, [sp], 48
ret
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
Reading main
Let’s trace through main. Instructions introduced in previous problems are skipped.
The important part in main is the last section.
Just like the previous problem, it receives an argument and passes it to func (which I call func1 in my C code). It then compares the return value with 0 to decide whether to print “You win!” or “You Lose :(”.
bne .L4 is a conditional branch: ne stands for “not equal”.
In other words, if func’s return value != 0, execution jumps to .L4, which prints the losing message.
Let me write this part out in C.
Verifying with C code
Having read this far, let me write the reverse-engineered C code for main and verify.
Here’s what I came up with:
#include <stdio.h>
#include <stdlib.h>
unsigned int func1(unsigned int n1)
{
return 0;
}
int main(char a[128]) {
unsigned int n1 = atoi(a);
unsigned int ret = func1(n1);
if (ret == 0)
{
printf("win");
}
else
{
printf("lose");
}
return 0;
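}
Compiling this on a Raspberry Pi with gcc -S sample.c -o sample.lst generates the following assembly for main.
Compared with the challenge code, the structure matches: atoi, the function call, cmp with 0, and bne to the losing branch. The most visible difference is that the challenge prints with puts where this version calls printf.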
main:
.LFB7:
.cfi_startproc
stp x29, x30, [sp, -48]!
.cfi_def_cfa_offset 48
.cfi_offset 29, -48
.cfi_offset 30, -40
mov x29, sp
str x0, [sp, 24]
ldr x0, [sp, 24]
bl atoi
str w0, [sp, 40]
ldr w0, [sp, 40]
bl func1
str w0, [sp, 44]
ldr w0, [sp, 44]
cmp w0, 0
bne .L4
adrp x0, .LC0
add x0, x0, :lo12:.LC0
bl printf
b .L5
.L4:
adrp x0, .LC1
add x0, x0, :lo12:.LC1
bl printf
.L5:
mov w0, 0
ldp x29, x30, [sp], 48
.cfi_restore 30
.cfi_restore 29
.cfi_def_cfa_offset 0
ret
.cfi_endproc
Reading func1
Now I know the win condition: if the return value of func1 with the given argument is 0, “win” is displayed.
Let’s look at func1:
func:
sub sp, sp, #32
; 1. Store the argument in [sp, 12]
str w0, [sp, 12]
; 2. Store 81 in [sp, 16]
mov w0, 81
str w0, [sp, 16]
; 3. Store 0 in [sp, 20]
str wzr, [sp, 20]
; 4. Store 3 in [sp, 24]
mov w0, 3
str w0, [sp, 24]
; 5. Load [sp, 20] and [sp, 16], then left-shift
ldr w0, [sp, 20]
ldr w1, [sp, 16]
lsl w0, w1, w0
str w0, [sp, 28]
; 6. Divide result of 5 by [sp, 24]
ldr w1, [sp, 28]
ldr w0, [sp, 24]
sdiv w0, w1, w0
str w0, [sp, 28]
; 7. Subtract the argument passed to func1 from result of 6, then return
ldr w1, [sp, 28]
ldr w0, [sp, 12]
sub w0, w1, w0
str w0, [sp, 28]
ldr w0, [sp, 28]
add sp, sp, 32
ret
I’ve annotated the challenge code to make the variable flow clearer.
- Store the argument in [sp, 12]
- Store 81 in [sp, 16]
- Store 0 in [sp, 20]
- Store 3 in [sp, 24]
Steps 1, 2, and 4 were covered earlier, so I’ll skip them.
Step 3 uses wzr, the 32-bit zero register, which always reads as 0.
This allows storing 0 directly at the specified address without first loading 0 into another register.
Reference: ARM Cortex-A Series Programmer’s Guide for ARMv8-A
Now let me look at the operations after the variable assignments.
- Load [sp, 20] and [sp, 16], then left-shift
lsl is a logical shift left instruction.
Data is shifted left by the specified amount, and vacated bits are filled with 0.
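As a rough C equivalent of lsl w0, w1, w0 (lsl32 is my own name for it): when the shift amount comes from a register, AArch64 uses only the amount modulo the register width, 32 here. Shifting a 32-bit value by 32 or more is undefined in C, so the sketch masks the amount explicitly.

```c
#include <stdint.h>

/* Models "lsl wd, wn, wm": shift left, fill vacated bits with 0,
   shift amount taken modulo 32. */
uint32_t lsl32(uint32_t value, uint32_t amount)
{
    return value << (amount & 31u);
}
```

In this challenge the shift amount is 0, so lsl32(81, 0) leaves 81 unchanged.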
Steps 6 and 7 take the result of this left shift, divide by 3, and subtract the argument. In other words, the argument that makes this final expression equal to 0 is the flag.
We can now solve this easily, but for completeness, let me write func1 in C too.
When I compiled this, I couldn’t exactly reproduce the sdiv part for some reason, but it’s approximately correct:
unsigned int func1(unsigned int n1)
{
unsigned int n2 = 81;
unsigned int n3 = 0;
unsigned int n4 = 3;
int ret;
ret = n2 << n3;
ret = ret / 3;
ret = ret - n1;
return ret;
}
The argument that makes func1 return 0 is the number that forms the flag.
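Here is a sketch that stays a little closer to the challenge assembly and also solves for the argument. It assumes signed int variables and a division by the variable rather than by the literal 3. That is my guess at why sdiv (a signed divide) did not show up in the attempt above, and I have not confirmed it on the same compiler. func mirrors the challenge’s name, and winning_argument is a hypothetical helper.

```c
/* Mirrors the challenge's func: ((81 << 0) / 3) - arg, using signed ints. */
int func(int arg)
{
    int n2 = 81;
    int n3 = 0;
    int n4 = 3;
    int ret;

    ret = n2 << n3;   /* lsl  w0, w1, w0 */
    ret = ret / n4;   /* sdiv w0, w1, w0 */
    ret = ret - arg;  /* sub  w0, w1, w0 */
    return ret;
}

/* Hypothetical helper: func(arg) == 0 exactly when arg == (81 << 0) / 3. */
int winning_argument(void)
{
    return (81 << 0) / 3;
}
```

(81 << 0) / 3 is 27, so func(27) returns 0 and the program takes the winning branch.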
Summary
Carefully working through assembly code and rewriting it in C by hand was a very enjoyable experience. Thanks to the problem authors.
I may add writeups for the remaining problems later.
As a side note, here are some books I recommend for studying assembly:
Recommended Books
-
- Pros: Covers pretty much everything. I used this book as a reference when reading ARM assembly.
- Cons: Thick, expensive, seems difficult. I haven’t finished it yet lol.
-
- Pros: Misaki-chan is cute. Reads like a story. Easy to understand for beginners.
- Cons: Old (Windows XP era)
-
- Pros: Recommended especially for people who have “absolutely no idea about assembly.” Easy to understand with 8-bit assembly, and lots of practice problems (for the basic IT engineer exam) and explanations are available.
- Cons: Can’t verify behavior by writing your own C source code. Differences from 32-bit assembly need to be studied separately.