
How to Generate Symbol Files (.pdb) in a Linux Environment Using llvm-mingw

This page has been machine-translated from the original page.

My goal is to become proficient with WinDbg for Windows debugging and dump-based troubleshooting.

For a full list of articles on Windows debugging and dump analysis with WinDbg, see the index page:

Reference: Debugging and Troubleshooting Techniques with WinDbg

This article introduces the build environment used to compile the sample programs featured in the WinDbg series above.

I’ll be setting up the environment on Ubuntu 20.04 running on WSL2 to satisfy the following requirements. The setup uses Docker containers, so it should work in any environment where Docker is available.

  1. Cross-compile EXE files in a Linux environment
  2. Generate symbol files (.pdb files) in a Linux environment


Setting Up the Build Environment

The sample programs for WinDbg testing are hosted in the following repository:

Reference: kash1064/Try2WinDbg

First, clone the repository into any directory on an OS where Docker is available.

git clone https://github.com/kash1064/Try2WinDbg

Next, pull the following container image for compilation:

docker pull kashiwabayuki/try2windbg:1.0

Reference: kashiwabayuki/try2windbg

This container image is a customized version of the mstorsjo/llvm-mingw image. Details are described later.

Once the repository and container image have been downloaded, change into the Try2WinDbg directory.

Run the following commands to generate compiled EXE files and symbol files directly under Try2WinDbg/src:

cd Try2WinDbg

# Specify the container image to use for the build
CONTAINER=kashiwabayuki/try2windbg:1.0
# Quote the mount path so the command also works when the path contains spaces
docker run --rm -it -v "$(pwd)/src:/try2windbg" "$CONTAINER" bash -c "cd /try2windbg && make"

The environment setup is now complete.
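The build step above can also be wrapped in a small script. This is only a minimal sketch and is not part of the Try2WinDbg repository: the function names src_dir and run_build and the "build" argument are hypothetical. It checks that the src directory exists before mounting it and omits -it so it can run from non-interactive shells.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the build step; all names are illustrative.
IMAGE="${IMAGE:-kashiwabayuki/try2windbg:1.0}"

# Print the absolute path of <repo>/src, or fail if it does not exist.
src_dir() {
    if [ ! -d "$1/src" ]; then
        echo "error: $1/src not found; run from the Try2WinDbg checkout" >&2
        return 1
    fi
    (cd "$1/src" && pwd)
}

# Run make inside the container with the src directory mounted.
run_build() {
    local src
    src="$(src_dir "$1")" || return 1
    docker run --rm -v "${src}:/try2windbg" "${IMAGE}" \
        bash -c "cd /try2windbg && make"
}

# Only start the build when explicitly asked to: ./build.sh build
if [ "${1:-}" = "build" ]; then
    run_build "$(pwd)"
fi
```

Saved as, say, build.sh in the Try2WinDbg directory, it would be invoked as `bash build.sh build`.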

What Are Symbol Files (.pdb Files)?

Files with the .pdb extension are called symbol files.

PDB stands for Program Database. A PDB file maps identifiers and statements in a project’s source code to the corresponding identifiers and instructions in the compiled application.

Using symbol files makes it significantly more efficient to analyze applications and processes with a debugger.

Analysis is still possible without symbol files, but there is a notable difference in how the debugger displays information for the same address depending on whether an appropriate symbol file is loaded:

sample+0x110     # Without symbol file
sample!main+0x10 # With symbol file

Loading symbol files properly allows you to quickly identify suspect locations, infer behavior from function names, and debug more efficiently overall.
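To make the difference concrete, the lookup a debugger performs can be sketched as a "nearest preceding symbol" search. This is a deliberately simplified illustration, not how WinDbg is implemented: the symbol table contents, the helper symbol, and the symbolize function are all hypothetical, and a real PDB file holds far more than offsets and names.

```shell
#!/usr/bin/env bash
# Hypothetical symbol table: "<offset-from-module-base> <name>" per line.
# A real debugger reads this kind of mapping from the PDB file.
SYMBOLS='0x100 main
0x180 helper'

# Print "module!symbol+0xNN" using the nearest symbol at or below the offset,
# or "module+0xNN" when no symbol table is supplied.
symbolize() {
    local module="$1" offset="$2" table="${3:-}"
    local best_name="" best_off=0 off name
    while read -r off name; do
        [ -n "$off" ] || continue
        if [ $((off)) -le $((offset)) ] && [ $((off)) -ge "$best_off" ]; then
            best_off=$((off))
            best_name="$name"
        fi
    done <<< "$table"
    if [ -n "$best_name" ]; then
        printf '%s!%s+0x%x\n' "$module" "$best_name" $((offset - best_off))
    else
        printf '%s+0x%x\n' "$module" $((offset))
    fi
}

symbolize sample 0x110              # prints sample+0x110
symbolize sample 0x110 "$SYMBOLS"   # prints sample!main+0x10
```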

Reference: Specify symbol (.pdb) and source files in the Visual Studio debugger | Microsoft Docs

How to Generate Symbol Files During Compilation in a Linux Environment

Symbol files are critically important when debugging Windows applications. With Microsoft's toolchain, for example a build using Visual Studio's default project settings, they are generated at build time.

However, when cross-compiling in a Linux environment with a GCC-based MinGW toolchain, PDB files are not produced, because GCC and GNU binutils emit DWARF debug information rather than PDB.

Some resources, such as the Stack Overflow thread linked below, suggest using cv2pdb to create symbol files for MinGW cross-compiled EXE files. However, cv2pdb is a Windows program and, as far as I can tell, depends on Microsoft's PDB libraries, so this approach does not work on Linux.

Reference: c++ - how to generate pdb files while building library using mingw? - Stack Overflow

Therefore, I used llvm-mingw instead.

llvm-mingw is a mingw-w64 toolchain based on LLVM/Clang/LLD.

Reference: mstorsjo/llvm-mingw: An LLVM/Clang/LLD based mingw-w64 toolchain

In short, LLVM is a compiler infrastructure built around a language-independent intermediate representation, with back ends for many target architectures. Clang is its C/C++ compiler front end, and LLD is its linker.

llvm-mingw is essentially a mingw-w64 toolchain in which GCC and the GNU binutils have been replaced by their LLVM counterparts (Clang, LLD, and the LLVM binary utilities).

This makes it possible to compile for multiple computer architectures (i686, x86_64, armv7, arm64) with a single toolchain, and also enables generating symbol files in PDB format.
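As a rough illustration of the single-toolchain point, the loop below picks a compiler wrapper for each architecture in the list above. It is a sketch under assumptions: the compiler_for function and the sample-<arch> output names are hypothetical, and the wrapper names assume llvm-mingw's <arch>-w64-mingw32 triple naming, in which arm64 is spelled aarch64. When a wrapper or sample.cpp is missing, the loop only prints the command it would run.

```shell
#!/usr/bin/env bash
# Map an architecture name to the assumed llvm-mingw C++ compiler wrapper.
compiler_for() {
    case "$1" in
        i686|x86_64|armv7) echo "$1-w64-mingw32-g++" ;;
        arm64|aarch64)     echo "aarch64-w64-mingw32-g++" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

for arch in i686 x86_64 armv7 arm64; do
    cxx="$(compiler_for "$arch")"
    if command -v "$cxx" >/dev/null 2>&1 && [ -f sample.cpp ]; then
        # Build one EXE and one PDB per architecture
        "$cxx" -Wl,-pdb="sample-$arch.pdb" sample.cpp -o "sample-$arch.exe"
    else
        echo "would run: $cxx -Wl,-pdb=sample-$arch.pdb sample.cpp -o sample-$arch.exe"
    fi
done
```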

Preparing the llvm-mingw Environment

The easiest way to get an llvm-mingw environment is to use the official Docker image, mstorsjo/llvm-mingw.

In most cases, simply pulling this image from Docker Hub is all you need.

If you need to set up the llvm-mingw environment directly on a Linux host rather than in a Docker container, you can refer to the scripts in the following Dockerfile:

Reference: llvm-mingw/Dockerfile.cross at master · mstorsjo/llvm-mingw

Compiling a C++ File with a PDB File Using llvm-mingw

Here is how to use llvm-mingw.

In the official Docker image, the compiler is already on the PATH as x86_64-w64-mingw32-g++. Despite the GCC-style name, this command is a wrapper that invokes Clang.

Pass the -Wl,-pdb=<filename>.pdb option on the command line to generate both an EXE file and a symbol file in one step. The -Wl, prefix forwards the option to the linker (LLD), which writes the PDB:

x86_64-w64-mingw32-g++ -Wl,-pdb=sample.pdb sample.cpp -o sample.exe
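If the PDB should also carry type and line-number information, my understanding from the llvm-mingw README is that the compile step additionally needs -g -gcodeview. The helper below is a hedged sketch of that variant: the build_with_pdb function is hypothetical, and you should confirm the flags against the README for the toolchain version you use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compile one C++ source into <name>.exe and <name>.pdb.
build_with_pdb() {
    local src="$1" cxx="${CXX:-x86_64-w64-mingw32-g++}"
    local base="${src%.cpp}"
    # -g -gcodeview : emit CodeView debug info (assumed per the llvm-mingw README)
    # -Wl,-pdb=...  : ask the linker to write the PDB file
    "$cxx" -g -gcodeview -Wl,-pdb="${base}.pdb" "$src" -o "${base}.exe" || return 1
    # Succeed only if both outputs exist
    [ -f "${base}.exe" ] && [ -f "${base}.pdb" ]
}

# Build only when the compiler and the source file are actually available
if command -v "${CXX:-x86_64-w64-mingw32-g++}" >/dev/null 2>&1 && [ -f sample.cpp ]; then
    build_with_pdb sample.cpp && echo "generated sample.exe and sample.pdb"
fi
```

Checking for both output files after the build catches the case where linking succeeds but the PDB option was silently dropped.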

Wrap-up

In this article, I summarized how to generate a debug symbol file when cross-compiling a Windows EXE in a Linux environment.