
Magical WinDbg VOL.1 [Chapter 5: Analyzing a Full Memory Dump from a System Crash]

This page has been machine-translated from the original page.

In Chapter 4, we analyzed a simple application crash dump.

In Chapter 5, which follows, we will analyze a full memory dump collected when a simple system crash occurred.


Triggering a system crash and creating a full memory dump

As we saw in Chapter 4, when an exception occurs in the processing of an application running in user mode, the application crashes and an application crash dump is generated.

Similarly, if an unrecoverable error occurs in code running in kernel mode, Windows stops with a system crash known as a BSOD (Blue Screen of Death) and generates a system crash dump.

A typical BSOD screen

There are several kinds of system crash dumps. The full memory dump we will capture this time contains the contents of every page of physical memory accessible to the Windows system running on that machine.

A full memory dump is not generated under Windows default settings, but if your environment has been configured to collect dump files according to the procedure in Chapter 1, the setting to capture a full memory dump is already in place.

Therefore, we will use D4C.exe to obtain a full memory dump for analysis.

First, run the D4C.exe downloaded in Chapter 1, enter 3 in the menu shown at the prompt, and press Enter.

Trigger a system crash with D4C.exe

When you execute option 3 in D4C.exe, a system crash occurs and the machine automatically reboots.

After the system restarts, if a file named FULL_MEMORY.DMP has been created directly under the C:\Windows folder, the full memory dump has been successfully captured.

Loading the full memory dump in WinDbg

When analyzing a full memory dump, just as with an application crash dump, launch the 64-bit version of WinDbg as administrator and use the [Ctrl + D] shortcut to load the dump file you captured from that folder.

Once loading finishes and the message "For analysis of this file, run !analyze -v" appears, the dump is ready. (Depending on the size of the full memory dump file, loading may take a little time.)

Load the full memory dump into WinDbg

Analyzing the crash dump with the !analyze extension

As with application crash dump investigations, the !analyze -v command is extremely useful when investigating a system crash.

So let’s go through the output you get by running !analyze -v in the Command window, section by section.

The first section shows the analysis of the bug check data, which can also be obtained with the .bugcheck command, as shown below.

CRITICAL_PROCESS_DIED (ef)
        A critical system process died
Arguments:
Arg1: ffffc18f36303080, Process object or thread object
Arg2: 0000000000000000, If this is 0, a process died. If this is 1, a thread died.
Arg3: 0000000000000000, The process object that initiated the termination.
Arg4: 0000000000000000

By referring to the output above, we can see that the cause of the system crash is CRITICAL_PROCESS_DIED.

If you actually run the .bugcheck command in WinDbg, you get the following output.

6: kd> .bugcheck
Bugcheck code 000000EF
Arguments ffffc18f`36303080 00000000`00000000 00000000`00000000 00000000`00000000

Windows bug check codes are published in the official documentation below.

If you look up the value 000000EF in that reference, you can see that it matches the bug check code for CRITICAL_PROCESS_DIED shown by !analyze -v.

Also, this first section of the !analyze -v output is equivalent to what you get by analyzing the bug check with the !analyze -show <bug check code> <Arg1> command.


Bug check codes:

https://learn.microsoft.com/ja-jp/windows-hardware/drivers/debugger/bug-check-code-reference2#bug-check-codes


From the output of these commands, we can see that the system crash was CRITICAL_PROCESS_DIED (0xef), caused by the termination of a critical Windows system process.

So what exactly is meant by a critical Windows system process?

Critical Windows system processes include built-in system processes such as csrss.exe, wininit.exe, logonui.exe, smss.exe, services.exe, conhost.exe, and winlogon.exe.


Bug Check 0xEF: CRITICAL_PROCESS_DIED

https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0xef--critical-process-died


In addition, because the value of Arg2 is 0, we can tell that the cause was the termination of a process rather than a thread, and that the actual stopped process object exists at 0xffffc18f36303080.
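The reading of the bug check data described above can be sketched in Python. This is only an illustration, not something WinDbg provides: the function names and the one-entry lookup table are hypothetical, and the sample text is the .bugcheck output shown earlier.

```python
import re

# Hand-picked subset of the bug check code reference (assumption: extend as needed).
BUGCHECK_NAMES = {
    0xEF: "CRITICAL_PROCESS_DIED",
}


def parse_bugcheck(text):
    """Return (code, [arg1..arg4]) parsed from the output of the .bugcheck command."""
    code = int(re.search(r"Bugcheck code\s+([0-9A-Fa-f]+)", text).group(1), 16)
    args_line = re.search(r"Arguments\s+(.+)", text).group(1)
    # WinDbg separates the upper and lower 32 bits of a 64-bit value with a backtick.
    args = [int(token.replace("`", ""), 16) for token in args_line.split()]
    return code, args


def describe_critical_process_died(args):
    """Interpret Arg1 and Arg2 of bug check 0xEF: Arg2 is 0 for a process, 1 for a thread."""
    kind = {0: "process", 1: "thread"}.get(args[1], "unknown object")
    return f"a {kind} died; object at {args[0]:#018x}"


sample = """Bugcheck code 000000EF
Arguments ffffc18f`36303080 00000000`00000000 00000000`00000000 00000000`00000000"""

code, args = parse_bugcheck(sample)
print(BUGCHECK_NAMES.get(code, "unknown"), "-", describe_critical_process_died(args))
```

With the sample above, the script reports CRITICAL_PROCESS_DIED and that a process object at 0xffffc18f36303080 died, which agrees with the Arg1 and Arg2 descriptions printed by !analyze -v.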

If we inspect the information managed by the EPROCESS structure at that address, we can identify the stopped system process as svchost.exe with PID 0x2dc. (The value 0x3bc shown as ParentCid is the PID of its parent process.)

6: kd> !process ffffc18f`36303080 0
PROCESS ffffc18f36303080
    SessionId: 0  Cid: 02dc    Peb: 27dedd6000  ParentCid: 03bc
    DirBase: 7fbd5002  ObjectTable: ffffe50cc9052800  HandleCount: 1465.
    Image: svchost.exe

6: kd> dt nt!_EPROCESS 0xffffc18f`36303080
   {{ omitted }}
   +0x440 UniqueProcessId  : 0x00000000`000002dc Void
   +0x5a8 ImageFileName    : [15]  "svchost.exe"
   {{ omitted }}
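When reading the !process output, it is easy to mix up the Cid field (the PID of the process itself) and the ParentCid field (the PID of its parent). The following Python sketch pulls both values out of the text shown above. The function name is hypothetical, and the parsing assumes the field layout printed in this dump.

```python
import re


def parse_process_header(text):
    """Extract the image name, PID (Cid) and parent PID (ParentCid) from !process output."""
    # \b keeps "Cid:" from matching the tail of "ParentCid:".
    cid = re.search(r"\bCid:\s*([0-9a-fA-F]+)", text)
    parent = re.search(r"\bParentCid:\s*([0-9a-fA-F]+)", text)
    image = re.search(r"Image:\s*(\S+)", text)
    return {
        "image": image.group(1),
        "pid": int(cid.group(1), 16),
        "parent_pid": int(parent.group(1), 16),
    }


sample = """PROCESS ffffc18f36303080
    SessionId: 0  Cid: 02dc    Peb: 27dedd6000  ParentCid: 03bc
    DirBase: 7fbd5002  ObjectTable: ffffe50cc9052800  HandleCount: 1465.
    Image: svchost.exe"""

info = parse_process_header(sample)
print(f"{info['image']} PID={info['pid']:#x} parent={info['parent_pid']:#x}")
```

The PID obtained this way (0x2dc) matches the UniqueProcessId field shown by dt nt!_EPROCESS.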

As you can see, just by analyzing the bug check based on the initial output of !analyze -v, we were already able to get very close to the cause of the system crash.

Let’s continue to the next section.

The section after the bug check analysis also contains very interesting information. From CriticalProcessDied.Process, we can again see that the crashed process was svchost.exe.

Debugging Details:
------------------

KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.mSec
    Value: 3765

    Key  : Analysis.DebugAnalysisManager
    Value: Create

    Key  : Analysis.Elapsed.mSec
    Value: 4092

    Key  : Analysis.Init.CPU.mSec
    Value: 78343

    Key  : Analysis.Init.Elapsed.mSec
    Value: 78474127

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 128

    Key  : CriticalProcessDied.ExceptionCode
    Value: 42bd7080

    Key  : CriticalProcessDied.Process
    Value: svchost.exe

    Key  : WER.OS.Branch
    Value: vb_release

    Key  : WER.OS.Timestamp
    Value: 2019-12-06T14:06:00Z

    Key  : WER.OS.Version
    Value: 10.0.19041.1

The following section outputs the information below.

FILE_IN_CAB:  FULL_MEMORY.DMP

BUGCHECK_CODE:  ef

BUGCHECK_P1: ffffc18f36303080

BUGCHECK_P2: 0

BUGCHECK_P3: 0

BUGCHECK_P4: 0

PROCESS_NAME:  svchost.exe

CRITICAL_PROCESS:  svchost.exe

ERROR_CODE: (NTSTATUS) 0x42bd7080 - <Unable to get error code text>

BLACKBOXBSD: 1 (!blackboxbsd)

BLACKBOXNTFS: 1 (!blackboxntfs)

BLACKBOXPNP: 1 (!blackboxpnp)

BLACKBOXWINLOGON: 1

The information from BUGCHECK_CODE through BUGCHECK_P4 matches the bug check information we saw in the first section.

Also, CRITICAL_PROCESS tells us that the stopped process that caused the system crash was svchost.exe.

The next section is the stack backtrace.

Here, the output is equivalent to running the .cxr; .ecxr ; kb command.

STACK_TEXT:  
ffffa209`4982f838 fffff801`2d70e592 : {{ omitted }} : nt!KeBugCheckEx
ffffa209`4982f840 fffff801`2d616045 : {{ omitted }} : nt!PspCatchCriticalBreak+0x10e
ffffa209`4982f8e0 fffff801`2d4819b0 : {{ omitted }} : nt!PspTerminateAllThreads+0x15e655
ffffa209`4982f950 fffff801`2d4817ac : {{ omitted }} : nt!PspTerminateProcess+0xe0
ffffa209`4982f990 fffff801`2d2105f5 : {{ omitted }} : nt!NtTerminateProcess+0x9c
ffffa209`4982fa00 00007ff8`7accd3d4 : {{ omitted }} : nt!KiSystemServiceCopyEnd+0x25
00000087`d59ef568 00000000`00000000 : {{ omitted }} : ntdll!NtTerminateProcess+0x14

The KeBugCheckEx function at the top of the stack backtrace is the function that directly triggers a system crash.

The KeBugCheckEx function receives a stop code and four parameters whose meanings depend on that stop code.

That corresponds to the bug check information we just examined with the .bugcheck command and similar output.

Conversely, the NtTerminateProcess function at the bottom of the stack backtrace is, as documented below, a function normally used when a user-mode application calls an API to terminate a process.


ZwTerminateProcess function (ntddk.h):

https://learn.microsoft.com/ja-jp/windows-hardware/drivers/ddi/ntddk/nf-ntddk-zwterminateprocess


Considering the stack backtrace output together with the bug check information, we can conclude that it is highly likely that some user-mode application terminated svchost.exe, which is a critical system process, and that this caused the system crash.

In the following output, which appears in the last section of !analyze -v, FAILURE_BUCKET_ID is also shown as 0xEF_svchost.exe_BUGCHECK_CRITICAL_PROCESS_42bd7080_ntdll!NtTerminateProcess, so we can judge that this system crash occurred because svchost.exe was terminated by the NtTerminateProcess function.

SYMBOL_NAME:  ntdll!NtTerminateProcess+14

MODULE_NAME: ntdll

IMAGE_NAME:  ntdll.dll

STACK_COMMAND:  .cxr; .ecxr ; kb

BUCKET_ID_FUNC_OFFSET:  14

FAILURE_BUCKET_ID:  0xEF_svchost.exe_BUGCHECK_CRITICAL_PROCESS_42bd7080_ntdll!NtTerminateProcess

OS_VERSION:  10.0.19041.1

BUILDLAB_STR:  vb_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {f6ece2b4-3d35-e4e4-9739-fdbc46a086b0}

Followup:     MachineOwner

Identifying the process that caused the crash

From the output of !analyze -v, we were able to identify the exception that directly caused the crash.

However, the stack backtrace shown by !analyze -v did not include any frames for the callers of ntdll!NtTerminateProcess+0x14, so we still could not determine exactly what caused the exception.

This is because the context associated with this exception is the process context of the terminated svchost.exe.

You can check the current process context associated with the exception by running .ecxr; !peb. (The !peb extension displays the PEB (Process Environment Block) corresponding to the current process context.)

Because it was difficult to investigate further from the exception-related context selected by .ecxr, we need to switch the context to the process that may have caused the crash in order to proceed with a more detailed investigation.

In this case, that process is D4C.exe.

If you display thread information for the exception-related context with the .ecxr; !thread command, you can see that Owning Process is D4C.exe, and the address of the process object is shown as 0xffffc18f45284080.

Investigating the thread where the crash occurred

So, use the .process /r /P 0xffffc18f45284080 command to change the debugger’s process context to the D4C.exe process.

If you run the !peb command again after changing the process context, you can confirm that the displayed information changes from the PEB for svchost.exe to the PEB for D4C.exe.

This process-context switch with .process /r /P <process object (EPROCESS) address> is a command that appears frequently when analyzing full memory dumps, so it is worth remembering.

Finally, with the process context changed to D4C.exe, run the k command again to output the stack backtrace. You can now inspect frames from before ntdll!NtTerminateProcess+0x14 was called.

6: kd> k
 # Child-SP          RetAddr               Call Site
00 ffffa209`4982f838 fffff801`2d70e592     nt!KeBugCheckEx
01 ffffa209`4982f840 fffff801`2d616045     nt!PspCatchCriticalBreak+0x10e
02 ffffa209`4982f8e0 fffff801`2d4819b0     nt!PspTerminateAllThreads+0x15e655
03 ffffa209`4982f950 fffff801`2d4817ac     nt!PspTerminateProcess+0xe0
04 ffffa209`4982f990 fffff801`2d2105f5     nt!NtTerminateProcess+0x9c
05 ffffa209`4982fa00 00007ff8`7accd3d4     nt!KiSystemServiceCopyEnd+0x25
06 00000087`d59ef568 00007ff8`789643d0     ntdll!NtTerminateProcess+0x14
07 00000087`d59ef570 00007ff6`e99012d3     KERNELBASE!TerminateProcess+0x30
08 00000087`d59ef5a0 00007ff6`e9901a40     D4C+0x12d3
09 00000087`d59ef860 00007ff8`7a497344     D4C+0x1a40
0a 00000087`d59ef8a0 00007ff8`7ac826b1     KERNEL32!BaseThreadInitThunk+0x14
0b 00000087`d59ef8d0 00000000`00000000     ntdll!RtlUserThreadStart+0x21

From this result, we can infer that the instruction immediately before offset 0x12d3 in D4C.exe likely called the TerminateProcess API and terminated svchost.exe.
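The way we read this backtrace, walking up from the top until the first frame that does not belong to an operating-system module, can be sketched in Python. The list of modules treated as system components is an assumption made for this example, and the function names are hypothetical.

```python
import re

# Modules treated as operating-system components in this sketch (an assumption).
SYSTEM_MODULES = {"nt", "ntdll", "kernelbase", "kernel32"}

# Frame number, Child-SP, RetAddr, Call Site (call sites containing spaces are not handled).
FRAME_RE = re.compile(r"^([0-9a-f]{2})\s+(\S+)\s+(\S+)\s+(\S+)$")

K_OUTPUT = """ # Child-SP          RetAddr               Call Site
00 ffffa209`4982f838 fffff801`2d70e592     nt!KeBugCheckEx
01 ffffa209`4982f840 fffff801`2d616045     nt!PspCatchCriticalBreak+0x10e
02 ffffa209`4982f8e0 fffff801`2d4819b0     nt!PspTerminateAllThreads+0x15e655
03 ffffa209`4982f950 fffff801`2d4817ac     nt!PspTerminateProcess+0xe0
04 ffffa209`4982f990 fffff801`2d2105f5     nt!NtTerminateProcess+0x9c
05 ffffa209`4982fa00 00007ff8`7accd3d4     nt!KiSystemServiceCopyEnd+0x25
06 00000087`d59ef568 00007ff8`789643d0     ntdll!NtTerminateProcess+0x14
07 00000087`d59ef570 00007ff6`e99012d3     KERNELBASE!TerminateProcess+0x30
08 00000087`d59ef5a0 00007ff6`e9901a40     D4C+0x12d3
09 00000087`d59ef860 00007ff8`7a497344     D4C+0x1a40
0a 00000087`d59ef8a0 00007ff8`7ac826b1     KERNEL32!BaseThreadInitThunk+0x14
0b 00000087`d59ef8d0 00000000`00000000     ntdll!RtlUserThreadStart+0x21"""


def parse_frames(k_output):
    """Return a list of (frame number, module name, call site) tuples."""
    frames = []
    for line in k_output.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skips the column header line
        site = match.group(4)
        module = re.split(r"[!+]", site, maxsplit=1)[0]
        frames.append((int(match.group(1), 16), module, site))
    return frames


def first_non_system_frame(frames):
    """Return the first frame, counting from the top, outside SYSTEM_MODULES."""
    for frame in frames:
        if frame[1].lower() not in SYSTEM_MODULES:
            return frame
    return None


print(first_non_system_frame(parse_frames(K_OUTPUT)))
```

For this backtrace the result is frame 08, D4C+0x12d3, the same frame singled out in the text.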

Reading the instructions before and after the crash

From the analysis so far, we have confirmed that the instruction immediately before offset 0x12d3 in D4C.exe most likely called the TerminateProcess API and caused the system crash.

Next, we will investigate what the instructions before and after offset 0x12d3 actually looked like.

As in Chapter 4, we use the u command and the Disassembly window to retrieve the instructions around offset 0x12d3.

00007ff6`e99012b0 448b442448      mov     r8d,dword ptr [rsp+48h]
00007ff6`e99012b5 8d48f5          lea     ecx,[rax-0Bh]
00007ff6`e99012b8 33d2            xor     edx,edx
00007ff6`e99012ba ff15781d0000    call    qword ptr [D4C+0x3038 (00007ff6`e9903038)]
00007ff6`e99012c0 488bd8          mov     rbx,rax
00007ff6`e99012c3 4885c0          test    rax,rax
00007ff6`e99012c6 7416            je      D4C+0x12de (00007ff6`e99012de)
00007ff6`e99012c8 33d2            xor     edx,edx
00007ff6`e99012ca 488bc8          mov     rcx,rax
00007ff6`e99012cd ff155d1d0000    call    qword ptr [D4C+0x3030 (00007ff6`e9903030)]
00007ff6`e99012d3 488bcb          mov     rcx,rbx
00007ff6`e99012d6 ff158c1d0000    call    qword ptr [D4C+0x3068 (00007ff6`e9903068)]
00007ff6`e99012dc eb11            jmp     D4C+0x12ef (00007ff6`e99012ef)
00007ff6`e99012de 488d54246c      lea     rdx,[rsp+6Ch]
00007ff6`e99012e3 488d0d26410000  lea     rcx,[D4C+0x5410 (00007ff6`e9905410)]
00007ff6`e99012ea e821fdffff      call    D4C+0x1010 (00007ff6`e9901010)
00007ff6`e99012ef 488d542440      lea     rdx,[rsp+40h]
00007ff6`e99012f4 488bcf          mov     rcx,rdi
00007ff6`e99012f7 ff15431d0000    call    qword ptr [D4C+0x3040 (00007ff6`e9903040)]
00007ff6`e99012fd 85c0            test    eax,eax

In the dump file I captured, the image base address of D4C.exe is 0x00007ff6e9900000, so the virtual address of offset (RVA) 0x12d3 is 0x00007ff6e99012d3.

As an aside, calculations such as address arithmetic can also be performed with expression evaluation using the ? command, as shown below.

# Identify the image base address of D4C.exe
6: kd> ? !D4C
Evaluate expression: 140698457210880 = 00007ff6`e9900000

# Calculate the address of offset 0x12d3 in D4C.exe
6: kd> ? !D4C+0x12d3
Evaluate expression: 140698457215699 = 00007ff6`e99012d3
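The same arithmetic can be reproduced outside the debugger. The short Python sketch below adds the RVA to the image base from this particular dump and formats the result the way WinDbg prints 64-bit values. The helper name to_windbg is hypothetical.

```python
def to_windbg(value):
    """Format a 64-bit value as WinDbg does: 8 hex digits, a backtick, 8 hex digits."""
    text = f"{value:016x}"
    return f"{text[:8]}`{text[8:]}"


IMAGE_BASE = 0x00007FF6E9900000  # image base of D4C.exe in this particular dump
RVA = 0x12D3                     # offset taken from the stack backtrace

va = IMAGE_BASE + RVA
print(va, "=", to_windbg(va))    # same value as "? !D4C+0x12d3"
print(hex(va - IMAGE_BASE))      # going back from a virtual address to the RVA
```

Because the image base changes from one run to the next, the RVA is the value to carry over when comparing against a disassembler or another dump.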

Looking at the disassembly above, we can see that the instruction immediately before offset 0x12d3, the return address recorded in the stack backtrace, is call qword ptr [D4C+0x3030 (00007ff6`e9903030)].

Also, because the frame directly above it in the stack backtrace is KERNELBASE!TerminateProcess, we can infer that this instruction is probably calling the TerminateProcess API.
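To see why the debugger prints D4C+0x3030 for this instruction, it helps to decode the bytes by hand. On x64, the byte sequence FF 15 followed by a 32-bit displacement encodes call qword ptr [rip+disp32], where rip is the address of the next instruction. The Python sketch below applies that rule to the bytes from the disassembly listing. The function name is hypothetical, and the sketch handles only this one 6-byte encoding.

```python
import struct


def decode_rip_relative_call(address, raw):
    """Decode FF 15 disp32 (call qword ptr [rip+disp32]).

    Returns (return_address, slot_address), where slot_address is the memory
    location that holds the pointer being called.
    """
    if len(raw) != 6 or raw[:2] != b"\xff\x15":
        raise ValueError("not a 6-byte FF 15 indirect call")
    (disp,) = struct.unpack("<i", raw[2:])  # signed 32-bit little-endian displacement
    next_ip = address + len(raw)            # also the return address pushed by call
    return next_ip, next_ip + disp


ret, slot = decode_rip_relative_call(0x00007FF6E99012CD, bytes.fromhex("ff155d1d0000"))
print(f"return address {ret:#x}, pointer slot {slot:#x}")
```

The return address comes out as 0x7ff6e99012d3 (D4C+0x12d3, the value recorded in frame 08) and the pointer slot as 0x7ff6e9903030 (D4C+0x3030), so the call goes through a pointer stored in the image rather than to a fixed address.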

So let’s confirm from the dump file whether this instruction really does call the TerminateProcess function.

Analyzing the IAT (Import Address Table) in WinDbg

As briefly introduced in Chapter 3 of this book, programs (.exe files) executed on Windows systems are generally created in PE file format.

When a program is executed on Windows, the system loads various pieces of information from the PE file header and expands them into the process’s allocated memory space.

One of those pieces of information is called the IAT (Import Address Table).

To determine whether the function called at D4C+0x3030 is TerminateProcess, let’s inspect this IAT information.

First, use WinDbg to collect the header information of D4C.exe as it was expanded in memory from this dump file.

In WinDbg, the !dh extension lets you inspect the header information of a specific image.

However, when using the !dh extension to inspect the header information of a specific PE file from a full memory dump, you first need to switch the context to the process running that PE file with the .process /r /P <process object (EPROCESS) address> command.

After changing the process context to D4C.exe, run the !dh -f !D4C command to retrieve the information.

6: kd> !dh -f !D4C

{{ omitted }}

   0 [       0] address [size] of Export Directory
5B40 [      C8] address [size] of Import Directory
9000 [     1E8] address [size] of Resource Directory
8000 [     1D4] address [size] of Exception Directory
   0 [       0] address [size] of Security Directory
A000 [      3C] address [size] of Base Relocation Directory
5610 [      70] address [size] of Debug Directory
   0 [       0] address [size] of Description Directory
   0 [       0] address [size] of Special Directory
   0 [       0] address [size] of Thread Storage Directory
54D0 [     140] address [size] of Load Configuration Directory
   0 [       0] address [size] of Bound Import Directory
3000 [     248] address [size] of Import Address Table Directory
   0 [       0] address [size] of Delay Import Directory
   0 [       0] address [size] of COR20 Header Directory
   0 [       0] address [size] of Reserved Directory

Because what we want to inspect here is the IAT, focus on the Import Address Table Directory line in the output above.

That line tells us that the IAT offset is 0x3000 and its size is 0x248.
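For reference, the value that !dh prints on this line comes from the data directory array in the PE optional header, where the IAT is entry 12. The Python sketch below reads that entry from the raw bytes of a PE header. It is a minimal illustration: the function names are hypothetical, the header it parses is a synthetic one built only for the demonstration, and it does not check the NumberOfRvaAndSizes field.

```python
import struct

IMAGE_DIRECTORY_ENTRY_IAT = 12  # index of the Import Address Table directory


def read_iat_directory(image):
    """Return (rva, size) of the IAT directory from the bytes of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)  # file offset of the PE signature
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional = e_lfanew + 4 + 20  # skip the signature and the 20-byte COFF file header
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic == 0x20B:    # PE32+ (64-bit image)
        directories = optional + 112
    elif magic == 0x10B:  # PE32 (32-bit image)
        directories = optional + 96
    else:
        raise ValueError("unknown optional header magic")
    # Each data directory entry is an 8-byte (RVA, size) pair.
    return struct.unpack_from("<II", image, directories + 8 * IMAGE_DIRECTORY_ENTRY_IAT)


def build_minimal_header(iat_rva, iat_size):
    """Build a synthetic PE32+ header holding only the fields read above (illustration only)."""
    e_lfanew = 0x80
    image = bytearray(e_lfanew + 4 + 20 + 112 + 16 * 8)
    image[:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, e_lfanew)
    image[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    optional = e_lfanew + 24
    struct.pack_into("<H", image, optional, 0x20B)
    struct.pack_into("<II", image, optional + 112 + 8 * IMAGE_DIRECTORY_ENTRY_IAT,
                     iat_rva, iat_size)
    return bytes(image)


rva, size = read_iat_directory(build_minimal_header(0x3000, 0x248))
print(f"IAT RVA={rva:#x} size={size:#x} slots={size // 8}")
```

On a 64-bit image each IAT slot is 8 bytes, so a size of 0x248 corresponds to 73 slots, including the null entries that separate the imports of each DLL.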

Next, use the dps command, which displays the contents of memory in a specified range as pointer-sized values and resolves each value against the symbol table, to resolve the IAT symbols.

Doing so resolves the entries of the IAT as loaded in process memory and confirms that the slot at virtual address 0x00007ff6e9903030, in other words D4C+0x3030, holds the address of KERNEL32!TerminateProcessStub.

# dps !<module_name>+<IAT address> !<module_name>+<IAT address>+<IAT size>

6: kd> dps !D4C+0x3000 !D4C+0x3000+0x248
00007ff6`e9903000  00007ff8`7ab57880 ADVAPI32!AdjustTokenPrivilegesStub
00007ff6`e9903008  00007ff8`7ab56920 ADVAPI32!OpenProcessTokenStub
00007ff6`e9903010  00007ff8`7ab4f970 ADVAPI32!LookupPrivilegeValueW
00007ff6`e9903018  00000000`00000000
00007ff6`e9903020  00007ff8`7a495f00 KERNEL32!GetLastErrorStub
00007ff6`e9903028  00007ff8`7a4a4ba0 KERNEL32!GetCurrentProcessId
00007ff6`e9903030  00007ff8`7a4a0a70 KERNEL32!TerminateProcessStub
00007ff6`e9903038  00007ff8`7a49b0f0 KERNEL32!OpenProcessStub
00007ff6`e9903040  00007ff8`7a4a2740 KERNEL32!Process32NextW
00007ff6`e9903048  00007ff8`7a4a29a0 KERNEL32!Process32FirstW
{{ omitted }}
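If you save the dps output to a text file, looking up a particular slot can be automated. The Python sketch below turns the lines shown above into a table keyed by slot address and looks up D4C+0x3030. The function name is hypothetical, and the parsing assumes the three-column layout printed here.

```python
def parse_dps(dps_output):
    """Map each slot address to (pointer value, symbol name or None)."""
    table = {}
    for line in dps_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            slot = int(parts[0].replace("`", ""), 16)
            value = int(parts[1].replace("`", ""), 16)
        except ValueError:
            continue  # not a data line
        table[slot] = (value, parts[2] if len(parts) > 2 else None)
    return table


DPS_OUTPUT = """00007ff6`e9903000  00007ff8`7ab57880 ADVAPI32!AdjustTokenPrivilegesStub
00007ff6`e9903008  00007ff8`7ab56920 ADVAPI32!OpenProcessTokenStub
00007ff6`e9903010  00007ff8`7ab4f970 ADVAPI32!LookupPrivilegeValueW
00007ff6`e9903018  00000000`00000000
00007ff6`e9903020  00007ff8`7a495f00 KERNEL32!GetLastErrorStub
00007ff6`e9903028  00007ff8`7a4a4ba0 KERNEL32!GetCurrentProcessId
00007ff6`e9903030  00007ff8`7a4a0a70 KERNEL32!TerminateProcessStub
00007ff6`e9903038  00007ff8`7a49b0f0 KERNEL32!OpenProcessStub
00007ff6`e9903040  00007ff8`7a4a2740 KERNEL32!Process32NextW
00007ff6`e9903048  00007ff8`7a4a29a0 KERNEL32!Process32FirstW"""

IMAGE_BASE = 0x00007FF6E9900000  # image base of D4C.exe in this particular dump
IAT_RVA = 0x3000

table = parse_dps(DPS_OUTPUT)
slot = IMAGE_BASE + 0x3030
value, symbol = table[slot]
print(f"slot index {(slot - IMAGE_BASE - IAT_RVA) // 8}: {symbol} at {value:#x}")
```

The slot at D4C+0x3030 is index 6 in the table (counting from 0), and its value resolves to KERNEL32!TerminateProcessStub.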

Also, even without going out of your way to analyze the IAT in WinDbg like this, you can easily identify the function at D4C+0x3030 by using the automatic analysis features of a decompiler such as Ghidra.

Ghidra disassembly result

The entry at offset 0x3030 is analyzed automatically as well, making it easy to confirm that the function called here is TerminateProcess.

Ghidra disassembly result

In this way, rather than analyzing only with WinDbg, there are cases where analysis proceeds more efficiently by using multiple powerful tools such as decompilers and approaching the problem from different angles.

With this, the dump file analysis has established that the system crash occurred because D4C.exe called the TerminateProcess API and terminated svchost.exe.

Chapter 5 summary

That concludes the analysis of this simple system crash dump.

As with the application crash dump analyzed in Chapter 4, the direct cause (exception) of the BSOD itself can be identified relatively easily from the dump file.

Of course, if you want to investigate the program behavior that caused that exception, it will rarely be identified as easily as it was in this chapter.

In such cases, analyzing the dump file with WinDbg is only one option. Tracing the environment while reproducing the problem with tools such as Process Monitor or a packet capture, debugging offline with source code or decompiler output, or live debugging may help you find the cause more efficiently.