For my day job, I frequently look at reports that come out of WinQual from Microsoft. These reports contain crash dumps that I can use to determine what’s going wrong with the software I’ve been working on. All in all, it’s a fantastic system and I highly recommend that any ISV sign up (especially considering that it’s free for anyone with properly signed executables). Recently, I received a crash dump which had a corrupted stack crawl. I want to cover how you can use WinDbg to reconstruct a stack crawl like that.
To start with, let’s see what the original stack looks like. To do this, we use the “k” command in WinDbg.
    0:000> k
    ChildEBP RetAddr
    028b89cc 77c75350 ntdll!KiFastSystemCallRet
    028b89d0 77c4b208 ntdll!ZwTerminateProcess+0xc
    028b89e0 763e41ec ntdll!RtlExitUserProcess+0x7a
    028b89f4 10056386 kernel32!ExitProcess+0x12
    WARNING: Stack unwind information not available. Following frames may be wrong.
    028b89fc 100565a0 EyeOneIO!I1_SynchronizeWhitebases+0xf0f6
    028b8a0c 10054803 EyeOneIO!I1_SynchronizeWhitebases+0xf310
    00000000 00000000 EyeOneIO!I1_SynchronizeWhitebases+0xd573
There are a few tell-tale signs that this stack is corrupted. For starters, the base of the stack never starts at 0x00000000. Usually, it starts in a call to your executable’s main entrypoint or a thread entrypoint. But we see there’s no evidence of that on this stack. Also, the crawl tells you that something is amiss with the “Following frames may be wrong.” warning.
The first step to reconstructing the stack is to look at the currently executing thread’s environment block (the TEB) to determine where the stack starts and stops. We do this using the !teb extension command, and receive:
    0:000> !teb
    TEB at 7ffdb000
    ExceptionList: 028b8a28
    StackBase: 028c0000
    StackLimit: 028b6000
    SubSystemTib: 00000000
    FiberData: 00001e00
    ArbitraryUserPointer: 00000000
    Self: 7ffdb000
    EnvironmentPointer: 00000000
    ClientId: 00000a4c . 00000e3c
    RpcHandle: 00000000
    Tls Storage: 7ffdb02c
    PEB Address: 7ffdf000
    LastErrorValue: 14007
    LastStatusValue: c0150008
    Count Owned Locks: 0
    HardErrorMode: 0
The stack base and stack limit tell us what memory range is valid for our call stack, so now we can dump that memory range to look for items of interest. WinDbg has a handy command just for this, dds, which dumps the range of memory given and attempts to resolve each DWORD (which is pointer-sized on this 32-bit target) as a symbol. The resulting dump contains three columns of data: the memory location being viewed, the pointer-sized value at that memory location, and the symbol’s name if the value could be resolved as a symbol.
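To make the range concrete, here is a minimal Python sketch that pulls StackBase and StackLimit out of pasted !teb text and builds the matching dds command. The function names are hypothetical helpers of my own, and the sketch assumes 32-bit addresses and the "Name: value" field layout shown above.

```python
import re


def stack_range_from_teb(teb_text):
    """Return (stack_limit, stack_base) parsed from text printed by !teb.

    Assumes each field appears as "Name: hexvalue", as in the output above.
    """
    fields = {}
    for name in ("StackBase", "StackLimit"):
        match = re.search(name + r":\s+([0-9a-fA-F]+)", teb_text)
        if match is None:
            raise ValueError(f"{name} not found in !teb output")
        fields[name] = int(match.group(1), 16)
    return fields["StackLimit"], fields["StackBase"]


def build_dds_command(stack_limit, stack_base):
    """Format a dds command covering the thread's whole stack range."""
    return f"dds {stack_limit:08x} {stack_base:08x}"


# Values copied from the !teb output in this post.
teb_output = """
ExceptionList: 028b8a28
StackBase: 028c0000
StackLimit: 028b6000
"""

limit, base = stack_range_from_teb(teb_output)
print(build_dds_command(limit, base))  # prints: dds 028b6000 028c0000
```

The lower address (StackLimit) goes first because the stack grows downward, so the oldest frames sit just below StackBase.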
My stack dump looks like this (with boring pieces elided):
    028b6000 00000000
    ...
    028bf9d8 00000000
    028bf9dc 00000000
    028bf9e0 79035b7f
    028bf9e4 028bfa1c
    028bf9e8 6e760b5b i1IO!i1IO::measureOneStrip+0xbb
    028bf9ec 42b840fc
    ...
    028bfa18 00000000
    028bfa1c 028bfd98
    028bfa20 6e763387 i1IO!i1IO::_measureSingleRowScanThreaded+0x1467
    028bfa24 42b840fc
    ...
    028bfd94 00000006
    028bfd98 028bfe2c
    028bfd9c 6e761062 i1IO!i1IO::_advancedMeasureThreaded+0x222
    028bfda0 013a8520
    028bfda4 79035e2e
    ...
    028bfe28 00000000
    028bfe2c 028bfe38
    028bfe30 763ed0e9 kernel32!BaseThreadInitThunk+0xe
    028bfe34 012118e0
    028bfe38 028bfe78
    028bfe3c 77c516c3 ntdll!__RtlUserThreadStart+0x23
    028bfe40 012118e0
    ...
    028bfe74 00000000
    028bfe78 028bfe90
    028bfe7c 77c51696 ntdll!_RtlUserThreadStart+0x1b
    028bfe80 6e760e40 i1IO!i1IO::_advancedMeasureThreaded
    ...
    028c0000 ????????
The actual stack dump is considerably larger than this, but I’ve kept only the relevant pieces for brevity.
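If you save a dump like this to a text file, the three columns are easy to pull apart programmatically. The following Python sketch is only an illustration: StackSlot and parse_dds are names I made up, and the pattern assumes 8-digit (32-bit) hex values. Lines that do not fit that shape, such as the "..." elisions and the unreadable "????????" entry, are skipped.

```python
import re
from typing import List, NamedTuple, Optional


class StackSlot(NamedTuple):
    address: int            # first column: memory location
    value: int              # second column: DWORD stored there
    symbol: Optional[str]   # third column: resolved symbol, if any


# address, value, then an optional symbol name
_DDS_LINE = re.compile(r"^([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})(?:\s+(.+))?$")


def parse_dds(text: str) -> List[StackSlot]:
    """Parse dds output into slots, ignoring lines that are not data rows."""
    slots = []
    for line in text.splitlines():
        match = _DDS_LINE.match(line.strip())
        if match:
            address, value, symbol = match.groups()
            slots.append(StackSlot(int(address, 16), int(value, 16), symbol))
    return slots


# A few rows copied from the dump above.
sample = """\
028bfe28 00000000
028bfe2c 028bfe38
028bfe30 763ed0e9 kernel32!BaseThreadInitThunk+0xe
...
028c0000 ????????
"""

for slot in parse_dds(sample):
    print(f"{slot.address:08x} {slot.value:08x} {slot.symbol or ''}")
```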
The first thing you should do is locate something on the stack that looks like it should be fairly near the start of the stack. In this case, RtlUserThreadStart is a great place to begin from, since it should be the base of the thread’s stack. Once you’ve found your starting point, take the value from the first column (the memory location) and search for it. You should find it as a value in the second column further up the stack, at a lower address. Take the memory location of that match and search for it in turn. Continue to search the stack in this way until your search comes back empty.
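The search loop is mechanical enough to sketch in Python. This is my own illustration, not a WinDbg feature: walk_frame_chain is a hypothetical helper, the dictionary maps memory locations to the values stored there, and when several locations hold the same value it picks the nearest lower address, which is an assumption on my part.

```python
def walk_frame_chain(memory, start):
    """Follow saved frame pointers from `start` toward lower addresses.

    `memory` maps address -> stored value. At each step we look for a
    location, below the current one, whose stored value equals the
    current address. Returns every address visited, oldest frame first.
    """
    chain = [start]
    current = start
    while True:
        candidates = [
            address
            for address, value in memory.items()
            if value == current and address < current
        ]
        if not candidates:
            return chain  # the search came back empty
        current = max(candidates)  # assumption: nearest match wins
        chain.append(current)


# Frame-pointer slots and a few neighbours copied from the dump in this post.
memory = {
    0x028bf9e0: 0x79035b7f,
    0x028bf9e4: 0x028bfa1c,
    0x028bf9e8: 0x6e760b5b,
    0x028bfa1c: 0x028bfd98,
    0x028bfd98: 0x028bfe2c,
    0x028bfe2c: 0x028bfe38,
    0x028bfe38: 0x028bfe78,
    0x028bfe78: 0x028bfe90,
}

chain = walk_frame_chain(memory, 0x028bfe78)
print(" -> ".join(f"{address:08x}" for address in chain))
# prints: 028bfe78 -> 028bfe38 -> 028bfe2c -> 028bfd98 -> 028bfa1c -> 028bf9e4
```

Because every step moves to a strictly lower address, the loop always terminates, and the last address in the chain is the one to hand back to the debugger.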
From our example, we start here:
    028bfe78 028bfe90
    028bfe7c 77c51696 ntdll!_RtlUserThreadStart+0x1b

Searching for 028bfe78 brings us to:

    028bfe38 028bfe78
    028bfe3c 77c516c3 ntdll!__RtlUserThreadStart+0x23

Searching for 028bfe38 brings us to:

    028bfe2c 028bfe38
    028bfe30 763ed0e9 kernel32!BaseThreadInitThunk+0xe

Searching for 028bfe2c brings us to:

    028bfd98 028bfe2c
    028bfd9c 6e761062 i1IO!i1IO::_advancedMeasureThreaded+0x222

Searching for 028bfd98 brings us to:

    028bfa1c 028bfd98
    028bfa20 6e763387 i1IO!i1IO::_measureSingleRowScanThreaded+0x1467

Searching for 028bfa1c brings us to:

    028bf9e4 028bfa1c
    028bf9e8 6e760b5b i1IO!i1IO::measureOneStrip+0xbb
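Then, searching for 028bf9e4 brings us nowhere. So we’ve reached a point where we can try to have WinDbg fix our stack for us by giving it a hint where to look. Passing the last value we searched for as the base pointer, using the = syntax of the k command, should give us better results. (The L in the command below only suppresses source line information; it is the = that supplies the base pointer.)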
    0:000> k L=028bf9e4
    ChildEBP RetAddr
    028b89cc 77c75350 ntdll!KiFastSystemCallRet
    028b89d0 77c4b208 ntdll!ZwTerminateProcess+0xc
    028bf9e4 6e760b5b ntdll!RtlExitUserProcess+0x7a
    028bfa1c 6e763387 i1IO!i1IO::measureOneStrip+0xbb
    028bfd98 6e761062 i1IO!i1IO::_measureSingleRowScanThreaded+0x1467
    028bfe2c 763ed0e9 i1IO!i1IO::_advancedMeasureThreaded+0x222
    028bfe38 77c516c3 kernel32!BaseThreadInitThunk+0xe
    028bfe78 77c51696 ntdll!__RtlUserThreadStart+0x23
    028bfe90 00000000 ntdll!_RtlUserThreadStart+0x1b
Now that looks like a much healthier stack crawl! No warnings about incorrect frames, and the base of the call stack looks sane. Hopefully you can use this trick at some point in your own debugging scenarios.