Tuesday, October 03, 2006 12:51 AM bart

A beginner's guide to CorDbg

Introduction

To some people, debugging has a bad connotation, and in many cases I can imagine why. Lots of people find it frustrating to chase down a bug, especially when they lack source code access (production debugging). However, debugging can be fun too. And although tools like Visual Studio 2005 offer a nice debugging experience, you can go much further by using more low-level tools. This isn't just useful for debugging; it's also a great way to explore the system and get to know it better.

In this post we'll focus on cordbg.exe, a debugger that ships with the .NET Framework SDK and the more recent Windows SDK (I'll use the latter). The goal of the post isn't to tackle a real bug-chasing scenario, but rather to explain some of the commands that cordbg.exe offers and to whet the reader's appetite for low-level debugging.

Why not mdbg? First of all, the basics are the same. Secondly, mdbg (the "managed debugger") was written in the 2.0 timeframe and still has slightly fewer features than cordbg. Although it has other interesting commands, like fo[reach] and uwgc[handle] to name two, I'll stick with cordbg for now. I'll likely write about mdbg too sometime in the future, so stay tuned.

Official documentation on cordbg can be found on MSDN.

A debug victim

Before we can start debugging, we need a debugging target. For the sake of the demo, we'll just create a very simple application:

using System;

class Dbg
{
   public static void Main()
   {
      Do();
   }

   static void Do()
   {
      int i = 6;
      int j = 2;
      int k = Div(i, j);
      Console.WriteLine(k);
   }

   static int Div(int i, int j)
   {
      return i / j;
   }
}

Compile to dbg.exe, but make sure to generate the symbols:

C:\temp>csc /debug+ dbg.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.112
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.


C:\temp>dir dbg*.*
 Volume in drive C is Windows Vista
 Volume Serial Number is C4FB-B1E7

 Directory of C:\temp

26/09/2006  00:52               302 dbg.cs
26/09/2006  00:55             3.584 dbg.exe
26/09/2006  00:55            13.824 dbg.pdb
               3 File(s)         17.710 bytes
               0 Dir(s)  14.003.908.608 bytes free

You're now ready to enter the wonderful world of cordbg. Fasten your seatbelts!

Bug hunting

Start CorDbg

Right, time to start cordbg. This should bring up the following:

Microsoft (R) Common Language Runtime Test Debugger Shell Version 2.0.50727.42 (
RTM.050727-4200)
Copyright (C) Microsoft Corporation. All rights reserved.

(cordbg)

You're now looking at the cordbg debugging prompt.

Launch an application

The next step is to launch the application we want to debug. We can do this by using the r command:

(cordbg) r dbg.exe
Process 9988/0x2704 created.
Warning: couldn't load symbols for C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__
b77a5c561934e089\mscorlib.dll
[thread 0x2a10] Thread created.

006:     {

Cool, we're in business. The line you're seeing at the bottom is an actual source code line. In this case, line 6 contains a single curly brace:

using System;

class Dbg
{
   public static void Main()
   {

There are other options to start a debugging session as well. For example, you could just run cordbg dbg.exe to get the same result. If you want to start debugging a running process, you can attach to the process like this:

(cordbg) a 9336
Process 9336/0x2478 created.
Warning: couldn't load symbols for C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__
b77a5c561934e089\mscorlib.dll
[thread 0x2664] Thread created.

a stands for attach and 9336 is the PID of the process to be debugged. An associated command is attachn to attach by process name. Another interesting one is pro to show all processes that are running managed code:

(cordbg) pro
PID=0x2e24 (11812)  Name=C:\temp\dbg.exe, version='v2.0.50727'
        ID=1  AppDomainName=dbg.exe

PID=0x28dc (10460)  Name=C:\Program Files\Microsoft SDKs\Windows\v6.0\Bin\cordbg
.exe, version='v2.0.50727'
        ID=1  AppDomainName=cordbg.exe

PID=0x2a1c (10780)  Name=c:\Program Files\FSharp-1.1.12.5\bin\fsi.exe, version='
v2.0.50727'
        ID=1  AppDomainName=fsi.exe

PID=0x220c (8716)  Name=C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\d
evenv.exe, version='v2.0.50727'
        ID=1  AppDomainName=DefaultDomain

I won't cover these commands in more detail right now.

Viewing code

We have the dbg.cs file available and the dbg.pdb symbol file, so our debugger knows more than just some addresses. For example, we can view the code we're currently at:

(cordbg) sh
001: using System;
002:
003: class Dbg
004: {
005:     public static void Main()
006:*    {
007:         Do();
008:     }
009:
010:     static void Do()
011:     {

Notice the asterisk (*) to indicate the current line of execution.

Appdomains, threads, ...

Just as an example of diagnostic information, it can be of interest to list the appdomains and threads that are currently alive. This is done with the ap and t commands:

(cordbg) ap

1) * AppDomainName = <dbg.exe>
        DebugStatus: <Debugger Attached >
        ID: 1
        Assembly Name : C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c56193
4e089\mscorlib.dll
                Module Name : C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5
c561934e089\mscorlib.dll
        Assembly Name : C:\temp\dbg.exe
                Module Name : C:\temp\dbg.exe
(cordbg) t
*Thread 0x195c R  at Dbg::Main +0014[native] +0000[IL] in c:\temp\dbg.cs:6

There's quite some interesting information in here, such as the loaded modules and the current location of each thread. In our case this is not of much interest, but using commands like su and re you can suspend and resume a given thread.

Disassembly

Managed or not, processors require machine code. You can see the disassembly of the current method by using dis:

(cordbg) dis
Function Dbg.Main (code starts at 0x7c9cc8).
Offsets are relative to function start.
 [0000] cmp         dword ptr ds:[001F2DE4h],0
 [0007] je          00000007
 [0009] call        798DBA7D
*[000e] nop
 [000f] call        dword ptr ds:[001F3054h]
 [0015] nop
 [0016] nop
 [0017] ret

Notice the asterisk once more, and note the nop instructions that show up in code compiled for debugging. The next call that's going to happen is an indirect call through the pointer stored at address 0x001F3054, which leads to Dbg.Do.

Quiz: What could be the purpose of the call on line [0009]?

Breakpoints

Debugging without breakpoints would be rather awkward. So, let's set a breakpoint to break on Do:

(cordbg) b Dbg::Do
Breakpoint #1 has bound to C:\temp\dbg.exe.
#1      C:\temp\dbg.exe!Dbg::Do:0       Do+0x0(il) [active]

Using the b command you can set and view breakpoints. You can also set breakpoints on lines, e.g. line 20 (the return statement in method Div):

(cordbg) b 20
Breakpoint #2 has bound to C:\temp\dbg.exe.
#2      c:\temp\dbg.cs:20       Div+0x1(il) [active]

And last but not least, you can set a breakpoint at an offset within a method, e.g. offset 3 in Dbg::Do (which the debugger reports as the IL offset Do+0x3):

(cordbg) b Dbg::Do:3
Breakpoint #3 has bound to C:\temp\dbg.exe.
#3      C:\temp\dbg.exe!Dbg::Do:3       Do+0x3(il) [active]

Go!

To run the process till a breakpoint is hit, use g:

(cordbg) g
break at #1     C:\temp\dbg.exe!Dbg::Do:0       Do+0x0(il) [active]

011:     {
(cordbg) g
break at #3     C:\temp\dbg.exe!Dbg::Do:3       Do+0x3(il) [active]

013:         int j = 2;

Of course, sh can come to the rescue (if you have source code available!) to find out exactly where we are in the source:

(cordbg) sh
001: using System;
002:
003: class Dbg
004: {
005:     public static void Main()
006:     {
007:         Do();
008:     }
009:
010:     static void Do()
011:     {
012:         int i = 6;
013:*        int j = 2;
014:         int k = Div(i, j);
015:         Console.WriteLine(k);
016:     }
017:
018:     static int Div(int i, int j)
019:     {
020:         return i / j;
021:     }
022: }

The stack trace

Where the hell are we on the current execution path? A common question with a nearby answer. w[here]:

(cordbg) w
Thread 0x27bc Current State:Normal
0)* Dbg::Do +0030[native] +0003[IL] in c:\temp\dbg.cs:13
1)  Dbg::Main +0021[native] +0006[IL] in c:\temp\dbg.cs:7

Notice that file names and line numbers are available. This mapping comes from the debugging symbols (dbg.pdb), while the .cs file supplies the source text itself. Without symbols, only the friendly method names would remain.

Function evaluation

Right, there is some Div method being called according to the source dump above. What about finding out what the function does by calling it? Enter f:

(cordbg) f Dbg::Div 10 3
break at #2     c:\temp\dbg.cs:20       Div+0x1(il) [active]

020:         return i / j;

Ow, right, we had a breakpoint on line 20, didn't we? Just ask the debugger to continue:

(cordbg) g
Function evaluation complete.
$result=3

013:         int j = 2;

And we're back in reality, where we were before evaluation of the function.

Step by step

Let's continue execution by one line: s will do the trick:

(cordbg) s

014:         int k = Div(i, j);

Other commands to step through code include ss, to step to the next native or IL instruction (a C# line can, and often does, map to more than one IL instruction, each of which in turn maps to one or more native instructions), and so, to step over (e.g. to bypass a method call).

Viewing and modifying state

Hmm, what values do our locals have right now? Again simple: p does the trick:

(cordbg) p
i=6
j=2
k=0

So, what if we'd change a variable while debugging? No problem at all using set:

(cordbg) set j 0
j=0

Diving deeper: the registers

Time to resume execution, so enter g, which brings us to the breakpoint on line 20 (inside Div):

(cordbg) g
break at #2     c:\temp\dbg.cs:20       Div+0x1(il) [active]

020:         return i / j;

Interested to see the native code? Recall dis:

(cordbg) dis
Function Dbg.Div (code starts at 0xe59d40).
Offsets are relative to function start.
 [0007] cmp         dword ptr ds:[00722DE4h],0
 [000e] je          00000007
 [0010] call        7924B9FE
 [0015] xor         edi,edi
 [0017] nop
*[0018] mov         eax,ebx
 [001a] cdq
 [001b] idiv        eax,esi
 [001d] mov         edi,eax
 [001f] nop
 [0020] jmp         00000002

This code can teach you a bit about calling conventions too. Let's take a look at the registers:

(cordbg) reg
Thread 0x27bc:
EIP = 00e59d58 ESP = 0015ed20 EBP = 00000000 EAX = 00723000 ECX = 00000006
EDX = 00000000 EBX = 00000006 ESI = 00000000 EDI = 00000000
ST0 = -1.#IND ST1 = -1.#IND ST2 = -1.#IND ST3 = -1.#IND ST4 = -1.#IND
ST5 = -1.#IND ST6 = -1.#IND ST7 = -1.#IND
EFL = 0246 CS = 001b DS = 0023 ES = 0023 FS = 003b
GS = 0000 SS = 0023 CY = 0 PE = 1 AC = 0
ZR = 1 PL = 0 EI = 1 UP = 0 OV = 0

Dr0 = 00000000 Dr1 = 00000000 Dr2 = 00000000
Dr3 = 00000000 Dr6 = 00000000 Dr7 = 00000000
ControlWord = ffff027f StatusWord = ffff0120 TagWord = ffffffff
ErrorOffset = 79e8aaa7 ErrorSelector = 06d9001b DataOffset = 0015e208
DataSelector = ffff0023 Cr0NpxState = 00000000

The EIP register holds the current instruction pointer. The dis output reported that the code for the current method starts at 0xe59d40, and the asterisk in that listing showed we're breaking at offset 0x18. Adding the two gives the EIP value: 0xe59d40 + 0x18 = 0xe59d58.
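The relation between the function start address, the bracketed offsets in the dis listing and EIP can be sketched in a few lines of C#. This is only an illustration: the class and method names are made up, and the numbers are copied from the dis output above (offsets 0x18, 0x1a and 0x1b are the mov, cdq and idiv instructions).

```csharp
using System;

static class EipCheck
{
    // Absolute address = function start + the offset dis shows in brackets.
    public static uint AddressOf(uint functionStart, uint offset)
    {
        return functionStart + offset;
    }

    static void Main()
    {
        uint start = 0xe59d40;                  // "code starts at" from dis
        uint[] offsets = { 0x18, 0x1a, 0x1b };  // mov, cdq, idiv
        foreach (uint off in offsets)
        {
            Console.WriteLine("0x" + AddressOf(start, off).ToString("x8"));
        }
        // prints 0x00e59d58, 0x00e59d5a and 0x00e59d5b, one per line
    }
}
```

The first value matches the EIP shown by reg; the other two are where EIP should end up after each subsequent single step.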

I won't explain every register in detail, but just watch what happens if we execute one native instruction at a time using ss:

(cordbg) ss
[001a] cdq
(cordbg) reg
Thread 0x27bc:
EIP = 00e59d5a ESP = 0015ed20 EBP = 00000000 EAX = 00000006 ECX = 00000006
EDX = 00000000 EBX = 00000006 ESI = 00000000 EDI = 00000000
ST0 = -1.#IND ST1 = -1.#IND ST2 = -1.#IND ST3 = -1.#IND ST4 = -1.#IND
ST5 = -1.#IND ST6 = -1.#IND ST7 = -1.#IND
EFL = 0246 CS = 001b DS = 0023 ES = 0023 FS = 003b
GS = 0000 SS = 0023 CY = 0 PE = 1 AC = 0
ZR = 1 PL = 0 EI = 1 UP = 0 OV = 0

Dr0 = 00000000 Dr1 = 00000000 Dr2 = 00000000
Dr3 = 00000000 Dr6 = 00000000 Dr7 = 00000000
ControlWord = ffff027f StatusWord = ffff0120 TagWord = ffffffff
ErrorOffset = 79e8aaa7 ErrorSelector = 06d9001b DataOffset = 0015e208
DataSelector = ffff0023 Cr0NpxState = 00000000

The mov we just executed copied EBX into EAX, which now holds 00000006. Next up is cdq (Convert Doubleword to Quadword), which sign-extends the value in EAX to twice its size, spanning EDX and EAX. As EDX was already 00000000 and EAX is positive, you won't see a real effect. So, ss once more:

(cordbg) ss
[001b] idiv        eax,esi
(cordbg) reg
Thread 0x27bc:
EIP = 00e59d5b ESP = 0015ed20 EBP = 00000000 EAX = 00000006 ECX = 00000006
EDX = 00000000 EBX = 00000006 ESI = 00000000 EDI = 00000000
ST0 = -1.#IND ST1 = -1.#IND ST2 = -1.#IND ST3 = -1.#IND ST4 = -1.#IND
ST5 = -1.#IND ST6 = -1.#IND ST7 = -1.#IND
EFL = 0246 CS = 001b DS = 0023 ES = 0023 FS = 003b
GS = 0000 SS = 0023 CY = 0 PE = 1 AC = 0
ZR = 1 PL = 0 EI = 1 UP = 0 OV = 0

Dr0 = 00000000 Dr1 = 00000000 Dr2 = 00000000
Dr3 = 00000000 Dr6 = 00000000 Dr7 = 00000000
ControlWord = ffff027f StatusWord = ffff0120 TagWord = ffffffff
ErrorOffset = 79e8aaa7 ErrorSelector = 06d9001b DataOffset = 0015e208
DataSelector = ffff0023 Cr0NpxState = 00000000
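What cdq and idiv do with these registers can be modeled in C#. The sketch below is a simplified model, not what the JIT or the processor literally does: the names are invented for illustration, and it ignores the quotient-overflow fault a real idiv can raise. It does show why EDX stays 00000000 for a positive EAX, and what the EDX:EAX pair looks like for a negative value.

```csharp
using System;

static class CdqIdiv
{
    // cdq: sign-extend EAX into the pair EDX:EAX. Returns the new EDX.
    public static uint Cdq(uint eax)
    {
        return unchecked((int)eax) < 0 ? 0xFFFFFFFFu : 0u;
    }

    // idiv (simplified): divide the signed 64-bit value EDX:EAX by divisor.
    // The quotient is what would land in EAX, the remainder in EDX.
    public static int Idiv(uint edx, uint eax, int divisor, out int remainder)
    {
        long dividend = unchecked((long)(((ulong)edx << 32) | eax));
        remainder = (int)(dividend % divisor);
        return (int)(dividend / divisor);
    }

    static void Main()
    {
        uint eax = 6;                                // i, copied from EBX
        uint edx = Cdq(eax);
        Console.WriteLine(edx.ToString("X8"));       // prints 00000000

        int rem;
        Console.WriteLine(Idiv(edx, eax, 2, out rem)); // prints 3

        // -6 as a 32-bit pattern: cdq fills EDX with ones.
        Console.WriteLine(Cdq(0xFFFFFFFAu).ToString("X8")); // prints FFFFFFFF
    }
}
```

Passing 0 as the divisor makes the C# division throw a DivideByZeroException, which mirrors what is about to happen in the debugging session now that ESI (j) is 0.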

Now we'll execute the division. Hmm, prepared to see some fire? Division by zero you know...

(cordbg) ss
First chance exception generated: (0x156210c) <System.DivideByZeroException>
Unhandled exception generated: (0x156210c) <System.DivideByZeroException>
  _className=<null>
  _exceptionMethod=<null>
  _exceptionMethodString=<null>
  _message=(0x1575bb0) "Attempted to divide by zero."
  _data=<null>
  _innerException=<null>
  _helpURL=<null>
  _stackTrace=(0x1575cbc) <System.SByte[]>
  _stackTraceString=<null>
  _remoteStackTraceString=<null>
  _remoteStackIndex=0
  _dynamicMethods=<null>
  _HResult=-2147352558
  _source=<null>
  _xptrs=1435884
  _xcode=-1073741676
Exception is called:UNHANDLED
[001b] idiv        eax,esi
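The _HResult and _xcode fields in this dump are printed as signed decimals, which hides their more recognizable hexadecimal form. A minimal sketch (the class and method names are made up for illustration) that reinterprets them as unsigned 32-bit values:

```csharp
using System;

static class ErrorCodes
{
    // Reinterprets a signed 32-bit value as unsigned and formats it as hex.
    public static string ToHex(int value)
    {
        return "0x" + unchecked((uint)value).ToString("X8");
    }

    static void Main()
    {
        Console.WriteLine(ToHex(-2147352558)); // _HResult, prints 0x80020012
        Console.WriteLine(ToHex(-1073741676)); // _xcode, prints 0xC0000094
    }
}
```

0x80020012 is the HRESULT that .NET associates with DivideByZeroException (COR_E_DIVIDEBYZERO), and 0xC0000094 is the Windows exception code for an integer division by zero.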

And even deeper: the memory

This is the ideal moment to peek into memory using a direct address. How about looking at the bytes of the message string, whose address (0x1575bb0) appears in the _message field above? du[mp] will do:

(cordbg) du 0x1575bb0 10
0x1575bb0: 00749310 0000001d 0000001c 00740041 00650074
0x1575bc4: 0070006d 00650074 00200064 006f0074 00640020

Can you see the string? I can. Here's how. Start at address 0x1575bbc, i.e. the group 00740041. (The three groups before it are the string object's bookkeeping data; notice that 0000001c is 28, the length of the message.) Time to know your ASCII: 0x74 is 't' and 0x41 is 'A'. Strings in .NET are stored as UTF-16, which explains the 00 bytes in between. The x86 processor is little endian, thus we read each group in reverse order: "At". The next group, 00650074, translates to "te", and so on.
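The manual decoding can also be automated. The following sketch (the names are invented for illustration) takes the groups from the du output, starting at 0x1575bbc, and assumes each 32-bit group packs two UTF-16 characters with the lower-addressed character in the low 16 bits, as is the case on a little-endian machine:

```csharp
using System;
using System.Text;

static class DumpDecoder
{
    // Turns 32-bit dump values into text, two UTF-16 characters per value.
    public static string Decode(uint[] dwords)
    {
        StringBuilder sb = new StringBuilder();
        foreach (uint d in dwords)
        {
            sb.Append((char)(d & 0xFFFF)); // character at the lower address
            sb.Append((char)(d >> 16));    // character at the higher address
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Values copied from the du output, starting at 0x1575bbc.
        uint[] dump = { 0x00740041, 0x00650074, 0x0070006d, 0x00650074,
                        0x00200064, 0x006f0074, 0x00640020 };
        Console.WriteLine(Decode(dump)); // prints "Attempted to d"
    }
}
```

Dumping more groups (a larger count argument to du) would reveal the rest of the message.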

Now that we're in the middle of an exception, it might be interesting (depending on how geeky you are) to inspect the registers once more. However, let's just bail out by trying to go to the next statement (n):

(cordbg) n

Unhandled Exception: System.DivideByZeroException: Attempted to divide by zero.
   at Dbg.Div(Int32 i, Int32 j) in c:\temp\dbg.cs:line 20
   at Dbg.Do() in c:\temp\dbg.cs:line 14
   at Dbg.Main() in c:\temp\dbg.cs:line 7
Process exited.

We're dead now indeed.

A few other useful commands

Use u (up) and d (down) to "navigate" on the stack:

(cordbg) u

007:         Do();
(cordbg) w
Thread 0x13a8 Current State:Normal
0)  Dbg::Do +0035[native] +0005[IL] in c:\temp\dbg.cs:14
1)* Dbg::Main +0021[native] +0006[IL] in c:\temp\dbg.cs:7
(cordbg) d

014:         int k = Div(i, j);
(cordbg) w
Thread 0x13a8 Current State:Normal
0)* Dbg::Do +0035[native] +0005[IL] in c:\temp\dbg.cs:14
1)  Dbg::Main +0021[native] +0006[IL] in c:\temp\dbg.cs:7

Take a look at the asterisk indicating the stack position.

Conclusion

I hope to have shown you the other, largely unknown, side of debugging in an interesting hands-on fashion. Getting to know the commands used in this "virtual debugging session" is key to being successful with cordbg. In a later post I might dive deeper and open up WinDbg with the SOS extensions, but time will tell.

In the meantime, happy cordbging!
