In order to use the .kdfiles command to replace a boot driver on the target for NT kernel debugging, it is necessary to use the ntldr "boot debugger", BD. This is because ntldr, not the NT kernel, is responsible for reading these drivers off of disk and loading them into memory: by the time the ordinary kernel debugging connection is established, it is far too late to do anything about the boot drivers.
Ordinarily, you are supposed to do this with a special version of ntldr that automatically waits for a kernel debugging connection every time you start it. This version is included in the Driver Development Kit under the name ntldr_dbg. Unfortunately, each DDK only includes the ntldr_dbg for one version of Windows; there's no telling what might happen if, say, you try to use the Windows Server 2003 SP1 ntldr_dbg with XP SP2. (At least, I didn't want to risk trying it. Maybe it would work fine.)
However, it turns out that there is another way: at least in the checked build of ntldr, it is possible to enable the boot debugger using an undocumented section in boot.ini. For example, I just started it with a boot.ini that begins:
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[debug]
/debug /debugport=COM1 /baudrate=115200 /debugstop
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="XP with Serial Debugging" /fastdetect /debug /debugport=COM1 /baudrate=115200
Warning: these flags will make it impossible to boot without attaching a kernel debugger to your system. It is therefore vital that you:
- Have access to a second system on which to run the kernel debugger
- Have a null-modem cable
- Remember to remove the option when you're done, and while you still have the second system handy!
If you end up with a system that doesn't boot, don't say I didn't warn you... but you could probably fix things by using e.g. a copy of Knoppix to take out the offending lines of boot.ini.
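If you do end up doing that kind of rescue, the edit amounts to dropping the [debug] section and everything under it. Here is a minimal Python sketch of that cleanup. The function name is my own invention, and it assumes boot.ini uses CRLF line endings and that section headers sit on their own lines in square brackets. It deliberately leaves the [operating systems] entries alone.

```python
def strip_debug_section(boot_ini_text: str) -> str:
    """Return boot.ini text with the [debug] section and its lines removed.

    Hypothetical helper: assumes section headers look like "[name]" on
    their own line, as in the example boot.ini above.
    """
    kept = []
    in_debug = False
    for line in boot_ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header: note whether we are entering [debug].
            in_debug = stripped.lower() == "[debug]"
        if not in_debug:
            kept.append(line)
    # Assumption: boot.ini is written back with CRLF line endings.
    return "\r\n".join(kept) + "\r\n"
```

To use it, read the file from wherever the Windows partition is mounted, pass the text through strip_debug_section, and write the result back. Keep a backup copy of the original boot.ini first.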
The /debugstop flag breaks into the debugger immediately after it is initialized, allowing you to single-step through the rest of the boot process.
Note that in order to actually debug the bootloader, you will probably need to do something like the following after extracting the embedded PE image (internally called osloader.exe) from your ntldr:
kd> .readmem C:\code\deadweight-syms\ntldr.chk.pe 0x400000 L0x1000
Reading 1000 bytes..
kd> .imgscan /l /r 00400000
MZ at 00400000 - size 81000
Name: osloader.EXE
Loaded osloader.EXE module
The value 0x400000 in these commands is the address at which the PE image got loaded; in my case, osloader.exe was a non-relocatable image with the usual image base address of 0x400000. I needed to run these commands because the ntldr I tried this on, the one from the checked build of Windows XP SP2, (a) did not have the PE header in memory at the time it broke into the debugger and (b) did not tell the debugger about its base address and so on. There was a call to DbgLoadImageSymbols immediately prior to the breakpoint, but the DbgLoadImageSymbols in question did not actually do anything. I'm guessing that things are different if you actually use the version distributed as ntldr_dbg.
Anyway, once you do the above dance, you should get symbols loaded just fine, assuming you've got the MS Symbol Server in your symbol path.
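For the extraction step itself, here is a rough Python sketch of one way to locate the embedded PE image inside ntldr and read its preferred base address. This is not necessarily how I extracted mine. It assumes the image is stored uncompressed, starts at an "MZ" signature whose e_lfanew field points at a "PE\0\0" signature, and is a 32-bit (PE32) image. The function names are made up for illustration.

```python
import struct


def find_embedded_pe(blob: bytes) -> int:
    """Return the offset of the first plausible PE image in blob, or -1."""
    pos = blob.find(b"MZ")
    while pos != -1:
        # e_lfanew (offset of the PE header, relative to the MZ header)
        # lives at offset 0x3C of the DOS header.
        if pos + 0x40 <= len(blob):
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            pe = pos + e_lfanew
            if blob[pe:pe + 4] == b"PE\0\0":
                return pos
        # Not a real image header; keep scanning for the next "MZ".
        pos = blob.find(b"MZ", pos + 1)
    return -1


def image_base(blob: bytes, mz_offset: int) -> int:
    """Read ImageBase from the PE32 optional header of the image found."""
    (e_lfanew,) = struct.unpack_from("<I", blob, mz_offset + 0x3C)
    # Skip the 4-byte "PE\0\0" signature and the 20-byte COFF file header.
    opt = mz_offset + e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", blob, opt)
    if magic != 0x10B:
        raise ValueError("not a PE32 optional header")
    # ImageBase sits 28 bytes into the PE32 optional header.
    (base,) = struct.unpack_from("<I", blob, opt + 28)
    return base
```

To use it, read ntldr into a bytes object, call find_embedded_pe, and write everything from the returned offset onward to a new file, which is what you would then feed to .readmem. If image_base reports something other than 0x400000, use that value in the debugger commands instead.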