Friday, July 30, 2010

Ethos KGDB

So one of the many things I am working on in my spare time is adding Ethos kernel debugging support via the gdb debugger. The gdb remote debugging protocol is quite simple, which makes initial bringup straightforward and less error-prone. The protocol is all ASCII, with values encoded in hexadecimal and commands being simple characters. The commands are a pretty simple lot: reading/modifying memory and registers. Compare this to the KD protocol, which is binary and expects NT kernel structures to be available for reading by the debugger core, which makes an implementation pretty verbose if you want to debug targets other than ntoskrnl through WinDbg/KD. I suppose the difference is the result of a debugger growing around a specific OS and need, rather than around generic target debugging support.

Of course, once the basic CPU state manipulation works in Ethos, kernel-specific commands will need to be added to handle address space switches, for example, or physical address space reads. In either case, all Ethos-specific support will be versioned and unnecessary for basic kernel panic debugging, which is the primary motivator for remote debugging in the first place...
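To give a feel for how small the wire format is, here is a rough sketch of gdb remote protocol packet framing, written in Python for brevity rather than as kernel code. A packet is `$`, the payload, `#`, and a two-digit hex checksum (the sum of the payload bytes modulo 256). The function names are made up for illustration, and the sketch leaves out acknowledgements, escaping of special characters, and run-length encoding.

```python
def checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def frame_packet(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<two hex digits>."""
    return b"$" + payload + b"#" + b"%02x" % checksum(payload)


def parse_packet(packet: bytes) -> bytes:
    """Validate framing and checksum, then return the bare payload."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if int(packet[-2:], 16) != checksum(payload):
        raise ValueError("checksum mismatch")
    return payload


def encode_memory(data: bytes) -> bytes:
    """Memory and register contents travel as ASCII hex, two digits per byte."""
    return data.hex().encode("ascii")


if __name__ == "__main__":
    print(frame_packet(b"g"))        # 'g' asks for the general registers
    print(frame_packet(b"m1000,4"))  # 'm addr,length' asks for a memory read
    print(parse_packet(frame_packet(encode_memory(b"\xde\xad"))))
```

Since everything is printable ASCII, you can read a capture of the traffic by eye, which helps a lot when the stub itself is the thing being brought up.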

So Ethos is a kernel that runs in a Xen paravirtualized virtual machine. That means that whereas on real hardware you could have a serial UART, 1394 adapter, or Ethernet, within a paravirtualized machine you only have the virtual devices exposed by Dom0. Obviously, the debugging support code path should be pretty small and should not tie into too many components in the debugged kernel, otherwise you lose the ability to step through a fair amount of code. The only virtual device that can be accessed almost immediately after bootup with minimal pain is the console device. All other virtual device initialization relies on being able to access XenBus, a bus abstraction that virtual devices use to communicate between virtual machines and negotiate configuration. XenBus/XenStore is implemented at a high level as a namespace with file/directory access semantics, and lets Xen domains exchange small bits of information.

Disregarding the obvious question of why Xen doesn't expose a couple of hypercalls as a debugger channel, like the Hyper-V hypervisor does, we're left trying to figure out which communication channel has the least impact. Using the console is the simplest and lets you debug when nothing else is even working yet (like when you're trying to bring up on x64 ;-)). On the other hand, if you're interested in debugging a live non-crashed kernel, then you might not wish to clobber the kernel console, which is used for kernel messages. The approach I am taking is abstracting the packet transport from the actual debugging engine, and implementing both a console transport and a XenBus transport, where a particular key is used to exchange packets between debugger and kernel. The console transport will be used for early boot debugging and kernel panics.
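The split between transport and engine could look roughly like the sketch below. It is in Python for brevity and is not the Ethos code. All class and method names are hypothetical, and an in-memory queue stands in for the console or XenBus transport. The engine only ever sees whole packet payloads, so swapping the medium does not touch the command handling.

```python
from collections import deque


class PacketTransport:
    """Hypothetical interface: moves whole packet payloads, hides the medium."""

    def recv_packet(self) -> bytes:
        raise NotImplementedError

    def send_packet(self, payload: bytes) -> None:
        raise NotImplementedError


class QueueTransport(PacketTransport):
    """In-memory stand-in for a console or XenBus transport."""

    def __init__(self):
        self.to_kernel = deque()
        self.to_debugger = deque()

    def recv_packet(self) -> bytes:
        return self.to_kernel.popleft()

    def send_packet(self, payload: bytes) -> None:
        self.to_debugger.append(payload)


class DebugEngine:
    """Handles a tiny subset of gdb commands, independent of the transport."""

    def __init__(self, transport: PacketTransport, memory: bytes):
        self.transport = transport
        self.memory = memory  # toy flat memory image, for illustration only

    def handle_one(self) -> None:
        cmd = self.transport.recv_packet()
        if cmd == b"?":
            reply = b"S05"  # stopped with signal 5 (SIGTRAP)
        elif cmd.startswith(b"m"):
            addr, length = (int(x, 16) for x in cmd[1:].split(b","))
            if addr + length > len(self.memory):
                reply = b"E01"  # error reply for an out-of-range read
            else:
                reply = self.memory[addr:addr + length].hex().encode("ascii")
        else:
            reply = b""  # empty reply means "command not supported"
        self.transport.send_packet(reply)


if __name__ == "__main__":
    transport = QueueTransport()
    engine = DebugEngine(transport, bytes([0xDE, 0xAD, 0xBE, 0xEF]))
    transport.to_kernel.append(b"m1,2")
    engine.handle_one()
    print(transport.to_debugger.popleft())
```

A console transport and a XenBus transport would each implement the same two methods, which keeps the amount of kernel code the debugger depends on small.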
