In case anyone got frustrated at repo bailing at git errors for a particular project...here you go! Hopefully this gets merged in.
Friday, July 30, 2010
Ethos KGDB
So one of the many things I am working on in my spare time is adding Ethos kernel debugging support via the gdb debugger. The gdb remote debugging protocol is quite simple, which makes initial bringup straightforward and less error prone. The protocol is all ASCII, with values encoded in hexadecimal and commands being simple characters. The commands are a pretty simple lot - reading/modifying memory and registers. Compare this to the KD protocol, which is binary and expects NT kernel structures to be available for reading by the debugger core, making it pretty cumbersome to implement for debugging targets other than ntoskrnl through WinDbg/KD. I suppose the difference is the result of a debugger growing around a specific OS and need, rather than just around generic target debugging support. Of course, once the basic CPU state manipulation works in Ethos, kernel-specific commands will need to be added, to handle address space switches, for example, or physical address space reads. In either case, all Ethos-specific support will be versioned and unnecessary for the basic kernel panic debugging, which is the primary motivator for remote debugging in the first place...
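To give a feel for how little there is to the framing, here is a minimal sketch in C (not the Ethos code). A gdb remote protocol packet is `$`, the payload, `#`, and a two-digit hex checksum that is the sum of the payload bytes modulo 256. The names `rsp_checksum` and `rsp_frame` are made up for illustration, and escaping of special characters in the payload is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sum of the payload bytes modulo 256, as used by the gdb remote protocol. */
static uint8_t rsp_checksum(const char *payload, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + (uint8_t)payload[i]);
    return sum;
}

/*
 * Frame a NUL-terminated payload as "$<payload>#<two hex digits>".
 * Returns the framed length, or -1 if out is too small.
 * Hypothetical helper: no escaping of '$', '#' or '}' is done here.
 */
static int rsp_frame(const char *payload, char *out, size_t out_size)
{
    size_t len = strlen(payload);

    /* '$' + payload + '#' + two hex digits + terminating NUL */
    if (out_size < len + 5)
        return -1;

    return snprintf(out, out_size, "$%s#%02x", payload,
                    (unsigned)rsp_checksum(payload, len));
}
```

With this, the "read registers" command `g` goes on the wire as `$g#67`, and an `OK` reply comes back as `$OK#9a`.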
So Ethos is a kernel that runs in a Xen paravirtualized virtual machine. That means that whereas on real hardware you could have a serial UART, 1394 adapter, or Ethernet, within a paravirtualized machine you only have virtual devices exposed by Dom0. Obviously, the debugging support code path should be pretty small and should not tie into too many components in the debugged kernel, otherwise you suddenly lose the ability to step through a fair amount of code. The only virtual device that can be accessed almost immediately after bootup with minimal pain is the console device. All other virtual device initialization relies on being able to access XenBus, a bus abstraction that virtual devices use to communicate between virtual machines and to negotiate configuration. XenBus/XenStore is implemented at a high level as a namespace with file/directory access semantics, and lets Xen domains exchange small bits of information.
Disregarding the obvious question of why Xen doesn't expose a couple of hypercalls as a debugger channel, like the Hyper-V hypervisor does, we're left trying to see which is the minimum-impact communication channel. Using the console is the simplest and lets you debug when nothing else is even working yet (like when you're trying to bring up on x64 ;-)). On the other hand, if you're interested in debugging a live non-crashed kernel, then you might not wish to clobber the kernel console, which is used for kernel messages. The approach I am taking is to abstract the packet transport from the actual debugging engine, and to implement both a console transport and a XenBus transport, where a particular XenStore key is used to exchange packets between debugger and kernel. The console transport will be used for early boot debugging and kernel panics.
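As a rough sketch of what that separation could look like in C (this is an illustration, not the Ethos interface): the debugging engine talks to a small table of function pointers, and each transport fills it in. The names `kgdb_transport`, `kgdb_reply` and the loopback backend are all hypothetical; a real console or XenBus backend would replace the loopback one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical transport interface: the engine only ever sees these hooks. */
struct kgdb_transport {
    const char *name;
    int (*send)(void *ctx, const char *buf, size_t len);
    int (*recv)(void *ctx, char *buf, size_t max);
    void *ctx;
};

/* Loopback stand-in for a real console or XenBus backend. */
struct loopback {
    char data[128];
    size_t len;
};

/* Store the bytes so a later recv can hand them back. */
static int loopback_send(void *ctx, const char *buf, size_t len)
{
    struct loopback *lb = ctx;

    if (len > sizeof lb->data)
        return -1;
    memcpy(lb->data, buf, len);
    lb->len = len;
    return (int)len;
}

/* Return up to max stored bytes and mark the buffer as drained. */
static int loopback_recv(void *ctx, char *buf, size_t max)
{
    struct loopback *lb = ctx;
    size_t n = lb->len < max ? lb->len : max;

    memcpy(buf, lb->data, n);
    lb->len = 0;
    return (int)n;
}

/* Engine-side helper: send an already framed packet on the active transport. */
static int kgdb_reply(struct kgdb_transport *t, const char *packet)
{
    return t->send(t->ctx, packet, strlen(packet));
}
```

The point of the indirection is that the engine never needs to know whether a packet left through the console ring or a XenStore key, so switching from the early-boot console transport to the XenBus one is just a matter of swapping the table.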
Thursday, July 29, 2010
Ethos
Two years after I last touched the Ethos kernel, I am back at hacking away at what is now referred to as the cardboard Ethos, while we are looking ahead to developing the next incarnation in a safer programming language than C. Right now, that language might be Go.
What is Ethos?
For the moment, I am the architecture and memory management code owner. That effectively means cleaning up the implementation and architecture of code dealing with physical and virtual address space manipulation, scheduling and process management, and other low-lying bits. It took a bit of time to bring myself up to speed with code that I had written just a few years ago (which is kinda sad...), but I place the blame squarely on not having done it cleanly enough the first time (not that that was or is a priority at this stage in the project ;-)). Aside from cleaning up the tree and fixing bugs, I am adding kernel gdb debugging over Ethernet, and hopefully porting Ethos to x64 if work and personal life allow ;-).
Today I've added a stack unwinder / backtrace that can handle stepping over the interrupt context, so that panic logs are more useful than seeing a backtrace up to the exception handler frame.
Now, on a BUG() occurring within, say, the timer interrupt path, you would get something like the following -
Terminating Ethos Kernel
Backtrace:
[0xc00236f0] timer_handler + 21
[0xc000726f] xen_event_handle + ec
[0xc0007b0f] do_hypervisor_callback + a7
[0xc00030a6] hypervisor_callback + 35
-----> Next frame returning from interupt context is kernel space <-----
[0xc0005cc1] memset + 24
[0xc0010e4c] elfLoad + 390
[0xc001221c] scheduleInit + 214
[0xc00156a2] start_kernel + 1dd
[0xc000000e] stack_start + 0
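For the curious, the general shape of such an unwinder can be sketched in C as below. This is not the Ethos code: it assumes x86-style frame records (saved frame pointer followed by return address), uses a made-up symbol table, and leaves out the interesting part, which is recognizing the interrupt frame and continuing from the register state saved there. The names `backtrace_walk`, `sym_lookup` and `format_frame` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed frame record layout: previous frame pointer, then return address. */
struct frame {
    const struct frame *prev;
    uintptr_t ret;
};

/* Follow the frame chain, collecting return addresses until NULL or max. */
static size_t backtrace_walk(const struct frame *fp, uintptr_t *out, size_t max)
{
    size_t n = 0;

    while (fp != NULL && n < max) {
        out[n++] = fp->ret;
        fp = fp->prev;
    }
    return n;
}

/* Hypothetical symbol table entry: start address and name. */
struct symbol {
    uintptr_t start;
    const char *name;
};

/* Find the symbol with the highest start address that is <= addr. */
static const struct symbol *sym_lookup(const struct symbol *tab, size_t n,
                                       uintptr_t addr)
{
    const struct symbol *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (tab[i].start <= addr && (best == NULL || tab[i].start > best->start))
            best = &tab[i];
    return best;
}

/* Format one line in the "[0xADDR] name + hexoffset" style shown above. */
static int format_frame(char *buf, size_t size, const struct symbol *tab,
                        size_t n, uintptr_t addr)
{
    const struct symbol *s = sym_lookup(tab, n, addr);

    if (s == NULL)
        return snprintf(buf, size, "[0x%08lx] ???", (unsigned long)addr);
    return snprintf(buf, size, "[0x%08lx] %s + %lx", (unsigned long)addr,
                    s->name, (unsigned long)(addr - s->start));
}
```

The walker stops at a NULL link, so whatever sets up the first frame (stack_start in the log above) has to terminate the chain for this to work.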
Wednesday, July 28, 2010
ARM11 Technical Documentation
It's excruciatingly frustrating to navigate the ARM website. There are a bajillion similar-sounding documents each containing minor or chip revision-specific minutiae, when all you are really trying to find is a generic overview of architecture for system programming.
In case anyone needs it - ARM11 MPCore™ Processor Technical Reference Manual
Let's just hope the ARM website isn't like MSDN, which recycles links for no apparent good reason more often than you would think is reasonable...
Tuesday, July 27, 2010
Virtualizing ARM with TrustZone
Newer ARM processors come with security extensions called TrustZone. TrustZone is designed to enable a secure environment for software. Effectively, the TrustZone extensions "split" an ARM processor into two domains of operation - a secure and a non-secure one. From the point of view of code that is not TrustZone-aware, the two domains are generally identical. Each domain has 7 modes of operation (usr, sys, svc, irq, fiq, abt, und), and the secure domain also has an 8th mode - mon, which is meant for secure monitor code. Which domain the CPU is executing in is controlled by the NS bit in the Secure Configuration Register (CP15 register c1). Most of the system control registers are banked, thus code in the secure and non-secure domains is more-or-less unaware of the other. The 32-bit physical address space is further extended by a bit, the secure bit, creating two separate physical address spaces, enabling memory-mapped secure devices as well as access to secure-only memory. With appropriate hardware support, one can route specific interrupts to the secure domain, or carve out a "secure" area of RAM. The usage scenario for TrustZone is creating a secure nucleus which can be used as a boot-time root of trust and as a foundation for security in a system. The code running in secure mode effectively owns the hardware in the system, and has access to both the secure and non-secure domains. The non-secure domain is largely unaware of the secure domain and in a well-designed system cannot access any secure resources.
An idea which immediately comes to mind is that the TrustZone extensions could be used to simplify virtualization - after all, they enable running code in a "virtual" ARM processor. The secure mode would be used for the hypervisor, while the non-secure domain would be used for virtual machines. In practice, however, TrustZone in its current implementation was never really meant for generic virtualization.
The first problem which immediately comes to mind is physical address space protection. Most SoCs containing TrustZone support that I have seen so far generally have a facility for "carving out" a region of RAM, making it visible in the secure physical address space while hiding it from the non-secure address space. This allows the secure code and data to be inaccessible from non-secure mode. However, if you have several VMs running, you will not be able to protect the physical address space of each from the other VMs. Worse, for existing designs all hardware is accessible via the non-secure domain, so there is no way to prevent a VM from messing with the physical state. Additionally, if you're basing your hypervisor on top of an existing kernel like Linux, defining the secure region in terms of base and length is fairly difficult, unless you hide half (or whatever) of RAM using something like the mem= boot parameter. All of these physical protection issues are not CPU issues and are solvable with a custom memory controller, which effectively will be an additional MMU, with physical->machine address tables describing the secure physical address space and the non-secure physical address space. However, unless you are designing a new SoC and a device around it, you have to live with no physical address space protection between OSes. Effectively, that means that the non-secure domain has to run your own code, which will ensure that physical memory belonging to the secure domain or other OSes is not trampled - i.e. paravirtualization.
The other 90% of the iceberg does end up being an ARM issue. See, the secure monitor mode is meant for a secure monitor, which facilitates switching between secure and non-secure domains on a "secure monitor call" or interrupt. Code operating in the mon mode can access the banked R13/R14/SPSR registers in any secure mode. This is done by allowing secure code to transition to mon from any other mode. Now, code operating in the mon mode can toggle the NS bit to access non-secure versions of the control registers, but you can't access the non-secure banked R13 (stack), banked R14 (link) and banked SPSR registers. Not without resorting to large-overhead hacks (more on this later). So if you wanted to use TrustZone non-secure mode for VMs, you wouldn't be (easily) able to switch those while scheduling. Of course when coupled with the physical address space protection issue, this issue might just nudge you towards porting, say, the Xen Hypervisor to run as the non-secure OS and taking it from there :-).
Using TrustZone to run Xen side-by-side with, say, a Linux, Symbian or NT has definite advantages - no need to modify the "host" OS other than loading a driver implementing a TrustZone monitor, and pretty transparent and fast switches to the Xen hypervisor and thus other OSes. This provides a solution for devices where replacing the bundled OS or booting a third-party kernel is not an option... which as far as ARM devices go nowadays, is almost all of them.