Sunday, December 7, 2014

iQUIK supports the Performa 6400

After fixing some sad bugs from the last refactoring binge and adding support for OF 2.0, the 6400 (and likely all "Alchemy"-based Macs) can be booted via iQUIK.

Just like OpenFirmware 2.0.1, 2.0 seems to suffer from the "shallow setprop" bug, which results in bogus values for the /chosen/linux,initrd-start and /chosen/linux,initrd-end properties.

Friday, November 28, 2014

Using the Nexus 9 secure agent for debug logging


import fileinput, re, sys

# It turns out the "Trusty Secure OS" Crippleware on the Nexus 9 is
# good for at least something. It is thankfully pretty chatty, meaning
# you can use it for logging from code where it's inconvenient
# or impossible to write to the UART directly, like MMU bringup code ;-).
# A sequence like:
#   mov x0, #'V'
#   smc #0xFFFF
# ...will result in the following getting emitted. I am guessing x1...x3
# get printed here as param0..2 but I am too lazy to check.
# smc_undefined:67: Undefined monitor call!
# smc_undefined:67: SMC: 0x56 (Stdcall entity 0 function 0x56)
# smc_undefined:67: param0: 0xf77c2e69
# smc_undefined:67: param1: 0xf77c2e68
# smc_undefined:67: param2: 0x0
# Now you can do basic logging to debug early bring-up. The following
# Python will turn your giant Minicom capture into something more
# sensible.

def process(line):
    # Each undefined SMC logs its function ID; treat that ID as one character.
    m = re.match(r'\s*smc_undefined:67: SMC: (0x[0-9a-f]+)', line)
    if m:
        sys.stdout.write(chr(int(m.group(1), 16)))

for line in fileinput.input():
    process(line)

Sunday, November 23, 2014

64-bit ARM OS/Kernel/Systems Development Demo on a Nexus 9

The Nexus 9 is based on a 64-bit nVidia K1 chip. At the moment it is the most affordable (price-wise) and accessible (unit-wise) platform for exploring OS work on AArch64. The Nexus 9 can be unlocked via "fastboot oem unlock", which allows custom Android images to be booted.

What this is

This is a small demo showing how to build and boot arbitrary code on your Nexus 9 and do some basic I/O. It performs serial I/O and draws two black diagonal lines on the framebuffer.

What you need - required

What you need - optional

How it works

HBOOT, the Nexus bootloader, expects images to be in a certain format. The booted kernel/code must:
  • Be 64-bit
  • Be binary (not ELF)
  • Be linked at 0x80080000
  • Be compressed using "gzip"
  • Be followed by the binary FDT
  • Be contained in an "ANDROID!" boot image.

Some notes:

  • The link address appears to be hardcoded in HBOOT. The Android boot image bases and the AArch64 kernel header fields appear to be ignored.
  • The boot image can contain an additional ramdisk/initrd/payload.
  • The FDT is patched by HBOOT to contain correct linux,initrd-start and linux,initrd-end addresses.
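
As a rough illustration of the layout described above, the kernel-plus-FDT payload could be assembled like this in Python. This is only a sketch, not the demo's actual build step: `make_payload` and its arguments are hypothetical names, and wrapping the result in an "ANDROID!" boot image is left to a separate tool.

```python
import gzip
import struct

FDT_MAGIC = 0xD00DFEED  # big-endian magic at the start of a flattened device tree


def make_payload(kernel_binary: bytes, fdt_blob: bytes) -> bytes:
    """Return the gzip-compressed raw kernel followed by the binary FDT.

    Hypothetical helper: it only checks the two properties that are cheap
    to verify (not ELF, FDT magic present) and concatenates the pieces.
    """
    if kernel_binary[:4] == b"\x7fELF":
        raise ValueError("kernel must be a raw binary, not ELF")
    if len(fdt_blob) < 4:
        raise ValueError("FDT blob is too short")
    (magic,) = struct.unpack(">I", fdt_blob[:4])
    if magic != FDT_MAGIC:
        raise ValueError("second blob does not look like a binary FDT")
    return gzip.compress(kernel_binary) + fdt_blob
```

The link address (0x80080000) is a property of how the kernel binary was linked, so it cannot be checked or fixed up at this stage.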

How to build

$ CROSS_COMPILE=aarch64-linux-gnu- make

How to boot

Connect your Android tablet via a USB cable. Optionally connect the UART headphone jack adapter to your computer. The settings are 115200 8-n-1.
$ adb reboot-bootloader
$ fastboot boot nexus9_demo

Actual output of the demo

CurrentEL = 0000000000000001
SCTLR_EL1 = 0000000010C5083A
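
To make the SCTLR_EL1 value easier to read, here is a small Python sketch that picks out a few well-known control bits. The bit positions are the ARMv8-A ones (M, A, C, SA, I); the helper name `decode_sctlr_el1` is made up for this example and only a handful of bits are decoded.

```python
# Selected SCTLR_EL1 bit positions (ARMv8-A).
SCTLR_EL1_BITS = {
    "M": 0,    # stage 1 MMU enable
    "A": 1,    # alignment check enable
    "C": 2,    # data cache enable
    "SA": 3,   # stack pointer alignment check
    "I": 12,   # instruction cache enable
}


def decode_sctlr_el1(value: int) -> dict:
    """Map each selected bit name to True (set) or False (clear)."""
    return {name: bool((value >> bit) & 1) for name, bit in SCTLR_EL1_BITS.items()}


flags = decode_sctlr_el1(0x10C5083A)
for name, enabled in flags.items():
    print(f"{name:>2} = {int(enabled)}")
```

For the value printed by the demo, M, C and I come out clear (MMU and caches off) while A and SA are set.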

Where to go from here

"nexus9_dts" is the decompiled "nexus9_dtb". "nexus9_dtb" was extracted from the Android boot.img.

Final thoughts

From studying the Tegra K1 TRM, the K1 should have virtualization support (i.e. EL2). However, the HTC firmware does not allow booting an EL2-enabled OS. All kernels are booted in EL1. This is rather unfortunate and prevents playing around with KVM and Xen on this platform. Perhaps there are some problems with EL2 support. Or perhaps HTC/nVidia/Google were too myopic to allow EL2 access. It's unclear if the "oem unlock" allows reflashing custom unsigned firmware. "nvtboot" seems to enforce signed "Trusted OS" payloads, at least from dumping the strings. The boot flow looks something like this:
  • "nvtboot" (32-bit) runs on the AVP/COP.
  • "nvtboot" loads "tos" (64-bit) (Trusty aka Secure OS) on the AArch64 chip.
  • "tos" loads HBOOT (32-bit).
  • HBOOT loads Android and implements the fastboot protocol.
It's unclear how to enter NVFlash/APX mode, or how helpful that would be.

Wednesday, June 25, 2014


Apparently back around 2006 there was an effort at Sun Labs to get OpenSolaris to work on CHRP(-like) PowerPC machines. And according to the documentation, the kernel could even boot to a shell on a G4 Apple.

That effort was called Polaris. It was difficult to find the CDDL-licensed sources, but I've made them available for everyone else to play with at

I haven't tried it out or done anything with the sources yet. The Solaris kernel is a pretty amazing piece of software, and a very portable and well-designed one to boot. I am glad Sun open-sourced it before folding, as it's code like this that should be influencing OS R&D for generations to come. It would be interesting to see the Polaris code being used as a base for an AArch64 investigation...


Tuesday, June 24, 2014

What's special about...

addi r0,r1,0x138
ori r0,r0,0x60

...and I/O port 0x92 :-)?

Sunday, June 22, 2014

iQUIK update

I now have a 1.5GHz PowerBook 12" in my possession to test iQUIK with. This is of course a NewWorld, and not a primary target for the iQUIK boot loader...

A couple of observations:
  • OpenFirmware 3.0 doesn't support partition zero booting (i.e. hd:0 or CHRP-spec hd:%BOOT). This means that iQUIK cannot be booted the same way as it boots on OldWorlds, but neither is it required. iQUIK can be booted on NewWorlds the same way as Yaboot, i.e. by placing 'iquik.elf' on an HFS+ partition and blessing it.
  • NewWorld OF requires appending ":0" for full-disk access to disk devices.
I've also fixed a bug in the partition code that truncated offsets to 32 bits, and improved device path handling and parsing.
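
For readers who haven't hit this class of bug, here is a generic Python illustration of what truncating a byte offset to 32 bits does. It is not the iQUIK code; the function names and the 512-byte sector size are assumptions made for the example.

```python
SECTOR_SIZE = 512          # assumed block size for the example
U32_MASK = 0xFFFFFFFF


def byte_offset_32(start_block: int) -> int:
    """Byte offset computed with a 32-bit intermediate (the buggy form)."""
    return (start_block * SECTOR_SIZE) & U32_MASK


def byte_offset_64(start_block: int) -> int:
    """Byte offset kept in 64 bits."""
    return (start_block * SECTOR_SIZE) & 0xFFFFFFFFFFFFFFFF


start_block = 0x900000  # a partition starting 4.5 GiB into the disk
print(hex(byte_offset_32(start_block)))
print(hex(byte_offset_64(start_block)))
```

Here the truncated offset comes out as 0x20000000 (512 MiB) instead of 0x120000000, so reads would quietly land in the wrong place on any disk larger than 4 GiB.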

In short, though, it works. And it works quite well. So iQUIK now works on OldWorld and NewWorld machines. Yaboot - only on NewWorlds. Of course, Yaboot also supports CHRP machines, network booting and reads all filesystems supported by the underlying OpenFirmware implementation. So there's plenty of work to reach feature parity in that regard.


Tuesday, June 3, 2014

Detecting 'make' environment variables change

While playing with 'iquik' and trying to add a mode that builds a smaller, reduced-logging version, I ran into an interesting question: how do I force a clean rebuild of everything when the build options change?
# Example of a Makefile that detects "environment change".
# I.e.:
# andreiw-lnx:~/src/ make clean
# Cleaning
# andreiw-lnx:~/src/ make 
# Resuming build with env ""
# Building with ""
# andreiw-lnx:~/src/ make CONFIG_EXAMPLE=1
# Cleaning due to env change (was "" now "-DCONFIG_EXAMPLE")
# Cleaning
# Building with "-DCONFIG_EXAMPLE"
# andreiw-lnx:~/src/ make CONFIG_EXAMPLE=1
# Resuming build with env "-DCONFIG_EXAMPLE"
# Building with "-DCONFIG_EXAMPLE"
# andreiw-lnx:~/src

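# NOTE: several lines of this Makefile were lost. The ENV_FILE name, the
# BUILD_FLAGS/BUILD_ENV/PRETARGET definitions and the "log_env" rule name
# are reconstructions chosen to match the sample output above.
ENV_FILE = .build_env
-include $(ENV_FILE)

# Environment definition.
ifeq ($(CONFIG_EXAMPLE), 1)
BUILD_FLAGS = -DCONFIG_EXAMPLE
endif
BUILD_ENV = "OLD_BUILD_FLAGS=$(BUILD_FLAGS)"

# Detect environment change.
ifneq ($(BUILD_FLAGS),$(OLD_BUILD_FLAGS))
PRETARGET = clean_env
else
PRETARGET = log_env
endif

all: $(PRETARGET) target

target:
	@echo Building with \"$(BUILD_FLAGS)\"

log_env:
	@echo Resuming build with env \"$(BUILD_FLAGS)\"

log_clean_env:
	@echo Cleaning due to env change \(was \"$(OLD_BUILD_FLAGS)\" now \"$(BUILD_FLAGS)\"\)

clean_env: log_clean_env clean
	@rm -f $(ENV_FILE)
	@echo $(BUILD_ENV) > $(ENV_FILE)

clean:
	@echo Cleaning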
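
The same idea, stripped of make syntax, can be sketched in a few lines of Python: record the flags used by the last build in a file, and clean first whenever the current flags differ. The function names and the list-of-steps return value are purely illustrative.

```python
def env_changed(env_file: str, build_flags: str) -> bool:
    """Return True if build_flags differ from those recorded in env_file.

    A missing file counts as an empty recorded environment, which mirrors
    what "-include" does for a file that does not exist yet.
    """
    try:
        with open(env_file) as f:
            old_flags = f.read().strip()
    except FileNotFoundError:
        old_flags = ""
    return old_flags != build_flags


def record_env(env_file: str, build_flags: str) -> None:
    """Remember the flags used for this build."""
    with open(env_file, "w") as f:
        f.write(build_flags + "\n")


def build(env_file: str, build_flags: str) -> list:
    """Return the steps a build would take: an optional clean, then a build."""
    steps = []
    if env_changed(env_file, build_flags):
        steps.append("clean")
        record_env(env_file, build_flags)
    steps.append("build")
    return steps
```

Calling `build(path, "")`, then `build(path, "-DCONFIG_EXAMPLE")` twice, follows the same pattern as the transcript in the Makefile comment: no clean, clean, no clean.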

Saturday, May 31, 2014

Musings on device workarounds and attribution

I was trading war stories with some colleagues today, and remembered the time I was chasing crazy UART bugs.

So I just had to go look at my battlefields of past and reminisce...

Ever look at a random driver and wonder how convoluted weird code gets written? Then you look at the git history and see - nothing useful. No history. It was apparently all written at once, by some crazy smart engineer based on thorough and clean specs, right ;-)?

Like the serial-tegra driver, for example. Ever wonder why the UART code waits for a certain amount of time after switching baud rate?

I used to work on the Moto Xoom tablet - the first official Android tablet, based around the Tegra 2 SoC. Once upon a time I was investigating a bug around suspend-resume. We were seeing a kernel crash when waking the tablet up occasionally with a paired Bluetooth keyboard. The actual crash was the result of a BlueZ bug that didn't defensively treat BT HCI connect events for devices that weren't disconnected (have a gander at - yes, a  rogue Bluetooth adapter /can/ crash the system, wonderful, right?)

But why weren't the BT disconnect messages coming through?

The tablet was asleep at the time of the disconnect, and the disconnect woke it up. The Bluetooth host was connected to the CPU via a UART, and the UART needed to be resumed before the BT host could send data. UART resume, among other things, needs to set the baud rate. What was happening is that the hardware flow control allowed RX before the baud rate change had fully propagated through the UART block. The result was that the received data was corrupted. Oops.

Knowing what was happening didn't mean I had a solution, of course. The docs were useless, and it took another fun half a week to figure out the solution :-). Too bad I can't remember what this fix was for... Probably more BT issues :).

So what point did I want to make? The Tegra HSUART driver "got rewritten" when Tegra 2/3 support was upstreamed. But it's the same code, basically, even down to the code flow and comments. You put in time, sleepless nights and life energy and you can't get basic attribution from some unknown dude at NV.

Behind every line of code is some story. Some feeling of exhilaration, success and victory. I almost made a t-shirt with the fix :-). So always attribute contributions out of solidarity with your fellow hackers. Heh.

BlueZ is a train wreck, though... There. I said it.

Friday, May 30, 2014


The first step to getting MkLinux to run is to get the build tools to run.

Build tools?

The OSF Open Development Environment tools. Which have been very hard to find ( But now you can find them and even build them -

If I ever find time I'll clean up the code so it doesn't build with a million warnings.


Saturday, April 12, 2014

Inline assembler stupidity

I keep getting caught by this, because this is a perfect example of the compiler doing something contrary to what you're writing.
  asm volatile (
                "ldr %0, [%1]\n\t"
                "add %0, %0, #1\n\t"
                "str %0, [%1]\n\t"
                : "=r" (tmp)
                : "r" (p)
                );

Guess what this gets compiled to?
  30: f9400000  ldr x0, [x0]
  34: 91000400  add x0, x0, #0x1
  38: f9000000  str x0, [x0]

...equivalent to, of course,
  asm volatile (
                "ldr %0, [%0]\n\t"
                "add %0, %0, #1\n\t"
                "str %0, [%0]\n\t"
                : "+r" (p)
                );
This sort of aggressive and non-obvious optimization is crazy, because if I really wanted that generated code, I'd have written the inline asm the second way, with a read-write modifier. Maybe for architectures with very few, specialized registers this is a reasonable approach, but for RISCish instruction sets with large register files this is nuts. There should be a warning option for this nonsense.

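The "correct way" is to use an earlyclobber modifier, which tells the compiler that the output is written before the inputs are consumed and so must not share a register with them.
  asm volatile (
                "ldr %0, [%1]\n\t"
                "add %0, %0, #1\n\t"
                "str %0, [%1]\n\t"
                : "=&r" (tmp)
                : "r" (p)
                );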
IMO anything that needs a separate paragraph in third-party documents as "a caveat" needs to be fixed.

Speaking of which... Given that C really is a high-level assembly, why not finally standardize on inline asm?

Wednesday, April 2, 2014

Exotic QEMU bugs and fixes

I found that the linux-user portion of QEMU has a few bugs around signals. Really, around handling "self-modifying" code and having the code generator step on unmapped memory.

The test is pretty simple. Have a page of memory containing one instruction that will cause SIGILL to be delivered, followed by a 'ret'. On a SIGILL, unmap the page. On a SIGSEGV, map the page back in. I have two of these tests: one with actual mmap/munmap, and another with mprotect. The tests verify corner conditions in the binary translation logic, with back-to-back signals and an attempt to execute unmapped code.

"self-modifying" code sounds grand, but it's just the signal return path. While newer Linux kernels use VDSO symbols for the restorer (that's the part that does the sigreturn syscall), QEMU still creates an on-the-stack trampoline. When QEMU creates a translation block for the trampoline, it marks the page internally as read-only so that it can detect when the TB should be invalidated. It is this latter logic which was short-circuiting and exiting earlier than needed.
That's fixed in

The second problem is that QEMU doesn't deal very well with being forced to run code that's unmapped. The TCG generator walks over the unmapped memory, gets a SIGSEGV, which attempts delivery of the signal to the translated program (which again, means getting and/or creating more TBs). The problem, though, is that we attempt to reacquire the tcg_ctx.tb_ctx.tb_lock, which we never dropped due to the signal. i.e. after a SIGSEGV here:
#0  disas_a64_insn (s=0x7fffffffdc40, env=<optimized out>) at /target-arm/translate-a64.c:8972
#1  gen_intermediate_code_internal_a64 (cpu=cpu@entry=0x62532200, tb=tb@entry=0x7ffff440b120, search_pc=search_pc@entry=false) at /target-arm/translate-a64.c:9097
#2  0x00000000600d76e5 in gen_intermediate_code_internal (search_pc=false, tb=0x7ffff440b120, cpu=0x62532200) at /target-arm/translate.c:10629
#3  gen_intermediate_code (env=env@entry=0x6253a468, tb=tb@entry=0x7ffff440b120) at /target-arm/translate.c:10904
#4  0x00000000600e4851 in cpu_arm_gen_code (env=env@entry=0x6253a468, tb=tb@entry=0x7ffff440b120, gen_code_size_ptr=gen_code_size_ptr@entry=0x7fffffffdd64) at /translate-all.c:159
#5  0x00000000600e5152 in tb_gen_code (cpu=cpu@entry=0x62532200, pc=pc@entry=4820992, cs_base=cs_base@entry=0, flags=<optimized out>, cflags=cflags@entry=0) at /translate-all.c:973
#6  0x0000000060040e7a in tb_find_slow (flags=<optimized out>, pc=4820992, env=0x6253a468, cs_base=<optimized out>) at /cpu-exec.c:162
#7  tb_find_fast (env=0x6253a468) at /cpu-exec.c:193
#8  cpu_arm_exec (env=env@entry=0x6253a468) at /cpu-exec.c:611
#9  0x000000006005ad2c in cpu_loop (env=env@entry=0x6253a468) at /linux-user/main.c:1015
#10 0x0000000060004dd1 in main (argc=1, argv=<optimized out>, envp=<optimized out>) at /linux-user/main.c:4392

We longjmp back to the CPU loop and deadlock here:
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1  0x000000006012991d in _L_lock_858 ()
#2  0x000000006012978a in __pthread_mutex_lock (mutex=0x604ffa98 <tcg_ctx+350904>) at pthread_mutex_lock.c:61
#3  0x0000000060040bfd in cpu_arm_exec (env=env@entry=0x6253a228) at /cpu-exec.c:610
#4  0x000000006005ad2c in cpu_loop (env=env@entry=0x6253a228) at /linux-user/main.c:1015
#5  0x0000000060004dd1 in main (argc=1, argv=<optimized out>, envp=<optimized out>) at /linux-user/main.c:4392
The solution is to allow tb_gen_code to back out if it knows it can't read the memory. A new exception type is added, EXCP_TB_EFAULT, which then needs to be handled just like an address fault inside cpu_loop.
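
The shape of the deadlock, and of the "back out first" fix, can be shown with a toy Python analogy. None of this is QEMU code: `tb_lock`, `GuestFault`, `translate`, `run_buggy` and `run_fixed` are invented names, an exception stands in for the signal plus longjmp, and a timeout is used so the example reports the problem instead of hanging.

```python
import threading

tb_lock = threading.Lock()  # stand-in for a non-recursive translation lock


class GuestFault(Exception):
    """Stand-in for the SIGSEGV taken while translating unmapped code."""


def translate(pc: int, mapped: bool) -> str:
    if not mapped:
        raise GuestFault(pc)
    return f"tb@{pc:#x}"


def run_buggy(pc: int) -> bool:
    """The fault escapes with the lock held; re-locking then fails."""
    tb_lock.acquire()
    try:
        translate(pc, mapped=False)
    except GuestFault:
        pass  # analogous to jumping back to the CPU loop without unlocking
    reacquired = tb_lock.acquire(timeout=0.1)  # a blocking acquire would hang
    tb_lock.release()  # clean up so the example can continue
    return reacquired


def run_fixed(pc: int) -> bool:
    """Back out (drop the lock) before the fault is handled."""
    tb_lock.acquire()
    try:
        translate(pc, mapped=False)
    except GuestFault:
        tb_lock.release()
    reacquired = tb_lock.acquire(timeout=0.1)
    tb_lock.release()
    return reacquired


print("buggy path re-acquires lock:", run_buggy(4820992))
print("fixed path re-acquires lock:", run_fixed(4820992))
```

The buggy path reports False because the lock taken before translation is still held when the loop tries to take it again; the fixed path reports True.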

This makes the above tests pass on AArch64 and x86 (32-bit only, since there is no signal handling support for the x86_64-linux-user target at the moment).

Update: Fixes look like they're going in. The TCG deadlock is getting fixed in a simpler way. It is a better and more self-contained fix.

Tuesday, March 18, 2014

OpenBIOS and partition-zero booting, redux

As of r1280 (commit 1ac3fb92c109f5545d373a0576b87750c53cce19), OpenBIOS correctly implements booting via a partition-zero boot block.

That's the power of Open Source.

Wednesday, March 5, 2014

Preboot scripts, redux.

The old implementation suffered from a couple of problems:
  • Chain booting and separate boot blocks made installation fussy.
  • I discovered that some OF versions don't quite claim more than a certain amount of memory for loading, making dual boot blocks waste a lot of precious RAM (I had two useless 4K stacks, for example). The symptom is a DSI during the actual load by the firmware - the same problem as seen trying to boot *BSD boot floppies on the 3400c. There might be a workaround involving manually claiming the memory, but... heh.
I then realized I could just as well append the preboot script to the end of the iquik.b block, and have the latter be smart enough to eval the script prior to looking for the boot-file.

Usage is pretty simple (the -p option):
$ cat > hello.of 
cr cr cr ." Hello OF!!!" cr cr cr
$ iquik -p hello.of

Of course, this works only if booting iQUIK via partition zero (i.e. :0 or %BOOT). Presumably if you're running the ELF directly (hi CHRP!), you'll use a normal CHRP boot script.

Making OpenBIOS support Apple partition-zero booting

To speed up my development cycle on iQUIK and make it less painful I really needed to get OpenBIOS to boot via bootcode correctly. The current sources (r1272) hard code the load address for the legacy QUIK bootloader, which makes it useless for me or for anyone else (like the NetBSD or OpenBSD bootloaders).

I didn't try very hard, but the end result works, and hopefully the OpenBIOS guys will just take my fix.

SVN workflow seems so...senescent compared to git. Sigh.
Index: forth/debugging/client.fs
--- forth/debugging/client.fs (revision 1272)
+++ forth/debugging/client.fs (working copy)
@@ -28,7 +28,13 @@
 0 state-valid !
 variable want-bootcode
+variable bootcode-base
+variable bootcode-size
+variable bootcode-entry
 0 want-bootcode !
+0 bootcode-base !
+0 bootcode-size !
+0 bootcode-entry !
 variable file-size
Index: libopenbios/bootcode_load.c
--- libopenbios/bootcode_load.c (revision 1272)
+++ libopenbios/bootcode_load.c (working copy)
@@ -12,13 +12,11 @@
 #define printf printk
 #define debug printk
 bootcode_load(ihandle_t dev)
     int retval = -1, count = 0, fd;
-    unsigned long bootcode, loadbase, offset;
+    unsigned long bootcode, loadbase, offset, loadsize, entry;
     /* Mark the saved-program-state as invalid */
     feval("0 state-valid !");
@@ -33,34 +31,59 @@
     loadbase = POP();
 #ifdef CONFIG_PPC
-    /* ...except that QUIK (the only known user of %BOOT to date) is built
-       with a hard-coded address of 0x3f4000. Let's just use this for the
-       moment on both New World and Old World Macs, allowing QUIK to also
-       work under a New World Mac. If we find another user of %BOOT we can
-       rethink this later. PReP machines should be left unaffected. */
+    /*
+     * Apple OF does not honor load-base and instead uses pmBootLoad
+     * value from the boot partition descriptor.
+     *
+     * Tested with:
+     *   a debian image with QUIK installed
+     *   a debian image with iQUIK installed (
+     *   an IQUIK boot floppy
+     *   a NetBSD boot floppy (boots stage 2)
+     */
     if (is_apple()) {
+      feval("bootcode-base @");
+      loadbase = POP();
+      feval("bootcode-size @");
+      loadsize = POP();
+      feval("bootcode-entry @");
+      entry = POP();
+      printk("bootcode base 0x%lx, size 0x%lx, entry 0x%lx\n",
+             loadbase, loadsize, entry);
+    } else {
+      entry = loadbase;
+      /* Load as much as we can. */
+      loadsize = 0;
     bootcode = loadbase;
     offset = 0;
-    while(1) {
+    if (loadsize) {
+      if (seek_io(fd, offset) != -1)
+        count = read_io(fd, (void *) bootcode, loadsize);
+    } else {
+      while(1) {
         if (seek_io(fd, offset) == -1)
-            break;
+          break;
         count = read_io(fd, (void *)bootcode, 512);
         offset += count;
         bootcode += count;
+      }
     /* If we didn't read anything then exit */
     if (!count) {
         goto out;
+    printk("entry = 0x%lx\n", entry);
     /* Initialise saved-program-state */
-    PUSH(loadbase);
+    PUSH(entry);
     feval("saved-program-state >sps.entry !");
     feval("saved-program-state >sps.file-size !");
Index: libopenbios/load.c
--- libopenbios/load.c (revision 1272)
+++ libopenbios/load.c (working copy)
@@ -1,6 +1,6 @@
  *   Creation Date: <2010/06/25 20:00:00 mcayland>
- *   Time-stamp: <2010/06/25 20:00:00 mcayland>
+ *   Time-stamp: <2014-03-05 03:18:49 andreiw>
  * <load.c>
Index: packages/mac-parts.c
--- packages/mac-parts.c (revision 1272)
+++ packages/mac-parts.c (working copy)
@@ -1,6 +1,6 @@
  *   Creation Date: <2003/12/04 17:07:05 samuel>
- *   Time-stamp: <2004/01/07 19:36:09 samuel>
+ *   Time-stamp: <2014-03-05 03:42:07 andreiw>
  * <mac-parts.c>
@@ -237,6 +237,19 @@
      size = (long long)__be32_to_cpu(par.pmPartBlkCnt) * bs; 
      if (want_bootcode) {
+  ucell loadaddr = 0;
+  ucell loadsize = 0;
+  ucell loadentry = 0;
+  loadaddr = __be32_to_cpu(par.pmBootLoad);
+  loadsize = __be32_to_cpu(par.pmBootSize);
+  loadentry = __be32_to_cpu(par.pmBootEntry);
+  PUSH(loadaddr);
+  feval("bootcode-base !");
+  PUSH(loadsize);
+  feval("bootcode-size !");
+  PUSH(loadentry);
+  feval("bootcode-entry !");
   offs += (long long)__be32_to_cpu(par.pmLgBootStart) * bs;
   size = (long long)__be32_to_cpu(par.pmBootSize);
@@ -249,6 +262,10 @@
      di->size_hi = size >> BITS;
      di->size_lo = size & (ucell) -1;
+     if (want_bootcode) {
+       goto out;
+     }
      /* We have a valid partition - so probe for a filesystem at the current offset */
      DPRINTF("mac-parts: about to probe for fs\n");
      DPUSH( offs );
@@ -277,7 +294,7 @@
       /* If we have been asked to open a particular file, interpose the filesystem package with 
       the passed filename as an argument */
-      if (!want_bootcode && strlen(argstr)) {
+      if ( strlen(argstr)) {
        push_str( argstr );
        PUSH_ph( ph );
@@ -286,6 +303,10 @@
       goto out;
      } else {
       DPRINTF("mac-parts: no filesystem found on partition %d; bypassing misc-files interpose\n", parnum);
+      /* Fail out instead of having macparts_load get called uselessly, allowing trying the next
+         boot device */
+      ret = 0;
This does make booting iQUIK on OpenBIOS now very easy.
qemu-system-ppc -hda ~/src/quik/distrib/floppy-cfg.img -prom-env "boot-file=hd:3"
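
To see what the patch is reading, here is a Python sketch that pulls the boot-code fields out of one Apple partition map entry. The field order and sizes are the classic partition map entry layout as I understand it, so treat the offsets as an assumption and check them against the OpenBIOS headers; `parse_boot_fields` is a made-up helper, and the disk offset calculation simply mirrors what the diff appears to do (partition start plus pmLgBootStart, in blocks).

```python
import struct

# One big-endian partition map entry, first 136 bytes (assumed layout):
# sig, pad, map count, phys start, block count, name[32], type[32],
# data start, data count, status, pmLgBootStart, pmBootSize, pmBootLoad,
# load2, pmBootEntry, entry2, checksum, processor[16].
APM_ENTRY = struct.Struct(">HHIII32s32sIIIIIIIIII16s")
APM_SIGNATURE = 0x504D  # "PM"


def parse_boot_fields(entry: bytes, block_size: int = 512) -> dict:
    """Return where the boot code lives on disk and where it should run."""
    fields = APM_ENTRY.unpack_from(entry)
    if fields[0] != APM_SIGNATURE:
        raise ValueError("not an Apple partition map entry")
    part_start, boot_start = fields[3], fields[10]
    return {
        "disk_offset": (part_start + boot_start) * block_size,
        "size": fields[11],        # pmBootSize, in bytes
        "load_addr": fields[12],   # pmBootLoad
        "entry": fields[14],       # pmBootEntry
    }
```

A loader would then read `size` bytes from `disk_offset` into memory at `load_addr` and jump to `entry`, which is roughly what the patched bootcode_load does on Apple machines.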

Getting PowerPC OpenBIOS to run on QEMU

  • You've apt-get installed qemu, but qemu-system-ppc boots to a blank (white or black) screen?
  • You've pulled the OpenBIOS SVN, built qemu-openbios.elf, but it boots to a blank screen?
On serial output, you might see "<< set_property: NULL phandle" messages, and the CPU is stuck in a perpetual ISI.

Have no fear. Apparently GCC versions > 4.6 miscompile OpenBIOS, so you need to disable optimization. This is presently set to "-Os" under "". Setting it to "-O0" should do.

I'll probably investigate this deeper after fixing partition-zero booting...