Wednesday, June 25, 2014


Apparently, back around 2006 there was an effort at Sun Labs to get OpenSolaris to work on CHRP(like) PowerPC machines. And according to the documentation, the kernel could even boot to a shell on a G4 Apple.

That effort was called Polaris. It was difficult to find the CDDL-licensed sources, but I've made them available for everyone else to play with at

I haven't tried it out or done anything with the sources yet. The Solaris kernel is a pretty amazing piece of software, and a very portable and well-designed one to boot. I am glad Sun open-sourced it before folding, as it's code like this that should be influencing OS R&D for generations to come. It would be interesting to see the Polaris code being used as a base for an AArch64 investigation...


Tuesday, June 24, 2014

What's special about...

addi r0,r1,0x138
ori r0,r0,0x60

...and I/O port 0x92 :-)?

Sunday, June 22, 2014

iQUIK update

I now have a 1.5GHz PowerBook 12" in my possession to test iQUIK with. This is of course a NewWorld, and not a primary target for the iQUIK boot loader...

A couple of observations:
  • OpenFirmware 3.0 doesn't support partition zero booting (i.e. hd:0 or CHRP-spec hd:%BOOT). This means that iQUIK cannot be booted the same way as it boots on OldWorlds, but neither is it required. iQUIK can be booted on NewWorlds the same way as Yaboot, i.e. by placing 'iquik.elf' on an HFS+ partition and blessing it.
  • NewWorld OF requires appending ":0" for full-disk access to disk devices.
I've also fixed a bug in the partition code that truncated offsets to 32 bits, and improved device path handling and parsing.
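For the curious, 32-bit truncation bugs of this kind usually look something like the following. This is only a minimal sketch in C, assuming a sector-based partition table; `part_offset_truncated` and `part_offset` are hypothetical names made up for illustration, not the actual iQUIK code.

```c
#include <assert.h>
#include <stdint.h>

/* The whole point is that byte offsets need more than 32 bits. */
static_assert(sizeof(uint64_t) > sizeof(uint32_t), "need a wider offset type");

/*
 * Hypothetical buggy variant: the multiplication is done in 32 bits,
 * so the byte offset wraps modulo 2^32 for partitions past 4 GiB.
 */
uint64_t part_offset_truncated(uint32_t start_sector, uint32_t sector_size)
{
    uint32_t off = start_sector * sector_size; /* wraps silently */
    return off;
}

/*
 * Hypothetical fixed variant: widen one operand first, so the
 * multiplication itself is carried out in 64 bits.
 */
uint64_t part_offset(uint32_t start_sector, uint32_t sector_size)
{
    return (uint64_t)start_sector * sector_size;
}
```

With 512-byte sectors, a partition starting at sector 0x800000 sits exactly at the 4 GiB mark: the widened version returns 0x100000000, while the truncated one returns 0 and would happily read from the start of the disk.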

In short, though, it works. And it works quite well. So iQUIK now works on OldWorld and NewWorld machines. Yaboot - only on NewWorlds. Of course, Yaboot also supports CHRP machines, network booting and reads all filesystems supported by the underlying OpenFirmware implementation. So there's plenty of work to reach feature parity in that regard.


Tuesday, June 3, 2014

Detecting 'make' environment variables change

While playing with 'iquik' and trying to add a mode to build a smaller, reduced-logging version, I ran into an interesting question - how do I force a clean rebuild of everything when the build configuration changes?
# Example of a Makefile that detects "environment change".
# I.e.:
# andreiw-lnx:~/src/ make clean
# Cleaning
# andreiw-lnx:~/src/ make 
# Resuming build with env ""
# Building with ""
# andreiw-lnx:~/src/ make CONFIG_EXAMPLE=1
# Cleaning due to env change (was "" now "-DCONFIG_EXAMPLE")
# Cleaning
# Building with "-DCONFIG_EXAMPLE"
# andreiw-lnx:~/src/ make CONFIG_EXAMPLE=1
# Resuming build with env "-DCONFIG_EXAMPLE"
# Building with "-DCONFIG_EXAMPLE"
# andreiw-lnx:~/src

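# Stores the flags used by the previous build (file name is an assumption).
ENV_FILE = .build_env

-include $(ENV_FILE)

# Environment definition.
ifeq ($(CONFIG_EXAMPLE), 1)
BUILD_FLAGS = -DCONFIG_EXAMPLE
endif
BUILD_ENV = "OLD_BUILD_FLAGS=$(BUILD_FLAGS)"

# Detect environment change.
# ("log_env" is an assumed name for the no-change logging target.)
ifneq ($(BUILD_FLAGS),$(OLD_BUILD_FLAGS))
PRETARGET = clean_env
else
PRETARGET = log_env
endif

.PHONY: all target log_env log_clean_env clean_env clean

all: $(PRETARGET) target

target:
	@echo Building with \"$(BUILD_FLAGS)\"

log_env:
	@echo Resuming build with env \"$(BUILD_FLAGS)\"

log_clean_env:
	@echo Cleaning due to env change \(was \"$(OLD_BUILD_FLAGS)\" now \"$(BUILD_FLAGS)\"\)

# Clean, then record the new flags for the next run.
clean_env: log_clean_env clean
	@rm -f $(ENV_FILE)
	@echo $(BUILD_ENV) > $(ENV_FILE)

clean:
	@echo Cleaning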

Saturday, May 31, 2014

Musings on device workarounds and attribution

I was trading war stories with some colleagues today, and remembered the time I was chasing crazy UART bugs.

So I just had to go look at my battlefields of past and reminisce...

Ever look at a random driver and wonder how convoluted weird code gets written? Then you look at the git history and see - nothing useful. No history. It was apparently all written at once, by some crazy smart engineer based on thorough and clean specs, right ;-)?

Like the serial-tegra driver, for example. Ever wonder why UART waits for a certain bit of time after switching baud rate?

I used to work on the Moto Xoom tablet - the first official Android tablet, based around the Tegra 2 SoC. Once upon a time I was investigating a bug around suspend-resume. We were occasionally seeing a kernel crash when waking the tablet up with a paired Bluetooth keyboard. The actual crash was the result of a BlueZ bug that didn't defensively treat BT HCI connect events for devices that weren't disconnected (have a gander at - yes, a rogue Bluetooth adapter /can/ crash the system, wonderful, right?)

But why weren't the BT disconnect messages coming through?

The tablet was asleep at the time of the disconnect, and the disconnect woke it up. The Bluetooth host was connected to the CPU via a UART, and the UART needed to be resumed before the BT host could send data. UART resume, among other things, needs to set the baud rate. What was happening is that the hardware flow control allowed RX before the baud rate change had fully propagated through the UART block. The result was that the received data was corrupted. Oops.

Knowing what was happening didn't mean I had a solution, of course. The docs were useless, and it took another fun half a week to figure out the solution :-). Too bad I can't remember what this fix was for... Probably more BT issues :).

So what point did I want to make? The Tegra HSUART driver "got rewritten" when Tegra 2/3 support was upstreamed. But it's the same code, basically, even down to the code flow and comments. You put in time, sleepless nights and life energy and you can't get basic attribution from some unknown dude at NV.

Behind every line of code is some story. Some feeling of exhilaration, success and victory. I almost made a t-shirt with the fix :-). So always attribute contributions out of solidarity with your fellow hackers. Heh.

BlueZ is a train wreck, though... There. I said it.

Friday, May 30, 2014


The first step to getting MkLinux to run is to get the build tools to run.

Build tools?

The OSF Open Development Environment tools. Which have been very hard to find ( But now you can find them and even build them -

If I ever find time I'll clean up the code so it doesn't build with a million warnings.


Saturday, April 12, 2014

Inline assembler stupidity

I keep getting caught by this, because this is a perfect example of the compiler doing something contrary to what you're writing.
  asm volatile (
                "ldr %0, [%1]\n\t"
                "add %0, %0, #1\n\t"
                "str %0, [%1]\n\t"
                : "=r" (tmp)
                : "r" (p)
                );

Guess what this gets compiled to?
  30: f9400000  ldr x0, [x0]
  34: 91000400  add x0, x0, #0x1
  38: f9000000  str x0, [x0]

...equivalent to, of course,
  asm volatile (
                "ldr %0, [%0]\n\t"
                "add %0, %0, #1\n\t"
                "str %0, [%0]\n\t"
                : "+r" (p)
                );
This sort of aggressive and non-obvious optimization is crazy, because if I really wanted the generated code, I'd have written the inline asm the second way, with a read-write modifier. Maybe for architectures with very few, specialized registers this is a reasonable approach, but for RISCish instruction sets with large register files this is nuts. There should be a warning option for this nonsense.

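The "correct way" is to use an earlyclobber modifier, which tells the compiler that the output is written before the inputs are consumed, so the two may not share a register.
  asm volatile (
                "ldr %0, [%1]\n\t"
                "add %0, %0, #1\n\t"
                "str %0, [%1]\n\t"
                : "=&r" (tmp)
                : "r" (p)
                );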
IMO anything that needs a separate paragraph in third-party documents as "a caveat" needs to be fixed.
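For completeness, here is how the earlyclobber version might look wrapped into a whole function. Treat it as a sketch rather than gospel: `increment` is a hypothetical helper name, the "memory" clobber is my addition (the asm reads and writes `*p` behind the compiler's back), and the plain C fallback is only there so the snippet builds on non-AArch64 hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: increments *p and returns the new value. */
uint64_t increment(uint64_t *p)
{
    uint64_t tmp;

    assert(p != NULL);
#if defined(__aarch64__)
    __asm__ volatile (
        "ldr %0, [%1]\n\t"
        "add %0, %0, #1\n\t"
        "str %0, [%1]\n\t"
        : "=&r" (tmp)  /* earlyclobber: must not share a register with p */
        : "r" (p)
        : "memory"     /* the asm loads from and stores to *p */
    );
#else
    /* Plain C fallback so the sketch also builds on other targets. */
    tmp = *p + 1;
    *p = tmp;
#endif
    return tmp;
}
```

Without the `&`, the compiler is free to put `tmp` and `p` in the same register, and the `str` then stores through the freshly loaded value instead of the original pointer.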

Speaking of which... Given that C really is a high-level assembly, why not finally standardize on inline asm?