Tuesday, July 4, 2017

Porting UEFI to XXX, step 1

I've decided to do the actual blogging for this project *in* the repo itself. See https://github.com/andreiw/ppcnw-edk2/blob/master/README.md. After all, markdown is convenient enough and using Blogger on the G4 is p-a-i-n-f-u-l.

So it turns out that blogging about something after the fact is pretty tough. I really wanted to blog about my PoC port of UEFI to the OpenPower ecosystem, but it's incredibly difficult to go back and try to systematize something that happened a few years ago.

So let's try this again. This time, our victim will be a G4 12" PowerBook6,8 with a 7447A. That's a 32-bit PowerPC. Now, I'll go in small steps and document *everything*. For added fun, we'll begin porting on the target itself, at least until that gets too tedious.

First, I updated to the latest (and last) Debian 8 (Jessie).

Now let's clone the tree.

$ git clone https://github.com/tianocore/edk2

Set up the UEFI environment.

$ cd edk2
$ . edksetup.sh

Now we need to get the BaseTools building.

pbg4:~/src/edk2/BaseTools/ make
make -C Source/C
make[1]: Entering directory '/home/andreiw/src/edk2/BaseTools/Source/C'
Attempting to detect ARCH from 'uname -m': ppc
Could not detected ARCH from uname results
GNUmakefile:36: *** ARCH is not defined!.  Stop.
make[1]: Leaving directory '/home/andreiw/src/edk2/BaseTools/Source/C'
GNUmakefile:25: recipe for target 'Source/C' failed
make: *** [Source/C] Error 2

Ok. Let's fix that. We'll first need a Source/C/Include/PPC/ProcessorBind.h file.

I derived ProcessorBind.h from the one for another 32-bit CPU, like IA32 or ARM. It contains mostly type definitions and is boilerplate. If your architecture has multiple calling conventions and it's not obvious which one you should be using, you might wish to specify what the EFIAPI attribute will be. For example, on x86 the Windows-style cdecl is used, regardless of how you build the rest of Tiano. On most architectures an empty define is fine.

Now appropriately hook it into Source/C/Makefiles/header.makefile.

--- a/BaseTools/Source/C/Makefiles/header.makefile
+++ b/BaseTools/Source/C/Makefiles/header.makefile
@@ -43,6 +43,10 @@ ifeq ($(ARCH), AARCH64)
 ARCH_INCLUDE = -I $(MAKEROOT)/Include/AArch64/
 endif
+ifeq ($(ARCH), PPC)
+ARCH_INCLUDE = -I $(MAKEROOT)/Include/PPC/
+endif

Fix the ARCH detection in Source/C/GNUmakefile.

--- a/BaseTools/Source/C/GNUmakefile
+++ b/BaseTools/Source/C/GNUmakefile
@@ -31,6 +31,9 @@ ifndef ARCH
   ifneq (,$(findstring arm,$(uname_m)))
     ARCH=ARM
   endif
+  ifneq (,$(findstring ppc,$(uname_m)))
+    ARCH=PPC
+  endif
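
To make the detection logic concrete, here is a minimal Python sketch of what the two findstring rules in the hunk above do. It models only the ARM and PPC rules visible in the diff; `RULES` and `detect_arch` are names made up for this illustration, and the real GNUmakefile has more rules than are shown here.

```python
# Sketch only: mimics the substring matching of the GNUmakefile rules above.
# RULES and detect_arch are hypothetical names, not part of BaseTools.
RULES = [
    ("arm", "ARM"),  # existing rule, shown as context in the diff
    ("ppc", "PPC"),  # rule added by the patch
]


def detect_arch(uname_m):
    """Return the ARCH tag for a 'uname -m' string, or None if nothing matches."""
    arch = None
    for needle, tag in RULES:
        # $(findstring needle,haystack) succeeds on any substring match.
        if needle in uname_m:
            arch = tag  # as in the makefile, a later match overrides an earlier one
    return arch


print(detect_arch("ppc"))      # what the G4 reports
print(detect_arch("ppc64le"))  # also contains "ppc"
```

Because the match is a plain substring test, a 64-bit ppc64le host would also be tagged PPC by this rule, which is worth keeping in mind if the same tree is ever built there.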

Ok, ensure you have the libuuid headers (Debian uuid-dev) and g++. And...

You are done. This gives you the tools needed to build UEFI. Now we need to teach the build system about PowerPC...

Wednesday, June 8, 2016

Disassembling NT system files

Most NT files are stripped. This means that trying to disassemble them is a bit annoying because there are no symbols available. Checked builds of NT came with the symbol files (e.g. support/debug/ppc/symbols/exe/ntoskrnl.dbg for ntoskrnl.exe), but tools like Microsoft's dumpbin or OpenWatcom's wdis don't use them.

Now there's https://github.com/andreiw/dbgsplice to add the COFF symbol table back!


Sadly, the OpenWatcom analogue is quite buggy, so it's hard to recommend. It was a capable disassembler for setupldr and veneer.exe, but it gets horribly confused by complicated section layouts.

Of course the DBG files contain quite a bit more info (and we can do a lot more with the aux COFF syms too for annotating code than dumpbin suggests).

Sunday, May 8, 2016

Easy creation of proxy DLL pragmas

Converting dumpbin DLL exports information to MSVC linker pragmas

Yes, a bit of a weird request. But imagine you want to create a dummy DLL that forwards all the existing symbols of another DLL. Of course you're not going to do it by hand.

You have dumpbin output that looks like:

And you want:

I'm not an awk expert, but this works, except that I ran dumpbin on Windows and awk on OS X, hahah. But you get the gist...
dumpbin /exports C:\winnt\system32\ntdll.dll |
awk 'NR > 19 && $3 != "" { printf "#pragma comment(linker, \"/export:%s=ntdll.%s\")\n", $3, $3 }'
You might have to tweak the number of lines to skip, depending on your tools. I'm on MSVC 4.0 (hello '90s!).
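
If awk isn't handy, the same conversion can be sketched in Python. This is only a rough equivalent of the one-liner above, not a tested replacement: `exports_to_pragmas` and its parameters are names invented here, and the two sample lines are made up in the shape the awk filter implies (export name in the third field).

```python
def exports_to_pragmas(lines, target="ntdll", skip=19, name_field=3):
    """Yield one linker pragma per export line, mirroring the awk one-liner."""
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # Same filter as awk's 'NR > 19 && $3 != ""'.
        if number > skip and len(fields) >= name_field:
            name = fields[name_field - 1]
            yield '#pragma comment(linker, "/export:%s=%s.%s")' % (name, target, name)


# Made-up sample lines (ordinal, hint, name); real dumpbin output differs by version.
sample = [
    "    1    0   CsrAllocateCaptureBuffer",
    "    2    1   CsrAllocateMessagePointer",
]
for pragma in exports_to_pragmas(sample, skip=0):
    print(pragma)
```

For real use, feed it the lines of the dumpbin output and adjust `skip` and `name_field` to match what your dumpbin prints.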

64-bit ARM OS/Kernel/Systems Development Demo on an nVidia Shield TV (Tegra X1)


The Shield TV is based on the 64-bit nVidia X1 chip. Unlike the K1, this is actually a Cortex-A57 based design, instead of being based on the nVidia "Denver" design. That by itself is kind of interesting already. The Shield TV was available much much earlier than the X1-based nVidia development board (Jetson TX1, you can even buy it on Amazon), and costs about a third of the TX1. The Shield TV allows performing an unlock via "fastboot oem unlock", allowing custom OS images to be booted. Unlike the TX1, you don't get a UART (and I haven't found the UART pads yet, either).

What this is

https://github.com/andreiw/shieldTV_demo

This is a small demo, demonstrating how to build and boot arbitrary code on your Tegra Shield TV. Unlike the previous Tegra K1 demo, you get EL2 (hypervisor mode!).

What you need

  • A Shield TV, unlocked. Search YouTube for walkthroughs.
  • Shield OS version >= 1.3.
  • GNU Make.
  • An AArch64 GNU toolchain.
  • ADB/Fastboot tools.
  • Bootimg tools (https://github.com/pbatard/bootimg-tools), built and somewhere in your path.
  • An HDMI-capable screen. Note: HDMI, not a DVI-HDMI adapter. You want the firmware to configure the screen into 1920x1080 mode, otherwise you'll be in 640x480 and we don't want that...

How to build

$ CROSS_COMPILE=aarch64-linux-gnu- make
...should yield 'shieldTV_demo'.

How to boot

  1. Connect the Shield TV to your dev workstation with a USB cable.
  2. Reboot device via:
    $ adb reboot-bootloader
    ...you should now see the nVidia splash screen, followed by the boot menu.
  3. If OS is 1.3, you can simply:
    $ fastboot boot shieldTV_demo
  4. If OS is 1.4 or 2.1, you will need to:
    $ fastboot flash recovery shieldTV_demo
    ...and then boot the "recovery kernel" by following instructions on screen.
The code will now start. You will see text and some diagonal lines drawn on a black background. The text should say we're at EL2 and the lines should be green. The drawing will be slow - the MMU is off and the caches are thus disabled.

Let me know if it's interesting to see the MMU setup code.

Final thoughts

The Shield TV is a better deal than the TX1 for the average hobbyist, even with the missing UART. For the price it sells at, the TX1 should come with a decent amount of RAM, not 1GB more than the Shield TV. nVidia...are you listening? Uncripple your firmware so booting custom images is not a song-and-dance (you broke it in 1.4!) and at least TELL us where the UART pads are on the motherboard. If you're really cool, put together an "official" Ubuntu image that runs on the TX1 and the Shield (and fix SCR_EL3.HCE, too).

Saturday, May 7, 2016

Porting TianoCore to a new architecture

"UEFI" on...?

This article is the first in a series of posts touching on the general process of bringing up TianoCore EDK2 on an otherwise unsupported architecture. Maybe you want to support UEFI for your CPU architecture, or simply have a reasonable firmware environment.  In either case, because UEFI is not actually defined for your architecture, you're going to have to do a bit more work than your typical platform bring-up. By the time you're done, you could become a perfect addition to the UEFI forum and its specification committees...yeah!

This blog post and the ones following it continually refer to the ongoing PPC64LE Tiano port I am working on, available at https://github.com/andreiw/ppc64le-edk2/. Since everyone can read the fine code, this document mostly highlights the various steps performed throughout the commits. The git repo isn't perfect, though. Some changes ended up evolving over a few commits while I ironed things out and brought in more code. Hopefully I don't miss mentioning anything important.


TianoCore

Tiano is Intel's open-source (BSD-licensed) implementation of the UEFI specification. EDK2 is the second and current iteration of the implementation.

UEFI is officially supported on the IA32, X64, IPF, ARM and AARCH64 architectures, and EDK2 has CPU support code for the x86 and ARM variants. There's a MIPS EDK1 port floating about, and now two PPC64LE OpenPower EDK2 ports, one of which ended up fueling this article...

I'm not going to focus on the architecture of either UEFI or Tiano. Good books have been written on the subject. Here's some tl;dr material, though, for the hopelessly impatient:
At least read the User Documentation and glance at the boot flow diagram. You should now be able to fetch, build and boot TianoCore using the emulation package and have a rough understanding of what it takes to get a build going via Conf/target.txt and Conf/tools_def.txt.

Your Target

Your target is a 32-bit or 64-bit little-endian chip. I suppose big-endian is doable, but none of the Tiano code is endian safe and UEFI is strictly little-endian.

Development Environment

This assumes that you are doing development on Linux and are using an ELF toolchain and GCC compilers. You are going to need:

Basic Project Setup

Pick a short identifier for your architecture. Pick a name that's unused - for the Power8 port I picked PPC64. This tag will be used with the Tiano build scripts. The next step is to create a couple of Pkg directories that will contain our stuff. In my PPC64 port I initially went with a single-package solution, but I should be working on splitting it up into the more conventional layout, where platform-independent portions are in PPC64Pkg and platform-dependent parts (including build scripts for building for actual boards) are in PPC64PlatformPkg. You can refer to this commit as an example for the minimum required to build a dummy EFI application that does nothing.

If your architecture were supported, running a build command like this one for your package would succeed.
build -p YourArchPlatformPkg/YourArchPlatformPkg.dsc

Build Infrastructure

The next step is to enable the build infrastructure and scripts to understand your architecture identifier. Here's a list of files I had to modify - this is all stuff under BaseTools/Source/Python and the changes are incredibly mechanical.
  • Common/DataType.py
  • Common/EdkIIWorkspaceBuild.py
  • Common/FdfParserLite.py
  • Common/MigrationUtilities.py
  • CommonDataClass/CommonClass.py
  • CommonDataClass/ModuleClass.py
  • CommonDataClass/PlatformClass.py
  • GenFds/FdfParser.py
  • GenFds/FfsInfStatement.py
  • GenFds/GenFds.py
  • TargetTool/TargetTool.py
  • build/build.py
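
Since the changes are mechanical, one way to find the spots to touch is to look for every place an already-supported architecture tag is mentioned. The following is a small helper sketch, not something from the port itself: `find_arch_mentions` is a made-up name, and using AARCH64 as the tag to search for is my assumption about a convenient reference architecture.

```python
import os


def find_arch_mentions(root, tag="AARCH64"):
    """Return (path, line_number, text) for every .py line under root mentioning tag."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, errors="replace") as handle:
                for line_number, text in enumerate(handle, start=1):
                    if tag in text:
                        hits.append((path, line_number, text.rstrip()))
    return hits


# Example, run from the top of an edk2 checkout:
#   for path, line_number, text in find_arch_mentions("BaseTools/Source/Python"):
#       print("%s:%d: %s" % (path, line_number, text))
```

Each hit is a candidate place where your own identifier needs to be added next to the existing one.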

Build tools

UEFI executables are PE/COFF files. Since we are building on Linux, EDK2 uses a workflow where ELF artifacts produced by the cross-compiler are converted into PE32/PE32+ files with the GenFw tool. The PE/COFF artifacts are then wrapped into an FFS object and assembled into what is known as an FV ("firmware volume"). Multiple FVs are put into an FD.  The FV is really a flat file system that uses GUIDs for everything and can store other types of objects as well. You could also generate what is known as a TE (terse executable), but it's basically a cut down version of COFF FWIW.

Tiano deals with several kinds of executables. The components of the UEFI runtime (DXE core, UEFI drivers, and so on) are relocated as they are loaded, while the code that runs prior to the UEFI runtime itself is XIP (execute-in-place) and is thus pre-relocated to fixed addresses by the tool constructing the FV. The point is that such pre-UEFI code (the SEC and PEI phases for Tiano) runs in an environment where DRAM is not yet available.

Thus we need to enable GenFw to create PE/COFF executables from ELF for our architecture. This ties into the compiler options we're going to use, which is not something we've addressed yet. It helps to understand that a PE/COFF file is basically a position-independent executable. Although there is a "preferred linking address" that all symbols are relocated against, the PE/COFF image contains a sufficient amount of relocation information, known as "base relocations", to allow loading at any address. So it would appear that the easiest approach is to generate a position-independent ET_DYN ELF executable (with the -pie flag to the linker) and then focus on converting the output to COFF. This is the approach I strongly suggest adopting. You will only have to deal with a single relocation type (R_PPC64_RELATIVE in my case) that will map naturally to either the 64-bit or the 32-bit COFF base relocation type, depending on the bit width of your architecture.
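
To illustrate why a single relative relocation type maps so naturally, here is a rough Python sketch that packs a list of fixup addresses (the RVAs of the pointers that RELATIVE relocations refer to) into PE/COFF base relocation blocks. It is not the GenFw code: `build_base_relocs` is a made-up name, and the sketch assumes the relocation offsets have already been translated into image-relative addresses. The type values are the standard PE/COFF ones.

```python
import struct
from itertools import groupby

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, skipped by loaders
IMAGE_REL_BASED_HIGHLOW = 3   # 32-bit pointer fixup
IMAGE_REL_BASED_DIR64 = 10    # 64-bit pointer fixup


def build_base_relocs(rvas, is_64bit=True):
    """Pack fixup RVAs into .reloc-style blocks, one block per 4K page."""
    reloc_type = IMAGE_REL_BASED_DIR64 if is_64bit else IMAGE_REL_BASED_HIGHLOW
    out = b""
    for page, group in groupby(sorted(set(rvas)), key=lambda rva: rva & ~0xFFF):
        # Each entry: type in the top 4 bits, offset within the page in the low 12.
        entries = [(reloc_type << 12) | (rva & 0xFFF) for rva in group]
        if len(entries) % 2:
            # Keep every block a multiple of 4 bytes long.
            entries.append(IMAGE_REL_BASED_ABSOLUTE << 12)
        block_size = 8 + 2 * len(entries)  # 8-byte header plus 16-bit entries
        out += struct.pack("<II", page, block_size)
        out += struct.pack("<%dH" % len(entries), *entries)
    return out


blob = build_base_relocs([0x2010, 0x1000, 0x1008])
print(len(blob), blob.hex())
```

The only architecture-specific decision left is which of the two fixup types to emit, which is exactly the 64-bit versus 32-bit choice mentioned above.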

Other approaches are possible, such as embedding all relocations with the --emit-relocs linker flag and dealing with the entire soup of crazy relocs later, but the success of this approach is highly dependent on the architecture and ABI. It may be impossible to convert to PE/COFF due to a mismatch between ELF and COFF base relocs and ABI issues. When I started working on the PPC64 port, I first followed the AArch64 approach that did just this, and ended up being forced to use the older (and not really meant for LE) ELFv1 ABI. Don't do it.

Note that depending on the ABI, you may have to do a bit of tool work. I am guessing this was the reason why the AArch64 port never adopted PIE ELF binaries. Fortunately, you should be able to follow along with my changes to Elf64Convert.c. You may also have to make changes to the base linker script used with the GNU toolchains.

Don't forget that you will need to manually rebuild the BaseTools if you make any changes!
make -C BaseTools

At this point we can go back and figure out the compile options. These live in the BaseTools/Conf/tools_def.template file, which edksetup.sh copies to Conf/ on freshly checked-out trees. The compiler options heavily depend on your architecture, of course, but generally speaking:
  • build on top of definitions made for new architectures like AArch64, because there are simply fewer of them in this file to wrap your mind around
  • consider the PPC64 definitions in my tree
  • -pie, unless position-independent executables don't work for you for some reason
  • large model
  • soft float (you can always move to hard float later if that rocks your boat, but it's just more CPU state to wrap your head around)
  • PECOFF_HEADER_SIZE=0x228 for 64-bit chips, 0x220 for 32-bit ones
This is the point where trying to build again should start giving you compile errors, because we still haven't modified any of the Tiano include and library files to be aware of the new architecture.

To be continued.

Thursday, October 1, 2015

Toying around with LE PowerPC64 via the PowerNV QEMU

I've validated that my ppc64le_hello example runs on top of BenH's PowerNV QEMU tree. Runs really snappy!

The only thing that doesn't work is mixed page-size segment support (MPSS, like 16MB in a 4K segment). QEMU does not support MPSS at the moment. Also, QEMU does not implement any of the IBM simulator's crazy Mambo calls.

Monday, July 13, 2015

Toying around with LE PowerPC64 via the Power8 simulator

ppc64le_hello is a simple example of what it takes to write stand-alone (that is, system or OS) code that runs in Little-Endian and Hypervisor modes on the latest OpenPOWER/Power8 chips. Of course, I don't have a spare $3k to get one of these nice Tyan reference systems, but IBM does have a free, albeit glacially slow and non-OSS, POWER8 Functional Simulator.

What you get is a simple payload you can boot via skiboot, or another OPAL-compatible firmware. Features, in no particular order:
  • 64-bit real-mode HV LE operation.
  • logging via sim interface (mambo_write).
  • logging via OPAL firmware (opal_write).
  • calling C code, stack/BSS/linkage setup/TOC.
  • calling BE code from LE.
  • FDT parsing, dumping FDT.
  • Taking and returning from exceptions, handling unrecoverable/nested exceptions.
  • Timebase (i.e. the "timestamp counter"), decrementer and hypervisor decrementer manipulation with some basic timer support (done for periodic callbacks into OPAL).
  • Running at HV alias addresses (loaded at 0x00000000200XXXXX, linked at 0x80000000200XXXXX). The idea being that the code will access physical RAM and its own data structures solely using the HV addresses.
  • SLB setup: demonstrates 1T segments with 4K base page and 16M base page size. One segment (slot = 0) is used to back the HV alias addresses with 16M pages. Another segment maps EA to VA 1:1 using 4K pages.
  • Very basic HTAB setup. Mapping and unmapping for pages in the 4K and 16M segments, supporting MPSS (16M pages in the 4K segment). No secondary PTEG. No eviction support. Not SMP safe. Any access within the HV alias addresses gets mapped in. Any faults to other unmapped locations are crashes, as addresses below 0x8000000000000000 should only be explicit maps.
  • Taking exception vectors with MMU on at the alternate vector location (AIL) 0xc000000000004000.
  • Running unprivileged code.
See README for more information, including how to build and run. At some point it ran on a real Power8 machine - and may run still ;-).
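
As a tiny illustration of the HV alias addressing mentioned in the feature list, the relationship between the load and link addresses is just the top address bit. The sketch below only shows that arithmetic; the helper names and the sample load address are made up, chosen to fit the 0x00000000200XXXXX pattern above.

```python
HV_ALIAS_BIT = 1 << 63         # top bit, as in the 0x80000000200XXXXX link address
MASK_64 = 0xFFFFFFFFFFFFFFFF   # keep results within 64 bits


def to_hv_alias(physical):
    """Map a physical (load) address to its HV alias (link) address."""
    return (physical | HV_ALIAS_BIT) & MASK_64


def to_physical(alias):
    """Map an HV alias address back to the physical address."""
    return alias & ~HV_ALIAS_BIT & MASK_64


load_address = 0x20010000  # hypothetical address in the 0x200XXXXX range
print(hex(to_hv_alias(load_address)))
print(hex(to_physical(to_hv_alias(load_address))))
```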