Sunday, July 5, 2015

Implementing ;CODE in AArch32 Jonesforth for real

The Jonesforth ;CODE definition is unfortunately little more than a curiosity. After all, if you wanted to write a native machine word, you'd probably follow along and implement it inside jonesforth.s proper using the defcode macro. The real power of ;CODE would be in coupling with the CREATE word, letting you have words that define other words.

I.e. we want to be able to do something like:
    defword "$DOCON",F_IMM,ASMDOCON
        .int LIT            @ r0 points to DFA
        ldr r12, [r0, #4]   @ read cell from DFA
        .int COMMA
        .int LIT
        PUSHDSP r12         @ push to stack
        .int COMMA
        .int EXIT
    
    : MKCON
       WORD CREATE
       0 ,        ( push dummy codeword, rewritten by (;CODE) )
       ,          ( actual constant )
    ;CODE
       $DOCON
    END-CODE
    
    5 MKCON CON5  ( create word CON5 that will push 5 on stack )
    CON5 . CR     ( prints 5 )
So ;CODE is the variant to be used with CREATE, while the plain ol' make-me-a-native-word variant is called CODE. And both get to be matched with END-CODE, not semicolon. At least according to F83 or something. We're not trying to stick to any Forth standard, but the definitions have to be useful...right? So the ;CODE business now looks a bit different:
    \ This used to look like : FOO ;CODE
    CODE FOO END-CODE

    @ push r0 to stack
    defword "$<R0",F_IMM,ASMFROMR0
        .int LIT
        PUSHDSP r0
        .int COMMA
        .int EXIT
    
    @ push r7 to stack
    defword "$<R7",F_IMM,ASMFROMR7
        .int LIT
        PUSHDSP r7
        .int COMMA
        .int EXIT
    
    @ pop stack to r0
    defword "$>R0",F_IMM,ASMTOR0
        .int LIT
        POPDSP r0
        .int COMMA
        .int EXIT
    @ pop stack to r7
    defword "$>R7",F_IMM,ASMTOR7
        .int LIT
        POPDSP r7
        .int COMMA
        .int EXIT
    
    CODE SWAP $>R0 $>R7 $<R0 $<R7 END-CODE
    
    HEX 1337 F00F SWAP . . ( prints 1337 F00F )
So now for the actual definitions. It /looks/ pretty tame...but it took me a week to wrap my mind around it.
    : (;CODE) R> LATEST @ >CFA ! ;
    : ;CODE IMMEDIATE ' (;CODE) , ;
    : (CODE) HERE @ LATEST @ >CFA ! ;
    : CODE : (CODE) ;
    : (END-CODE-INT) LATEST @ HIDDEN [COMPILE] [ ;
    : (END-CODE) IMMEDIATE (END-CODE-INT) ;
    : END-CODE IMMEDIATE [COMPILE] $NEXT (END-CODE-INT) ;
    HIDE (END-CODE-INT)
    HIDE (CODE)
Most interesting here is the behavior of ;CODE, so let's walk through the first example I gave. ;CODE is an IMMEDIATE word that compiles (;CODE) into MKCON; the generators that follow it, like $DOCON or $NEXT, then place their machine code right after it. When MKCON is executed, (;CODE) rewrites the codeword of CON5 (the cell at its CFA) to point at that machine code inside MKCON, instead of at DOCOL. The address of the machine code is of course sitting on the return stack, since it's the first cell following (;CODE). Aaaaand because we pop that return address, the EXIT at the end of (;CODE) returns not to MKCON but to MKCON's caller, so the inner interpreter never tries to treat the machine code placed by $DOCON as a list of words and crash.
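If the return stack trick is hard to picture, here is a rough sketch of it as a toy indirect-threaded interpreter in Python. This is not Jonesforth: the VM class, the cell layout, and names like SET_CFA (standing in for LATEST @ >CFA !) and define_constant are made up for illustration, word headers and names are left out entirely, and "machine code" is just a Python function stored in a cell. It only tries to show that popping the return address both finds the code and skips over it.

```python
# Toy model only: cells live in a Python list, and "machine code" is a
# Python function stored in a cell. All names here are illustrative.

def docol(vm, w):             # enter a colon word: save ip, start at its body
    vm.rs.append(vm.ip)
    vm.ip = w + 1

def exit_(vm, w):             # EXIT: resume whoever is on the return stack
    vm.ip = vm.rs.pop()

def lit(vm, w):               # LIT: push the cell that follows in the thread
    vm.ds.append(vm.mem[vm.ip])
    vm.ip += 1

def comma(vm, w):             # , : append top of stack to the dictionary
    vm.comma(vm.ds.pop())

def create(vm, w):            # CREATE, minus the header: next cell is the new CFA
    vm.latest = len(vm.mem)

def r_from(vm, w):            # R> : move return stack top to the data stack
    vm.ds.append(vm.rs.pop())

def set_cfa(vm, w):           # stands in for LATEST @ >CFA !
    vm.mem[vm.latest] = vm.ds.pop()

def docon(vm, w):             # stands in for the code emitted by $DOCON
    vm.ds.append(vm.mem[w + 1])

class VM:
    def __init__(self):
        self.mem, self.ds, self.rs = [], [], []
        self.ip = None        # None means "back at top level"
        self.latest = None    # CFA of the most recently created word
        self.DOCOL = self.comma(docol)

    def comma(self, cell):
        self.mem.append(cell)
        return len(self.mem) - 1

    def native(self, fn):     # codeword pointing at code placed right after it
        cfa = self.comma(None)
        self.mem[cfa] = self.comma(fn)
        return cfa

    def colon(self, *body):   # codeword = DOCOL, then the thread
        cfa = self.comma(self.DOCOL)
        for cell in body:
            self.comma(cell)
        return cfa

    def call(self, w):        # indirect threading: codeword -> code
        self.mem[self.mem[w]](self, w)

    def run(self, cfa):
        self.ip = None
        self.call(cfa)
        while self.ip is not None:
            w = self.mem[self.ip]
            self.ip += 1
            self.call(w)

vm = VM()
EXIT, LIT, COMMA = vm.native(exit_), vm.native(lit), vm.native(comma)
CREATE, R_FROM, SET_CFA = vm.native(create), vm.native(r_from), vm.native(set_cfa)

# : (;CODE) R> LATEST @ >CFA ! ;
SEMICODE = vm.colon(R_FROM, SET_CFA, EXIT)

# : MKCON CREATE 0 , , ;CODE $DOCON END-CODE   (no EXIT: it is never reached)
MKCON = vm.colon(CREATE, LIT, 0, COMMA, COMMA, SEMICODE, docon)

def define_constant(value):   # plays the role of "5 MKCON CON5"
    vm.ds.append(value)
    vm.run(MKCON)
    return vm.latest

con5 = define_constant(5)
con7 = define_constant(7)
vm.run(con5)
vm.run(con7)
print(vm.ds)                  # [5, 7]
```

Both constants end up with a codeword pointing at the same cell inside MKCON, which is the whole point: the code is stored once, in the defining word, and every child shares it.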

Fun. Hope that made sense. I had to meditate for a while over Brad Rodriguez' Moving Forth 3 (http://www.bradrodriguez.com/papers/moving3.htm) article before it made any sense to me. But like all ingenious beautiful things, it ends up being dead simple.
