Jonesforth 47 quoth:
NOTES ----------------------------------------------------------------------
DOES> isn't possible to implement with this FORTH because we don't have a separate
data pointer.
Thankfully, that's not true. The following is a tad AArch32-specific, given that I am playing with pijFORTHos (
https://github.com/organix/pijFORTHos), but the principle remains the same. Let's first look at how DOES> gets used.
: MKCON WORD CREATE 0 , , DOES> @ ;
This creates a word MKCON, that when invoked like:
1337 MKCON PUSH1337
...creates a new word PUSH1337 that will behave, as if it were defined as:
: PUSH1337 1337 ;
Recall the 
CREATE...;CODE example. DOES> is very similar to ;CODE, except you want Forth words, not native machine words invoked. In ;CODE, the native machine words are embedded in the word using CREATE...;CODE, and in CREATE...DOES> it will be Forth words instead. So if we had no DOES> word, we could write something like:
: MKCON WORD CREATE 0 , , ;CODE $DODOES @ ;
...where $DODOES is the machine code generator word that creates the magic we've yet to figure out. $DODOES needs to behave like a mix between DOCOL and NEXT, that is adjusting FIP (the indirect threaded code instruction pointer, pointing to the next word to execute) to point past $DODOES to the @ word. The DFA of the CREATEd word (i.e. PUSH1337) is put on the stack, so @ can read the constant (1337) out. This means the simplest CREATE...DOES> example is:
: DUMMY WORD CREATE 0 , DOES> DROP ;
DUMMY ADUMMY
...because we need to clean up the DFA for ADUMMY that is pushed on its invocation. Anyway, we could thus define DOES> like:
: DOES> IMMEDIATE ' (;CODE) , [COMPILE] $DODOES ;
Let's look at two ways of implementing $DODOES. Way 1 - fully inline. The address of the Forth words (the new FIP) is calculated by skipping past the bits emitted by $DODOES.
        .macro COMPILE_INSN, insn:vararg
        .int LIT
        \insn
        .int COMMA
        .endm
        .macro NEXT_BODY, wrap_insn:vararg=
        \wrap_insn ldr r0, [FIP], #4
        \wrap_insn ldr r1, [r0]
        \wrap_insn bx  r1
        .endm
@
@ A CREATE...DOES> word is basically a special CREATE...;CODE
@ word, where the forth words follow $DODOES. $DODOES thus
@ adjusts FIP to point right past $DODOES and does NEXT.
@
@ You can think of this as a special DOCOL that sets FIP to a
@ certain offset into the CREATE...DOES> word's DFA. This
@ version is embedded into the DFA so finding FIP is
@ as easy as moving FIP past itself.
@
@ - Just like DOCOL, we enter with CFA in r0.
@ - Just like DOCOL, we need to push (old) FIP for EXIT to pop.
@ - The forth words expect DFA on stack.
@
        .macro DODOES_BODY, magic=, wrap_insn:vararg=
0:      \wrap_insn PUSHRSP FIP
1:      \wrap_insn ldr FIP, [r0]
        \wrap_insn add FIP, FIP, #((2f-0b)/((1b-0b)/(4)))
        \wrap_insn add r0, r0, #4
        \wrap_insn PUSHDSP r0
        NEXT_BODY \wrap_insn
2:
        .endm
@
@ $DODOES ( -- ) emits the machine words used by DOES>.
@
defword "$DODOES",F_IMM,ASMDODOES
        DODOES_BODY ASMDODOES, COMPILE_INSN
        .int EXIT
Way 2 - partly inline, where the emitted code does an absolute branch and link. This reduces the amount of memory used per definition at the cost of a branch. Ultimately this is the solution adopted. _DODOES calculates the new FIP adjusting the return address from the branch-and-link done by the inlined bits.
_DODOES:
        PUSHRSP FIP        @ just like DOCOL, for EXIT to work
        mov FIP, lr        @ FIP now points to label 3 below
        add FIP, FIP, #4   @ add 4 to skip past ldr storage
        add r0, r0, #4     @ r0 was CFA
        PUSHDSP r0         @ need to push DFA onto stack
        NEXT
        .macro DODOES_BODY, wrap_insn:vararg=
1:      \wrap_insn ldr r12, . + ((3f-1b)/((2f-1b)/(4)))
2:      \wrap_insn blx r12
3:      \wrap_insn .long _DODOES
        .endm
@
@ $DODOES ( -- ) emits the machine words used by DOES>.
@
defword "$DODOES",F_IMM,ASMDODOES
        DODOES_BODY COMPILE_INSN
        .int EXIT
In either case, just like DOCOL, we need to push the old FIP pointer before calculating the new one. The old FIP pointer corresponds to the address within the word that called the DOES>-created word. In both cases we need to push the DFA of the executing word onto the stack (this is in r0 on the AArch32 Jonesforth).
Finally, in both cases the CREATE...DOES> word is indistinguishable from a CREATE...;CODE word, and the created word is indistinguishable from a word created by a CREATE...;CODE word.
\ This is the CREATE...;CODE $DOCON END-CODE example before.
: MKCON WORD CREATE 0 , , ;CODE ( MKCON+7 ) E590C004 E52DC004 E49A0004 E5901000 E12FFF11 (END-CODE)
CODE CON ( CODEWORD MKCON+7 ) 5 (END-CODE)
\ Fully inlined CREATE...DOES>.
: MKCON_WAY1 WORD CREATE 0 , , ;CODE ( MKCON_WAY1+7) E52BA004 E590A000 E28AA020 E2800004 E52D0004 E49A0004 E5901000 E12FFF11 9714 938C (END-CODE)
CODE CON_BY_WAY1 ( CODEWORD MKCON_WAY1+7 ) 5 (END-CODE)
\ Partly-inlined CREATE...DOES>. 
: MKCON_WAY2 WORD CREATE 0 , , ;CODE ( MKCON_WAY2+7 ) E59FC000 E12FFF3C 9F64 9714 938C (END-CODE)
CODE CON_BY_WAY2 ( CODEWORD MKCON_WAY2+7 ) 5 (END-CODE)
This makes decompiling (i.e. SEE) a bit tricky, but not impossible. As you can see here, I haven't written a good disassembler yet, which would detect these sequences as $DOCON. IMHO this is still a lesser evil than introducing new fields or flags into the word definition header.
P.S. Defining constants is a classical example of using DOES>, but a bit silly when applied to Jonesforth, where it's an intrinsic. It's an intrinsic so that certain compile-time constants, known only at assembler time, can be exposed to the Forth prelude and beyond. The other classical example of DOES> is struct-like definitions.
P.P.S. You might be wondering how I'm SEEing into code words, as neither Jonesforth nor pijFORTHos support it. I guess I'll blog about that next real soon whenever... The ( CODEWORD XXX ) business here shows the "code word" pointed to by the CFA, which is necessarily not DOCOL (otherwise it would be a regular colon definition, not CODE). The ( CODEWORD word+offset ) notation tells you that the machine words pointed to by the CFA are part of a different word. Native (jonesforth.s-defined) intrinsics would decompile as something like:
CODE 2SWAP ( CODEWORD 85BC ) (END-CODE)