Use an assembler for FlashForth
published: 29 April 2021 / updated 29 April 2021
An assembler written in FORTH language
FlashForth has an assembler written in FORTH language, available here:
FlashForth assembler for Atmega chips
To use this assembler, run FlashForth on your ARDUINO board. Copy the source code of the assembler and paste it into the terminal. Once it has compiled, type words. The words of the assembler should now appear in the dictionary listing.
Before using this assembler, let's see its syntax to fully exploit it.
FlashForth assembler syntax
Consider the assembly code of a FORTH word as described in the source code of FlashForth in the ff-atmega.asm file:
.db NFA|INLINE4|1, "+"
PLUS:
ld t0, Y+
ld t1, Y+
add tosl, t0
adc tosh, t1
ret
Some explanations:
- The first line, .db NFA|INLINE4|1,"+", assembles the header of the word + in the FlashForth dictionary;
- The second line, PLUS:, defines a label. It records the memory location so that the code can later be reached by a branch or a subroutine call;
- The other lines are the AVR assembly language code of the word +. It is these five lines that we will analyze and translate into a compiled FORTH word using our assembler written in FORTH.
In a classic assembler, each line begins with a mnemonic. This mnemonic is very short and corresponds to an instruction. Example: ld is a short form of load:
- ld load Indirect from Data Space to Register using Index
- add Add without Carry, adc Add with Carry
- ret Return from Subroutine
In the assembly source line ld t0, Y+, the mnemonic ld is followed by the operands t0 and Y+. t0 is an alias for register R16, and t1 is an alias for register R17.
Here is how we can define t0 and t1:
r16 constant t0
r17 constant t1
In the FlashForth assembler, we first specify the operands, then the operator. In conventional assembler:
ld t0, Y+
Will be rewritten in FlashForth assembler:
t0 Y+ ld,
Here is the assembly code of + rewritten for the FlashForth assembler:
t0 Y+ ld,
t1 Y+ ld,
tosl t0 add,
tosh t1 adc,
ret,
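To give an idea of why the operands come first, here is a minimal sketch of how a word could turn operands already on the stack into an AVR opcode. It is not the source of the FlashForth assembler: the names ld-y+-op and add-op are hypothetical, registers are passed as plain numbers 0..31, and the opcodes are only printed instead of being written to flash. The bit patterns are those of the AVR instruction set.

```forth
\ Sketch only: NOT the source of the FlashForth assembler.
\ Assumption: a register is passed as its plain number, 0..31.

: ld-y+-op ( rd -- opcode )    \ LD Rd, Y+  =  1001 000d dddd 1001
  4 lshift $9009 or ;

: add-op ( rd rr -- opcode )   \ ADD Rd, Rr =  0000 11rd dddd rrrr
  dup $10 and 5 lshift         \ bit 4 of Rr moves to bit 9
  swap $0f and or              \ low four bits of Rr stay in bits 3..0
  swap 4 lshift or             \ Rd occupies bits 8..4
  $0c00 or ;

decimal
16 ld-y+-op     hex u. decimal   \ r16      -> 9109
17 ld-y+-op     hex u. decimal   \ r17      -> 9119
24 16 add-op    hex u. decimal   \ r24, r16 -> f80 (digit case depends on the system)
```

Because the operands are on the stack before the mnemonic runs, the mnemonic is an ordinary FORTH word that consumes them; no dedicated parser is needed.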
Creating a FlashForth assembler code word
FlashForth does not have a defining word specifically dedicated to creating words whose body is written in FORTH assembler. This is due to the particular structure of word definitions, which are already compiled to machine code. Indeed, the CFA of a word created by : starts immediately with machine code!
To understand the structure of FORTH word definitions in the FORTH dictionary, refer to this article:
FORTH Dictionary and Word Definitions Structure
List of mnemonics used by the FORTH assembler of FlashForth:
Assembly words for FlashForth
To use the FORTH assembler for FlashForth, it is necessary and sufficient to switch to interpret mode with the word [ inside a definition started by :, then to return to compile mode with ] before the closing ;. Example:
R16 constant t0
R17 constant t1
R24 constant tosl
R25 constant tosh
: PLUS ( n1 n2 -- n1+n2)
[
t0 Y+ ld,
t1 Y+ ld,
tosl t0 add,
tosh t1 adc,
ret,
] ;
You can now use the word PLUS in exactly the same way as the word +.
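A quick check, assuming the definition of PLUS above has been compiled on the board, is to compare its result with that of + on the same operands:

```forth
decimal
3 4 PLUS .                        \ 7
1234 4321 PLUS  1234 4321 + = .   \ -1 (true) if both words agree
```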
Go further with the FORTH assembler for FlashForth
Compiling a FORTH assembler on the board sacrifices part of the flash memory, to the detriment of the final program, because once the code is assembled the assembler is no longer needed. So is it possible to compile machine code directly in its binary form without going through the assembler?
It's possible!
Compiling the binary code
To compile the binary code, it must first be extracted from a word that has been assembled with our FORTH assembler for FlashForth. In our example of the word PLUS, we know that we have assembled 5 mnemonics. The AVR code thus generated occupies 10 bytes, starting at the CFA of our word PLUS.
Let us make a hexadecimal dump of this memory area:
hex ' PLUS 20 dump
8d66 :09 91 19 91 80 0f 91 1f 08 95 08 95 ff ff ff ff ................
8d76 :ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................
ok<$,ram
Each pair of bytes corresponds to one 16-bit instruction, stored with the least significant byte first and the most significant byte second:
09 91 => 9109
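The same conversion can be done for every pair of the dump with a one-line word. This is a small sketch in standard FORTH; the name bytes>word is hypothetical and is not part of FlashForth.

```forth
\ Combine two dumped bytes (low byte first) into one 16-bit instruction.
: bytes>word ( lo hi -- w )  8 lshift or ;

hex
09 91 bytes>word u.   \ 9109   t0 Y+ ld,
19 91 bytes>word u.   \ 9119   t1 Y+ ld,
80 0f bytes>word u.   \ f80    tosl t0 add,  (digit case depends on the system)
91 1f bytes>word u.   \ 1f91   tosh t1 adc,
08 95 bytes>word u.   \ 9508   ret,
decimal
```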
We can therefore rewrite our word PLUS like this:
: PLUS ( n1 n2 -- n1+n2)
[
$9109 i, \ t0 Y+ ld,
$9119 i, \ t1 Y+ ld,
$0f80 i, \ tosl t0 add,
$1f91 i, \ tosh t1 adc,
] ;
The last instruction, corresponding to ret, has been removed because it duplicated the subroutine return (08 95) compiled by the word ;.
Using gForth as a FlashForth meta-assembler
The other solution for generating assembled code without loading the FORTH assembler onto the ARDUINO board is to use gForth with the cross assembler Xassembler.txt, available here:
FlashForth assembler for Atmega chips.
Once the cross assembler has been compiled with gForth, we can generate binary code for our ARDUINO board. In the session below, the hexadecimal code that interests us is the value displayed after each mnemonic:
R16 constant t0  ok
R17 constant t1  ok
R24 constant tosl  ok
R25 constant tosh  ok
: PLUS ( n1 n2 -- n1+n2)  compiled
[  ok 5
t0 Y+ ld, 9109  ok 5
t1 Y+ ld, 9119  ok 5
tosl t0 add, 0F80  ok 5
tosh t1 adc, 1F91  ok 5
ret, 9508  ok 5
] ;  ok
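The opcodes displayed by gForth still have to be turned into lines that FlashForth can compile with i,. The sketch below, meant to run under gForth, prints such a line for each opcode. The words hexdigit and .i, are hypothetical helpers, not part of Xassembler.txt, and the digits are printed in lowercase to match the $0f80 i, style used in this article.

```forth
decimal

\ Convert the low four bits of n to a lowercase hexadecimal character.
: hexdigit ( n -- c )  $f and s" 0123456789abcdef" drop + c@ ;

\ Print one opcode as a "$xxxx i," line ready to paste into FlashForth.
: .i, ( opcode -- )
  ." $"
  dup 12 rshift hexdigit emit
  dup  8 rshift hexdigit emit
  dup  4 rshift hexdigit emit
                hexdigit emit
  ."  i," cr ;

$9109 .i,   \ prints $9109 i,
$9119 .i,   \ prints $9119 i,
$0f80 .i,   \ prints $0f80 i,
$1f91 .i,   \ prints $1f91 i,
```

The final 9508 (ret,) is left out on purpose, since the word ; already compiles the subroutine return.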
Using assembly directives
The version of the FORTH assembler for FlashForth from the file Xassembler.txt can be used on its own or together with a meta-compiler.
This version incorporates assembler directives that reproduce those specific to the AVR assembler. Example of source code with assembler directives:
.def tosl = r24
.def tosh = r25
; Macros
.macro poptos
ld tosl, Y+
ld tosh, Y+
.endmacro
Adaptation in FORTH assembler:
r24 .def tosl
r25 .def tosh
\ Macros
.macro poptos
tosl Y+ ld,
tosh Y+ ld,
.endmacro
List of available directives:
.byte .db .def .dw .equ .set HIGH LOW
Note: the aim is not to bend FORTH into imitating a pure assembler syntax. FORTH has other assets for efficient programming, through the integration of specific tools, meta-compilation in particular.
The use of assembly language should be reserved for interfacing problems with a specific processor and for defining primitives that cannot be defined in FORTH.