R. T. RUSSELL
BBC BASIC (86) Manual
Assembler
BBCBASIC(86) has an 8086/8088 assembler. This assembler is similar to the 6502
assembler on the BBC Micro and the Z80 assembler in BBCBASIC(Z80)
and it is entered in the same way. That is, '[' enters assembler
mode and ']' exits assembler mode. Unlike the 6502 or Z80 assemblers,
the 8086 assembler attempts to detect multiply defined labels. If
a label is found to have an existing non-zero value during the first
pass of an assembly (OPT 0 or 1), a 'Multiple label' error is reported
(error code 3).
There are differences in the use of the assembler and the interfacing
of assembly language routines with BIGBASIC. In particular, the CALL
statement and the USR function perform differently. The differences
are described fully in the BIGBASIC Annex to this manual.
An assembly language statement consists of three elements: an optional
label, an instruction and an operand. A comment may follow the operand
field. The instruction following a label must be separated from it
by at least one space. Similarly, the operand must be separated
from the instruction by a space.
Statements are terminated by a colon (:) or end of line (<RET>). When assembly
language statements are separated by colons, it is necessary to leave
a space between the colon and a preceding segment register name. See
the Segment Override sub-section for details.
Any BBCBASIC(86) numeric variable may be used as a label. These (external)
labels are defined by an assignment (count=23 for instance). Internal
labels are defined by preceding them with a full stop. When the assembler
encounters such a label, a BASIC variable is created containing the
current value of the Program Counter (P%). (The Program Counter is
described later.)
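As a minimal sketch of the two kinds of label, the fragment below defines
'count' by assignment and 'start' with a full stop. The names and the
21-byte reservation are chosen only for illustration. Because there are
no forward references, a single pass with OPT 2 is assumed to be enough.

```bbcbasic
DIM code 20
count=23 : REM external label, defined by a BASIC assignment
P%=code
[OPT 2
.start MOV CX,count ; internal label 'start' takes the value of P%
RET
]
PRINT ~code, ~start : REM both should show the same address
```

The two values printed should match, since 'start' was defined at the
first byte of the reserved area.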
In the example shown later under the heading The Assembly Process,
two internal labels are defined and used. Labels have the same rules
as standard BBCBASIC(86) variable names; they should start with a
letter and not start with a keyword.
You can insert comments into assembly language programs by preceding
them with a semi-colon (;) or a back-slash (\). In assembly language,
a comment ends at the end of the statement. Thus, the following example
will work (but it's a bit untidy).
[;start assembly language program
etc
MOV AX,CX ;In-line comment:POP BX ;start add
JNZ loop ;Go back if not finished:RET ;Return
etc
;end assembly language program:]
The 8086/8088 assembler generally conforms to the Intel assembly language
syntax. However, there are a number of minor differences from the
official Intel syntax. These differences are described below.
Jumps, calls and returns are assumed to be within the current code
segment. Short (8 bit displacement) jumps, far (inter segment) calls,
far jumps and far returns must be explicitly specified by using the
following mnemonics:
Short jump | JMPS |
Far call | CALLF |
Far jump | JMPF |
Far return | RETF |
Memory operands must be placed in square brackets in order to distinguish
them from immediate operands. For example,
MOV AX,[store]
will load the AX register with the contents of memory
location 'store'. However,
MOV AX,store
will load the AX register with the 16 bit address of
'store'.
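The distinction can be seen side by side in a short sketch such as the
one below. The label names and the value &1234 are illustrative only, and
two passes are used because 'store' is referenced before it is defined.

```bbcbasic
DIM code 20
FOR opt=0 TO 3 STEP 3
P%=code
[OPT opt
MOV AX,[store] ; memory operand, loads the word held at 'store'
MOV BX,store ; immediate operand, loads the address of 'store'
RET
.store DW &1234
]
NEXT
```

If this routine were executed, AX would be expected to receive &1234 and
BX the address at which the word was assembled. The listing from the
second pass shows the different op-codes generated for the two forms.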
The string operations must have the data size (byte or word) explicitly
specified in the instruction mnemonic as listed below.
Compare memory - byte | CMPSB |
Compare memory - word | CMPSW |
Compare AL (byte) | SCASB |
Compare AX (word) | SCASW |
Load from memory - byte | LODSB |
Load from memory - word | LODSW |
Store to memory - byte | STOSB |
Store to memory - word | STOSW |
Move byte | MOVSB |
Move word | MOVSW |
When segment overrides are necessary, they must always be entered
explicitly. The assembler will not insert them automatically. For
example,
MOV AX,CS:[data]
will load the AX register with the contents of the address
'data' in the code segment. Segment overrides will rarely be required
as BBCBASIC sets all the segment registers to the address of its data
segment on entering the assembly language code with CALL or USR.
When assembly language statements are separated by colons, it is necessary
to leave a space between the colon and a preceding segment register
name. If the space is missing, the assembler will misinterpret the
colon as a segment override. For example,
PUSH CS:MOV AX,0
will give rise to an error, but
PUSH CS :MOV AX,0
will be accepted.
Some assembly language instructions are ambiguous as to whether a
byte or a word value is to be acted upon. When this is so, an explicit
'byte ptr' or 'word ptr' operator must be used. For example:
INC byte ptr [BX]
MOV word ptr [count],0
If this operator is omitted, BBCBASIC(86) will issue a 'Size needed'
error message (error code 2). See the Annex entitled
Error Messages and Codes for further details.
The composite based-indexed operands are only accepted in the preferred
forms with the base register specified first. For example,
[bp+di], [bp+si], [bx+di], [bx+si]
are accepted, but
[di+bp], [si+bp], [di+bx], [si+bx]
are not.
Indexed memory operands with constant offsets are accepted in the
following formats:
[index]+offset
[index+offset]
offset[index]
where 'index' is an index or base register (or an accepted combination)
such as 'bx' or 'bp+si', and 'offset' is a numeric expression.
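A quick way to compare the three formats is to assemble the same access
in each of them with a listing enabled. This is only a sketch: BX and the
offset 2 are arbitrary choices, and the routine is assembled but not run.

```bbcbasic
DIM code 20
P%=code
[OPT 3
MOV AX,[BX]+2
MOV AX,[BX+2]
MOV AX,2[BX]
RET
]
```

Since all three lines describe the same operand, the listing would be
expected to show identical bytes for each MOV instruction.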
You can store constants within your assembly language program using
the define byte (DB), define word (DW) and define double-word (DD)
pseudo-operation commands. These will create 1 byte, 2 byte and 4 byte
items respectively. Define byte (DB) may alternatively be followed
by a string operand, in which case the bytes comprising the string
will be placed in memory at the current assembly location. As discussed
later, this location is governed by P% or O%, depending on the OPTion used.
Byte Constant
DB can be used to set one byte of memory to a particular
value. For example,
.data DB 15
DB 9
will set two consecutive bytes of memory to 15 and 9 (decimal). The
address of the first byte will be stored in the variable 'data'.
String Constant
DB can be used to load a string of ASCII characters into memory. For
example,
JMPS continue; jump round the data
.string DB "This is a test message"
DB &D
.continue; and continue the process
will load the string 'This is a test message' followed
by a carriage-return into memory. The address of the start of the
message is loaded into the variable 'string'. This is equivalent
to the following program segment:
JMPS continue; jump round the data
.string; leave assembly and load the string
]
$P%="This is a test message" :REM starting at P%
P%=P%+LEN($P%)+1 :REM adjust P% to next free byte
[
OPT opt; reset OPT
.continue; and continue the program
DW can be used to set two bytes of memory to a particular value. The
first byte is set to the least significant byte of the number and
the second to the most significant byte. For example,
.data DW &90F
will have the same result as the Byte Constant example.
DD can be used to set four bytes of memory to a particular value. The
first byte is set to the least significant byte of the number and
the fourth to the most significant byte. For example,
.data DD &90F0D10
will have the same result as,
.data DB 16
DB 13
DB 15
DB 9
or, written in hexadecimal,
.data DB &10
DB &D
DB &F
DB &9
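The byte order can be checked from BASIC with the '?' indirection
operator. The following is a minimal sketch; the 11-byte reservation and
the loop variable I% are arbitrary.

```bbcbasic
DIM code 10
P%=code
[OPT 2
.data DD &90F0D10
]
REM Read back the four bytes, lowest address first
FOR I%=0 TO 3
PRINT data?I%
NEXT
```

The values printed should be 16, 13, 15 and 9, the least significant
byte coming first.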
Machine code instructions are assembled as if they were going to be
placed in memory at the addresses specified by the program counter,
P%. Their actual location in memory may be determined by O% depending
on the OPTion specified (see below). You must make sure that P% (or
O%) is pointing to a free area of memory before your program begins
assembly. In addition, you need to reserve the area of memory that
your machine code program will use so that it is not overwritten at
run time. You can reserve memory by using a special version of the
DIM statement or by changing HIMEM or LOMEM.
Using the special version of the DIM statement to reserve an area
of memory is the simplest way for short programs which do not have
to be located at a particular memory address. (See the keyword DIM
for more details.) For example,
DIM code 20: REM Note the absence of brackets
will reserve 21 bytes of memory (byte 0 to byte 20) and load the variable
'code' with the start address of the reserved area. You can then
set P% (or O%) to the start of that area. The example below reserves
an area of memory 100 bytes long and sets P% to the first byte of
the reserved area.
DIM sort% 99
P%=sort%
If you are going to use a machine code program in a number
of your BBCBASIC(86) programs, the simplest way is to assemble it
once, save it using *SAVE and load it from each of your programs using
*LOAD. In order for this to work, the machine code program must be
loaded into the same address each time. The most convenient way to
arrange this is to move HIMEM down by the length of the program and
load the machine code program into this protected area. Theoretically,
you could raise LOMEM to provide a similar protected area below your
BBCBASIC(86) program. However, altering LOMEM destroys ALL your dynamic
variables and is more risky.
You must reserve an area of memory which is sufficiently large for
your machine code program before you assemble it, but you may have
no real idea how long the program will be until after it is assembled. How
then can you know how much memory to reserve? Unfortunately, the answer
is that you can't. However, you can add to your program to find the
length used and then change the memory reserved by the DIM statement
to the correct amount.
In the example below, a large amount of memory is initially reserved. To
begin with, a single pass is made through the assembly code and the
length needed for the code is calculated (lines 100 to 120). After
a CLEAR, the correct amount of memory is reserved (line 140) and a
further two passes of the assembly code are performed as usual. Your
program should not, of course, subsequently try to use variables set
before the CLEAR statement. If you use a similar structure to the
example and place the program lines which initiate the assembly function
at the start of your program, you can place your assembly code anywhere
you like and still avoid this problem.
100 DIM free -1, code HIMEM-free-1000
110 PROC_ass(0)
120 L%=P%-code
130 CLEAR
140 DIM code L%
150 PROC_ass(0)
160 PROC_ass(2)
- - -
Put the rest of your program here.
- - -
10000 DEF PROC_ass(opt)
10010 P%=code
10020 [OPT opt
- - -
Assembler code program.
- - -
11000 ]
11010 ENDPROC
The program counters, P% and O%, are initialised to zero. Using the
assembler without first setting P% (and O%) is liable to corrupt BBCBASIC(86)'s
workspace (see the Annex entitled Format of Program and Variables in Memory).
The only assembly directive is OPT. As with the 6502 assembler,
'OPT' controls the way the assembler works, whether a listing is displayed
and whether errors are reported. OPT should be followed by a number
in the range 0 to 7. The way the assembler functions is controlled
by the three bits of this number in the following manner.
Bit 0 - LSB
Bit 0 controls the listing. If it is set, a listing is displayed.
Bit 1
Bit 1 controls the error reporting. If it is set, errors are reported.
Bit 2
Bit 2 controls where the assembled code is placed. If bit 2 is set,
code is placed in memory starting at the address specified by O%. However,
the program counter (P%) is still used by the assembler for calculating
the instruction addresses.
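The meaning of each OPT value follows directly from these three bits. As
an illustration only, the short BASIC program below tests each bit with
AND and prints the resulting behaviour for OPT 0 to OPT 7; the wording of
the messages is not taken from the assembler itself.

```bbcbasic
FOR opt%=0 TO 7
PRINT "OPT ";opt%;":";
REM Bit 0 (value 1) selects the listing
IF opt% AND 1 THEN PRINT " listing,"; ELSE PRINT " no listing,";
REM Bit 1 (value 2) selects error reporting
IF opt% AND 2 THEN PRINT " errors reported,"; ELSE PRINT " errors not reported,";
REM Bit 2 (value 4) selects where the code is stored
IF opt% AND 4 THEN PRINT " code placed at O%" ELSE PRINT " code placed at P%"
NEXT
```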
In general, machine code will only run properly if it is in memory
at the addresses for which it was assembled. Thus, at first glance,
the option of assembling it in a different area of memory is of little
use. However, using this facility, it is possible to build up a library
of machine code utilities for use by a number of programs. The machine
code can be assembled for a particular address by one program without
any constraints as to its actual location in memory and saved using
*SAVE. This code can then be
loaded into its working location from a number of different programs
using *LOAD.
The code is assembled using the program counter (P%) to calculate
the instruction addresses and the code is also placed in memory at
the address specified by the program counter.
OPT 0 | reports no errors and gives no listing. |
OPT 1 | reports no errors, but gives a listing. |
OPT 2 | reports errors, but gives no listing. |
OPT 3 | reports errors and gives a listing. |
The code is assembled using the program counter (P%) to calculate
the instruction addresses. However, the assembled code is placed
in memory at the address specified by O%.
OPT 4 | reports no errors and gives no listing. |
OPT 5 | reports no errors, but gives a listing. |
OPT 6 | reports errors, but gives no listing. |
OPT 7 | reports errors and gives a listing. |
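A sketch of assembling code for one address while storing it at another
is given below. The target address &8F00 and the names 'buffer', 'entry'
and 'finish' are assumptions made for the example. Nothing is written
to &8F00 itself, because bit 2 of OPT is set on both passes.

```bbcbasic
DIM buffer 50
FOR opt=4 TO 6 STEP 2
P%=&8F00 : REM address at which the code is intended to run
O%=buffer : REM address at which the bytes are stored for now
[OPT opt
.entry JMP finish
.finish RET
]
NEXT
PRINT ~entry, ~buffer
PRINT O%-buffer : REM number of bytes assembled
```

The label 'entry' should hold &8F00, while the assembled bytes lie in
the reserved area starting at 'buffer', ready to be saved with *SAVE.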
The assembler works line by line through the machine code. When it
finds a label declared it generates a BBCBASIC(86) variable with that
name and loads it with the current value of the program counter (P%). This
is fine all the while labels are declared before they are used. However,
labels are often used for forward jumps and no variable with that
name would exist when it was first encountered. When this happens, a
'No such variable' error occurs.
If error reporting has not been
disabled, this error is reported and BBCBASIC(86) returns to the direct
mode in the normal way. If error reporting has been disabled
(OPT 0, 1, 4 or 5), the current value of the program counter is used in
place of the address which would have been found in the variable,
and assembly continues. By the end of the assembly process the variable
will exist (assuming the code is correct), but this is of little use
since the assembler cannot 'back track' and correct the errors. However,
if a second pass is made through the assembly code, all the labels
will exist as variables and errors will not occur. The example below
shows the result of two passes through a (completely futile) demonstration
program. Thirteen bytes of memory (bytes 0 to 12) are reserved for the
program. (If the program were run, it would 'doom-loop' from line 50 to 70
and back again.) The first pass disables error reporting by using OPT 1.
10 DIM code 12
20 FOR opt=1 TO 3 STEP 2
30 P%=code
40 [OPT opt
50 .jim JMP fred
60 DW &2345
70 .fred JMP jim
80 ]
90 NEXT
This is the first pass through the assembly process
(note that the 'JMP fred' instruction jumps to itself):
RUN
0681 OPT opt
0681 E9 FD FF .jim JMP fred
0684 45 23 DW &2345
0686 E9 F8 FF .fred JMP jim
This is the second pass through the assembly process
(note that the 'JMP fred' instruction now jumps to the correct address):
0681 OPT opt
0681 E9 02 00 .jim JMP fred
0684 45 23 DW &2345
0686 E9 F8 FF .fred JMP jim
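The jump targets in the two listings can be confirmed by hand: the two
bytes after the E9 op-code are a 16-bit signed displacement, low byte
first, measured from the address of the following instruction. The
function below is a sketch of that arithmetic; FN_target and its
parameter names are invented for this illustration.

```bbcbasic
PRINT ~FN_target(&0681,3,&FFFD) : REM pass 1, JMP fred gives 681 (itself)
PRINT ~FN_target(&0681,3,&0002) : REM pass 2, JMP fred gives 686 (fred)
PRINT ~FN_target(&0686,3,&FFF8) : REM JMP jim gives 681 (jim)
END
:
REM addr% = address of the jump, size% = its length in bytes,
REM disp% = displacement as an unsigned 16-bit number
DEF FN_target(addr%,size%,disp%)
IF disp%>=&8000 THEN disp%=disp%-&10000
=(addr%+size%+disp%) AND &FFFF
```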
Generally, if labels have been used, you must make two passes through
the assembly language code to resolve forward references. This can
be done using a FOR...NEXT loop. Normally, the first pass should be with
OPT 0 (or OPT 4) and the second pass with OPT 2 (or OPT 6). If
you want a listing, use OPT 3 (or OPT 7) for the second pass. During
the first pass, a table of variables giving the address of the labels
is built. Labels which have not yet been included in the table (forward
references) will generate the address of the current op-code. The
correct address will be generated during the second pass.
As mentioned earlier, you can use machine code routines in a number
of BBCBASIC(86) programs by using *SAVE and *LOAD. The safest way
to do this is to write a program which consists of only the machine
code routines and enough BBCBASIC(86) to assemble them. They should
be assembled 'out of the way' at the top of memory (each routine starting
at a known address) and then *SAVEd. (Don't forget to move HIMEM
down first.) The BBCBASIC(86) programs that use these routines should
move HIMEM down to the same value before they *LOAD the assembly code
routines into the address at which they were originally assembled. *SAVE
and *LOAD are explained below.
*SAVE
Save an area of memory to disk. You MUST specify the start address
(aaaa) and either the length of the area of memory (llll) or its end
address+1 (bbbb).
*SAVE ufsp aaaa +llll
*SAVE ufsp aaaa bbbb
OSCLI "SAVE "+<st>+" "+STR$~(<n>)+"+"+STR$~(<n>)
*SAVE "WOMBAT" 8F00 +80
*SAVE "WOMBAT" 8F00 8F80
OSCLI "SAVE "+ufn$+" "+STR$~(add)+"+"+STR$~(len)
*LOAD
Load the specified file into memory at hexadecimal address 'aaaa'. The
load address MUST always be specified. OSCLI may also be used to
load a file. However, you must take care to provide the load address
as a hexadecimal number in string format.
*LOAD ufsp aaaa
OSCLI "LOAD "+<str>+" "+STR$~<num>
*LOAD A:WOMBAT 8F00
OSCLI "LOAD "+f_name$+" "+STR$~(strt_address)
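Putting these pieces together, an assembling program might look like the
sketch below. The file name ROUTINE, the &100 byte allowance and the
single RET standing in for the real routines are all hypothetical. It is
also assumed that HIMEM has the same initial value in every program that
uses the file.

```bbcbasic
size%=&100 : REM hypothetical allowance for the routines
HIMEM=HIMEM-size%
code=HIMEM
FOR opt=0 TO 2 STEP 2
P%=code
[OPT opt
.entry RET ; placeholder for the real routines
]
NEXT
REM Save from the start address for the number of bytes assembled
OSCLI "SAVE ROUTINE "+STR$~(code)+"+"+STR$~(P%-code)
PRINT "Load address &";~code
```

A program using the routines would then lower HIMEM by the same amount
and issue OSCLI "LOAD ROUTINE "+STR$~(HIMEM) before calling them.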
Most machine code assemblers provide conditional assembly and macro
facilities. The assembler does not directly offer these facilities,
but it is possible to implement them by using other features of BBCBASIC(86).
You may wish to write a program which makes use of special facilities
and which will be run on different types of computer. The majority
of the assembly code will be the same, but some of it will be different. In
the example below, different output routines are assembled depending
on the value of 'flag'.
DIM code 200
FOR pass=0 TO 3 STEP 3
P%=code
[OPT pass
.start - - -
- - - code - - -
- - - :]
:
IF flag [OPT pass: - code for routine 1 -:]
IF NOT flag [OPT pass: - code for routine 2 - :]
:
[OPT pass
.more_code - - -
- - - code - - -
- - -:]
NEXT
Within any machine code program it is often necessary to repeat a
section of code a number of times and this can become quite tedious. You
can avoid this repetition by defining a macro which you use every
time you want to include the code. The example below uses a macro
to pass a character to the screen or the auxiliary output. Conditional
assembly is used within the macro to select either the screen or the
auxiliary output depending on the value of op_flag.
It is possible to suppress the listing of the code in a macro by forcing
bit 0 of OPT to zero for the duration of the macro code. This can
most easily be done by ANDing the value passed to OPT with 6. This
is illustrated in PROC_screen and PROC_aux in the example below.
DIM code 200
op_flag=TRUE
FOR pass=0 TO 3 STEP 3
P%=code
[OPT pass
.start - - -
- - - code - - -
- - -
:
OPT FN_select(op_flag); Include code depending on op_flag
:
- - -
- - - code - - -
- - -:]
NEXT
END
:
:
REM Include code depending on value of op_flag
:
DEF FN_select(op_flag)
IF op_flag PROC_screen ELSE PROC_aux
=pass
REM Return original value of OPT. This is a
REM bit artificial, but necessary to insert
REM some BBCBASIC(86) code in the assembly code.
:
DEF PROC_screen
[OPT pass AND 6
MOV DL,AL
MOV AH,02
INT &21:]
ENDPROC
:
DEF PROC_aux
[OPT pass AND 6
MOV DL,AL
MOV AH,04
INT &21:]
ENDPROC
The use of a function call to incorporate the code provides a neat
way of incorporating the macro within the program and allows parameters
to be passed to it. The function should return the original value
of OPT.
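As a sketch of a macro that takes a parameter, the function below
assembles the instructions to output one character, with the character
code supplied by the caller. FN_outchar and its parameter char% are
invented for this illustration, and DOS function 2 is assumed, as in
PROC_screen. The code is only assembled here, not called.

```bbcbasic
DIM code 100
FOR pass=0 TO 2 STEP 2
P%=code
[OPT pass
.message
OPT FN_outchar(ASC("O")) ; each use assembles three instructions
OPT FN_outchar(ASC("K"))
RET
]
NEXT
END
:
REM Macro: assemble code to output the character whose code is char%
DEF FN_outchar(char%)
[OPT pass AND 6
MOV DL,char%
MOV AH,2
INT &21
]
=pass
```

Each use of the macro places a fresh copy of the three instructions in
the code, with a different immediate value loaded into DL.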
© Doug Mounter and
Richard Russell