R. T. RUSSELL

BBC BASIC (86) Manual

Assembler

Introduction

BBCBASIC(86) has an 8086/8088 assembler. This assembler is similar to the 6502 assembler on the BBC Micro and the Z80 assembler in BBCBASIC(Z80) and it is entered in the same way. That is, '[' enters assembler mode and ']' exits assembler mode. Unlike the 6502 or Z80 assemblers, the 8086 assembler attempts to detect multiply defined labels. If a label is found to have an existing non-zero value during the first pass of an assembly (OPT 0 or 1), a 'Multiple label' error is reported (error code 3).

BIGBASIC

There are differences in the use of the assembler and the interfacing of assembly language routines with BIGBASIC. In particular, the CALL statement and the USR function perform differently. The differences are described fully in the BIGBASIC Annex to this manual.

Statements

An assembly language statement consists of three elements: an optional label, an instruction and an operand. A comment may follow the operand field. The instruction following a label must be separated from it by at least one space. Similarly, the operand must be separated from the instruction by a space.

Statements are terminated by a colon (:) or end of line (<RET>). When assembly language statements are separated by colons, it is necessary to leave a space between the colon and a preceding segment register name. See the Segment Override sub-section for details.

Labels

Any BBCBASIC(86) numeric variable may be used as a label. These (external) labels are defined by an assignment (count=23 for instance). Internal labels are defined by preceding them with a full stop. When the assembler encounters such a label, a BASIC variable is created containing the current value of the Program Counter (P%). (The Program Counter is described later.)

In the example shown later under the heading The Assembly Process, two internal labels are defined and used. Labels have the same rules as standard BBCBASIC(86) variable names; they should start with a letter and not start with a keyword.
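As a minimal sketch of the two kinds of label, the fragment below uses 'count' as an external label and 'again' as an internal label. The names 'code', 'count', 'again' and 'pass' are chosen for illustration only, and the routine is assembled but never called.

```bbcbasic
DIM code 20
count=23
FOR pass=0 TO 2 STEP 2
  P%=code
  [OPT pass
  MOV CX,count    ; count is an external label, an ordinary variable
  .again          ; internal label, again takes the current value of P%
  DEC CX
  JNZ again
  ]
NEXT
PRINT "again is at &";~again
```

Note that the comments avoid colons, since a colon would end the statement and therefore the comment.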

Comments

You can insert comments into assembly language programs by preceding them with a semi-colon (;) or a back-slash (\). In assembly language, a comment ends at the end of the statement. Thus, the following example will work (but it's a bit untidy).

[;start assembly language program
etc
MOV AX,CX ;In-line comment:POP BX ;start add
JNZ loop ;Go back if not finished:RET ;Return
etc
;end assembly language program:]

Language Syntax

Differences from The Intel Syntax

The 8086/8088 assembler generally conforms to the Intel assembly language syntax. However, there are a number of minor differences from the official Intel syntax. These differences are described below.

Jumps and Calls

Jumps, calls and returns are assumed to be within the current code segment. Short (8 bit displacement) jumps, far (inter segment) calls, far jumps and far returns must be explicitly specified by using the following mnemonics:

Short jump    JMPS
Far call      CALLF
Far jump      JMPF
Far return    RETF

Memory Operands

Memory operands must be placed in square brackets in order to distinguish them from immediate operands. For example,

MOV AX,[store]

will load the AX register with the contents of memory location 'store'. However,

MOV AX,store

will load the AX register with the 16 bit address of 'store'.
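The difference can be seen side by side in a short fragment. This is only a sketch: the labels 'begin' and 'store' and the value &1234 are illustrative, and the code is assembled but not called.

```bbcbasic
DIM code 20
FOR pass=0 TO 2 STEP 2
  P%=code
  [OPT pass
  JMPS begin      ; jump round the data
  .store DW &1234
  .begin
  MOV AX,[store]  ; AX = contents of store (&1234)
  MOV BX,store    ; BX = address of store
  MOV CX,[BX]     ; CX = contents of store again, this time via BX
  ]
NEXT
```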

String Operations

The string operations must have the data size (byte or word) explicitly specified in the instruction mnemonic as listed below.

Compare memory - byte      CMPSB
Compare memory - word      CMPSW
Compare AL (byte)          SCASB
Compare AX (word)          SCASW
Load from memory - byte    LODSB
Load from memory - word    LODSW
Store to memory - byte     STOSB
Store to memory - word     STOSW
Move byte                  MOVSB
Move word                  MOVSW

Segment Override

When segment overrides are necessary, they must always be entered explicitly. The assembler will not insert them automatically. For example,

MOV AX,CS:[data]

will load the AX register with the contents of the address 'data' in the code segment. Segment overrides will rarely be required as BBCBASIC sets all the segment registers to the address of its data segment on entering the assembly language code with CALL or USR.

When assembly language statements are separated by colons, it is necessary to leave a space between the colon and a preceding segment register name. If the space is missing, the assembler will misinterpret the colon as a segment override. For example,

PUSH CS:MOV AX,0

will give rise to an error, but

PUSH CS :MOV AX,0

will be accepted.

Byte/Word Ambiguities

Some assembly language instructions are ambiguous as to whether a byte or a word value is to be acted upon. When this is so, an explicit 'byte ptr' or 'word ptr' operator must be used. For example:

INC byte ptr [BX]
MOV word ptr [count],0

If this operator is omitted, BBCBASIC(86) will issue a 'Size needed' error message (error code 2). See the Annex entitled Error Messages and Codes for further details.

Based-Indexed Operands

The composite based-indexed operands are only accepted in the preferred forms with the base register specified first. For example,

[bp+di], [bp+si], [bx+di], [bx+si]

are accepted, but

[di+bp], [si+bp], [di+bx], [si+bx]

are not.

Indexed Memory Operands

Indexed memory operands with constant offsets are accepted in the following formats:

[index]+offset
[index+offset]
offset[index]

Here 'index' is an index or base register such as 'bx', 'bp+si', etc, and 'offset' is a numeric expression.
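For instance, the three lines in the sketch below are intended to describe the same operand, the word at offset 4 from BX. The choice of BX and of the offset 4 is purely illustrative. Assembling with OPT 3 produces a listing, so the bytes generated for each form can be compared.

```bbcbasic
DIM code 20
P%=code
[OPT 3
MOV AX,[BX]+4
MOV AX,[BX+4]
MOV AX,4[BX]
]
```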

Word and String Constants

You can store constants within your assembly language program using the define byte (DB), define word (DW) and define double-word (DD) pseudo-operation commands. These will create 1 byte, 2 byte and 4 byte items respectively. Define byte (DB) may alternatively be followed by a string operand, in which case the bytes comprising the string will be placed in memory at the current assembly location. As discussed later, this location is governed by P% or O%, depending on the OPTion used.

Define Byte - DB

Byte Constant

DB can be used to set one byte of memory to a particular value. For example,

.data DB 15
      DB 9

will set two consecutive bytes of memory to 15 and 9 (decimal). The address of the first byte will be stored in the variable 'data'.

String Constant

DB can be used to load a string of ASCII characters into memory. For example,

JMPS continue; jump round the data
.string DB "This is a test message"
DB &D
.continue; and continue the process

will load the string 'This is a test message' followed by a carriage-return into memory. The address of the start of the message is loaded into the variable 'string'. This is equivalent to the following program segment:

JMPS continue;	jump round the data
.string;	leave assembly and load the string
]
$P%="This is a test message" : REM starting at P%
P%=P%+LEN($P%)+1 : REM adjust P% to next free byte
[
OPT opt; reset OPT
.continue;	and continue the program

Define Word - DW

DW can be used to set two bytes of memory to a particular value. The first byte is set to the least significant byte of the number and the second to the most significant byte. For example,

.data DW &90F

will have the same result as the Byte Constant example.

Define Double Word - DD

DD can be used to set four bytes of memory to a particular value. The first byte is set to the least significant byte of the number and the fourth to the most significant byte. For example,

.data DD &90F0D10

will have the same result as,

.data DB 16       .data DB &10
      DB 13   or        DB &D
      DB 15             DB &F
      DB 9              DB &9
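The byte order can be checked from BASIC by reading the four bytes back with the ? indirection operator. The sketch below assumes nothing beyond the DD example above; 'code' and I% are illustrative names.

```bbcbasic
DIM code 10
P%=code
[OPT 2
.data DD &90F0D10
]
REM Print the four bytes in ascending address order, in hexadecimal
FOR I%=0 TO 3
  PRINT ~?(data+I%)
NEXT
```

The values printed should be 10, D, F and 9, that is, the least significant byte first.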

Reserving Memory

The Program Counter

Machine code instructions are assembled as if they were going to be placed in memory at the addresses specified by the program counter, P%. Their actual location in memory may be determined by O% depending on the OPTion specified (see below). You must make sure that P% (or O%) is pointing to a free area of memory before your program begins assembly. In addition, you need to reserve the area of memory that your machine code program will use so that it is not overwritten at run time. You can reserve memory by using a special version of the DIM statement or by changing HIMEM or LOMEM.

Using DIM to Reserve Memory

Using the special version of the DIM statement to reserve an area of memory is the simplest way for short programs which do not have to be located at a particular memory address. (See the keyword DIM for more details.) For example,

DIM code 20: REM Note the absence of brackets

will reserve 21 bytes of code (byte 0 to byte 20) and load the variable 'code' with the start address of the reserved area. You can then set P% (or O%) to the start of that area. The example below reserves an area of memory 100 bytes long and sets P% to the first byte of the reserved area.

DIM sort% 99
P%=sort%

Moving HIMEM to Reserve Memory

If you are going to use a machine code program in a number of your BBCBASIC(86) programs, the simplest way is to assemble it once, save it using *SAVE and load it from each of your programs using *LOAD. In order for this to work, the machine code program must be loaded at the same address each time. The most convenient way to arrange this is to move HIMEM down by the length of the program and load the machine code program into this protected area. Theoretically, you could raise LOMEM to provide a similar protected area below your BBCBASIC(86) program. However, altering LOMEM destroys ALL your dynamic variables and is more risky.

Length of Reserved Memory

You must reserve an area of memory which is sufficiently large for your machine code program before you assemble it, but you may have no real idea how long the program will be until after it is assembled. How then can you know how much memory to reserve? Unfortunately, the answer is that you can't. However, you can add code to your program to find the length used and then change the memory reserved by the DIM statement to the correct amount.

In the example below, a large amount of memory is initially reserved. To begin with, a single pass is made through the assembly code and the length needed for the code is calculated (lines 100 to 120). After a CLEAR, the correct amount of memory is reserved (line 140) and a further two passes of the assembly code are performed as usual. Your program should not, of course, subsequently try to use variables set before the CLEAR statement. If you use a similar structure to the example and place the program lines which initiate the assembly function at the start of your program, you can place your assembly code anywhere you like and still avoid this problem.

100 DIM free -1, code HIMEM-free-1000
110 PROC_ass(0)
120 L%=P%-code
130 CLEAR
140 DIM code L%
150 PROC_ass(0)
160 PROC_ass(2)

- - -
Put the rest of your program here.
- - -

10000 DEF PROC_ass(opt)
10010 P%=code
10020 [OPT opt
- - -
Assembler code program.
- - -

11000 ]
11010 ENDPROC

Initial Setting of the Program Counter

The program counters, P% and O%, are initialised to zero. Using the assembler without first setting P% (and O%) is liable to corrupt BBCBASIC(86)'s workspace (see the Annex entitled Format of Program and Variables in Memory).

The Assembly Process

OPT

The only assembly directive is OPT. As with the 6502 assembler, 'OPT' controls the way the assembler works, whether a listing is displayed and whether errors are reported. OPT should be followed by a number in the range 0 to 7. The way the assembler functions is controlled by the three bits of this number in the following manner.

Bit 0 - LSB

Bit 0 controls the listing. If it is set, a listing is displayed.

Bit 1

Bit 1 controls the error reporting. If it is set, errors are reported.

Bit 2

Bit 2 controls where the assembled code is placed. If bit 2 is set, code is placed in memory starting at the address specified by O%. However, the program counter (P%) is still used by the assembler for calculating the instruction addresses.
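Since the OPT value is simply the sum of three bits, it can be built up from separate flags. The following sketch uses hypothetical flag names ('listing', 'errors' and 'use_O'); it relies only on TRUE being -1 and FALSE being 0 in BBCBASIC(86).

```bbcbasic
listing=TRUE  : REM bit 0, display a listing
errors=TRUE   : REM bit 1, report errors
use_O=FALSE   : REM bit 2, place the code at O%
opt=(listing AND 1)+(errors AND 2)+(use_O AND 4)
PRINT opt
```

With the settings shown this prints 3, and the result could then be used in the assembler as 'OPT opt'.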

Assembly at a Different Address

In general, machine code will only run properly if it is in memory at the addresses for which it was assembled. Thus, at first glance, the option of assembling it in a different area of memory is of little use. However, using this facility, it is possible to build up a library of machine code utilities for use by a number of programs. The machine code can be assembled for a particular address by one program without any constraints as to its actual location in memory and saved using *SAVE. This code can then be loaded into its working location from a number of different programs using *LOAD.
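A minimal sketch of this arrangement is shown below. The run-time address &8F00 is borrowed from the *SAVE examples in this manual, and 'buffer', 'entry', 'value' and 'done' are illustrative names. The labels take their values from P%, while the bytes themselves are stored in the buffer pointed to by O%.

```bbcbasic
DIM buffer 50
FOR pass=4 TO 6 STEP 2
  P%=&8F00      : REM address the code will eventually run at
  O%=buffer     : REM address the bytes are actually stored at
  [OPT pass
  .entry
  MOV AX,[value]
  JMPS done
  .value DW &1234
  .done
  ]
NEXT
PRINT ~entry
PRINT O%-buffer : REM number of bytes placed in the buffer
```

Here 'entry' should hold &8F00 even though nothing has been written at that address. The contents of the buffer could then be saved with *SAVE and later loaded at &8F00.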

OPT Summary

Code Assembled Starting at P%

The code is assembled using the program counter (P%) to calculate the instruction addresses and the code is also placed in memory at the address specified by the program counter.

OPT 0 reports no errors and gives no listing.

OPT 1 reports no errors, but gives a listing.

OPT 2 reports errors, but gives no listing.

OPT 3 reports errors and gives a listing.

Code Assembled Starting at O%

The code is assembled using the program counter (P%) to calculate the instruction addresses. However, the assembled code is placed in memory at the address specified by O%.

OPT 4 reports no errors and gives no listing.

OPT 5 reports no errors, but gives a listing.

OPT 6 reports errors, but gives no listing.

OPT 7 reports errors and gives a listing.

How the Assembler Works

The assembler works line by line through the machine code. When it finds a label declared, it generates a BBCBASIC(86) variable with that name and loads it with the current value of the program counter (P%). This is fine as long as labels are declared before they are used. However, labels are often used for forward jumps, and no variable with that name will exist when the label is first encountered. When this happens, a 'No such variable' error occurs. If error reporting has not been disabled, this error is reported and BBCBASIC(86) returns to the direct mode in the normal way. If error reporting has been disabled (OPT 0, 1, 4 or 5), the current value of the program counter is used in place of the address which would have been found in the variable, and assembly continues.

By the end of the assembly process the variable will exist (assuming the code is correct), but this is of little use since the assembler cannot 'back track' and correct the errors. However, if a second pass is made through the assembly code, all the labels will exist as variables and errors will not occur.

The example below shows the result of two passes through a (completely futile) demonstration program. The DIM statement reserves thirteen bytes of memory (bytes 0 to 12) for the program. (If the program were run, it would 'doom-loop' from line 50 to 70 and back again.) The program disables error reporting on the first pass by using OPT 1.

10 DIM code 12
20 FOR opt=1 TO 3 STEP 2
30 P%=code
40 [OPT opt
50 .jim JMP fred
60 DW &2345
70 .fred JMP jim
80 ]
90 NEXT

This is the first pass through the assembly process (note that the 'JMP fred' instruction jumps to itself):

RUN
0681                    OPT opt
0681 E9 FD FF   .jim    JMP fred
0684 45 23              DW &2345
0686 E9 F8 FF   .fred   JMP jim

This is the second pass through the assembly process (note that the 'JMP fred' instruction now jumps to the correct address):

0681                    OPT opt
0681 E9 02 00   .jim    JMP fred
0684 45 23              DW &2345
0686 E9 F8 FF   .fred   JMP jim

Generally, if labels have been used, you must make two passes through the assembly language code to resolve forward references. This can be done using a FOR...NEXT loop. Normally, the first pass should be with OPT 0 (or OPT 4) and the second pass with OPT 2 (or OPT 6). If you want a listing, use OPT 3 (or OPT 7) for the second pass. During the first pass, a table of variables giving the address of the labels is built. Labels which have not yet been included in the table (forward references) will generate the address of the current op-code. The correct address will be generated during the second pass.
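The displacement bytes in the two listings above can be checked by hand. In the three-byte near jump (E9 followed by a 16 bit displacement), the displacement is measured from the address of the following instruction. The sketch below reproduces the arithmetic; FN_disp is a hypothetical helper written for this illustration and is not part of BBCBASIC(86).

```bbcbasic
REM FN_disp(addr,target) gives the 16 bit displacement of a
REM three-byte JMP located at addr and jumping to target
PRINT ~FN_disp(&681,&681) : REM pass 1, JMP fred (fred unknown)
PRINT ~FN_disp(&681,&686) : REM pass 2, JMP fred
PRINT ~FN_disp(&686,&681) : REM JMP jim on either pass
END
:
DEF FN_disp(addr,target)=(target-(addr+3)) AND &FFFF
```

This prints FFFD, 2 and FFF8, which appear in the listings (low byte first) as FD FF, 02 00 and F8 FF.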

Saving and Loading Machine Code Programs

As mentioned earlier, you can use machine code routines in a number of BBCBASIC(86) programs by using *SAVE and *LOAD. The safest way to do this is to write a program which consists of only the machine code routines and enough BBCBASIC(86) to assemble them. They should be assembled 'out of the way' at the top of memory (each routine starting at a known address) and then *SAVEd. (Don't forget to move HIMEM down first.) The BBCBASIC(86) programs that use these routines should move HIMEM down to the same value before they *LOAD the assembly code routines into the address at which they were originally assembled. *SAVE and *LOAD are explained below.

*SAVE

Save an area of memory to disk. You MUST specify the start address (aaaa) and either the length of the area of memory (llll) or its end address+1 (bbbb).

*SAVE ufsp aaaa +llll
*SAVE ufsp aaaa bbbb
OSCLI "SAVE "+<str>+" "+STR$~(<num>)+"+"+STR$~(<num>)
*SAVE "WOMBAT" 8F00 +80
*SAVE "WOMBAT" 8F00 8F80
OSCLI "SAVE "+ufn$+" "+STR$~(add)+"+"+STR$~(len)

*LOAD

Load the specified file into memory at hexadecimal address 'aaaa'. The load address MUST always be specified. OSCLI may also be used to load a file. However, you must take care to provide the load address as a hexadecimal number in string format.

*LOAD ufsp aaaa
OSCLI "LOAD "+<str>+" "+STR$~<num>

*LOAD A:WOMBAT 8F00
OSCLI "LOAD "+f_name$+" "+STR$~(strt_address)
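Putting the pieces together, a program which assembles a routine at the top of memory and saves it might look like the sketch below. The size &80, the file name WOMBAT and the single NOP standing in for the real routines are all illustrative. The space before the '+' mirrors the '*SAVE ufsp aaaa +llll' form.

```bbcbasic
size=&80
HIMEM=HIMEM-size : REM protect an area at the top of memory
start=HIMEM
FOR pass=0 TO 2 STEP 2
  P%=start
  [OPT pass
  NOP             ; stands in for the real routines
  ]
NEXT
OSCLI "SAVE WOMBAT "+STR$~(start)+" +"+STR$~(P%-start)
```

Each program which uses the routines would then lower HIMEM by the same amount before loading them. This assumes that HIMEM starts at the same value in both programs.

```bbcbasic
size=&80
HIMEM=HIMEM-size
OSCLI "LOAD WOMBAT "+STR$~(HIMEM)
```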

Conditional Assembly and Macros

Introduction

Most machine code assemblers provide conditional assembly and macro facilities. The assembler does not directly offer these facilities, but it is possible to implement them by using other features of BBCBASIC(86).

Conditional Assembly

You may wish to write a program which makes use of special facilities and which will be run on different types of computer. The majority of the assembly code will be the same, but some of it will be different. In the example below, different output routines are assembled depending on the value of 'flag'.

DIM code 200
FOR pass=0 TO 3 STEP 3
  [OPT pass
  .start     - - -
             - - - code - - -
             - - - :]
  :
  IF flag  [OPT  pass: - code for routine 1 -:]
  IF NOT flag [OPT pass: - code for routine 2 - :]
  :
  [OPT pass
  .more_code - - -
             - - - code - - -
             - - -:]
NEXT

Macros

Within any machine code program it is often necessary to repeat a section of code a number of times and this can become quite tedious. You can avoid this repetition by defining a macro which you use every time you want to include the code. The example below uses a macro to pass a character to the screen or the auxiliary output. Conditional assembly is used within the macro to select either the screen or the auxiliary output depending on the value of op_flag.

It is possible to suppress the listing of the code in a macro by forcing bit 0 of OPT to zero for the duration of the macro code. This can most easily be done by ANDing the value passed to OPT with 6. This is illustrated in PROC_screen and PROC_aux in the example below.

DIM code 200
op_flag=TRUE
FOR pass=0 TO 3 STEP 3
  [OPT pass
  .start   - - -
           - - - code - - -
           - - -
: 
  OPT FN_select(op_flag); Include code depending on op_flag
:
           - - -
           - - - code - - -
           - - -:]
NEXT
END
:
:
REM Include code depending on value of op_flag
:
DEF FN_select(op_flag)
IF op_flag PROC_screen ELSE PROC_aux
=pass
REM Return original value of OPT.  This is a
REM bit artificial, but necessary to insert
REM some BBCBASIC(86) code in the assembly code.
:
DEF PROC_screen
[OPT pass AND 6
MOV DL,AL
MOV AH,02
INT &21:]
ENDPROC
:
DEF PROC_aux
[OPT pass AND 6
MOV DL,AL
MOV AH,04
INT &21:]
ENDPROC

The use of a function call to incorporate the code provides a neat way of incorporating the macro within the program and allows parameters to be passed to it. The function should return the original value of OPT.
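As a sketch of parameter passing, the hypothetical macro FN_outch below assembles code to send a given character to the screen using the same DOS function (AH=2, INT &21) as PROC_screen. The names 'code', 'pass', 'char' and FN_outch are illustrative, and the code is assembled but not called.

```bbcbasic
DIM code 50
FOR pass=0 TO 2 STEP 2
  P%=code
  [OPT pass
  OPT FN_outch(ASC"O") ; each use assembles one copy of the macro
  OPT FN_outch(ASC"K")
  ]
NEXT
END
:
DEF FN_outch(char)
[OPT pass AND 6
MOV DL,char       ; the parameter becomes an immediate operand
MOV AH,2
INT &21
]
=pass
```

Each call to the function assembles a fresh copy of the three instructions with a different immediate value, and returning 'pass' restores the original OPT setting.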
