6809 Emulator

Assembly Language FAQs

Some simple questions about how to translate high-level language concepts into the idiom of assembly language.

Why are there so many instructions?

There are only really a few dozen instructions in 6809 assembly language. At first it can be confusing because registers are tagged onto the end of operations without any punctuation. So LDA, STB, LDD, CMPX, LDY, STU and CMPS can seem like a mysterious random jumble rather than just three basic operations on different registers.

When faced with an unfamiliar instruction, try to remove a register from the end to see what is the real operation:

orb #20     ; Or B #20
addd #400   ; Add D #400
sty $1000   ; St Y $1000 
inca        ; Inc A
comb        ; Com B
pshu        ; Psh U
sex         ; *not* Se X, but Sign EXtend
daa         ; *not* Da A, but Decimal Addition Adjust

Which registers do I use?

For an address variable pick an index register, X, Y or even (user stack) U, it doesn't matter which. For a character or small numeric variable pick an accumulator, A or B, again it doesn't matter which. If you need 16-bit values use D, which is the two accumulators combined (A high, B low).

The choice between an index register and an accumulator severely limits the operations you can perform. Only LD, ST, and CMP are common to all user registers. General arithmetic and logic operations (NEG, AND etc.) are limited to the 8-bit accumulators, except for ADD and SUB which are also available for 16-bit D. Index registers uniquely have the LEA operation, and the stacks (U and S) have PSH and PUL.

Don't use the system registers (PC, CC, DP, S) directly unless you know exactly what you're doing.

In practice there are minor differences between registers. One accumulator may have a specialised operation not available to the other, and the LEA instruction works a little differently for stacks so you can't use LEAU -1,U as a loop counter. Also X is the preferred index register as its instructions are sometimes coded more efficiently.

How do I declare, assign, and modify variables?

Reserve space for a global variable with the RMB (reserve memory bytes) directive. This isn't a machine language instruction, rather it simply tells the assembler how to arrange space in program memory. Remember that only byte (8-bit) and word (16-bit) values can be handled by the CPU in a single operation. More complex variables are referenced by their (16-bit) addresses.

Variables are fetched into a register with the LD (load) operation, the registers worked upon as needed, and updated in memory with the ST (store) operation. A useful operation for initialising string and other structured variables is LEAX (load effective address).

nxtChr   rmb 1       ; VAR nxtChr:byte
nxtLine  rmb 2       ; VAR nxtLine:word 
strAdr   rmb 2       ; VAR strAdr:^string
msgLen   fcb 5       ; VAR msgLen:byte=5 
msgTxt   fcb "Hello" ; CONST msgTxt="Hello" 
maxLine  fdb 1000    ; VAR maxLine:word=1000
space    equ 32      ; CONST space=32 (ASCII space)

lda #space        
sta nxtChr         ; nxtChr:=32
ldd #1  
std nxtLine        ; nxtLine:=1
leax msgTxt,pcr    ; address of string via PC relative mode
stx strAdr         ; strAdr:=@msgTxt
...
ldx strAdr         ; retrieve pointer  
lda ,x+            ; read character and increment pointer 
ora #$20           ; set bit 5 (upper case letter becomes lower case)
sta nxtChr         ; store character
stx strAdr         ; store pointer

A variable can be initialised at assembly time with the FCB (form constant byte) and FDB (form double byte) directives. This isn't a good idea except for constant data as the initialisation happens only when the program is first loaded, and not when it is run a second time.

What is a bigendian?

The 6809 is described by the Swiftian term “big-endian”; the Intel 8080, like many early micros, is “little-endian”. This refers to the way 16-bit values are stored in memory - big byte first, or little byte first.

Little-endian may seem the less sensible concept (which gets sillier as the design decision persists to 64-bithood), but it's simpler to design an 8-bit microprocessor as little-endian. For example, to add a 16-bit constant 8 bits at a time, storing the low byte first makes it easier to deal with the carry. If the 6809 were little-endian and had no ADDD instruction we might do this:

  addb ,x+      ; add low byte
  adca ,x+      ; add high byte, using the carry

The assembler will take care of the byte ordering for us when we specify 16-bit constants, but it's something to bear in mind when converting programs between micros.
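
The two-instruction addition above assumes the low byte is stored first. On the real 6809 the high byte comes first, so a byte-at-a-time addition has to fetch the low byte from the higher address. A minimal sketch, assuming D holds one operand and X points at the other (the label total is made up for illustration):

total  fdb $1234    ; stored as $12 then $34, same as: fcb $12,$34
  ldx #total
  ldd #$00FF
  addb 1,x      ; add low byte ($34), which is stored second
  adca ,x       ; add high byte ($12), which is stored first, using the carry
                ; D now holds $1333, the same result ADDD total would give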

What does the LEA instruction do?

The LEA (Load Effective Address) instruction is used to initialise and modify index registers, ie. address variables. Only indexed addressing modes are available.

The jargon term 'effective address', in the context of an LDA instruction for example, is simply the address from which the register A is loaded:

  ldx #$0400
  ldb #$10
  lda $20,x   ; Effective Address is X+$20, ie. $0420
  lda b,x     ; Effective Address is X+B, ie. $0410

Therefore a LEA instruction is simply performing addition:

  leax $20,x  ; X:=X+32
  leax b,x    ; X:=X+B

Registers can be mixed as needed:

  leay ,x     ; Y:=X 
  leau 4,s    ; U:=S+4

Any indexed mode can be used, so index registers can be initialised using PC relative mode:

helloStr fcb "Hello World",0
  leax helloStr,pcr      ; point X to string
  lda ,x+                ; A contains 72 (ASCII 'H')

Don't use the autoincrement modes, stick to constant offsets. Only the Zero flag of the condition codes is modified, and only for assignments to X and Y. This is so the stacks can be adjusted at the end of a subroutine without affecting the zero flag which might be checked by the caller.

  leax ,x++   ; does nothing
  leay ,-y    ; not recommended
  leay -1,y   ; approved loop counter
  bne loop

How do I call a subroutine?

Use the JSR (Jump to SubRoutine) or BSR (Branch to SubRoutine) instructions. How to pass parameters depends entirely on how the subroutine was written. Typically a parameter is loaded into an accumulator, or its address is loaded into an index register.

  lda xPos
  ldb yPos
  leax bmapShip,pcr
  jsr drawBmap
moreProgram:
  adda #20

A subroutine continues until it reaches a RTS (ReTurn from Subroutine) instruction, at which point execution continues with the instruction following the JSR. Results are returned the same way as parameters are passed. Setting the condition codes is also an option, such as setting the Carry flag in case of an error.

drawBmap:
  pshs a,b,y        ; save registers as needed
  ldy #screenBase
  lsra
  leay a,y
  ...
failure:
  orcc #1
finish:        
  puls a,b,y        ; we didn't save X as we don't expect the caller to reuse it 
  rts

Subroutines work by pushing the PC (Program Counter), which always contains the address of the next instruction, onto the system stack (S register), then loading the PC with the destination address. The next RTS instruction should simply PULS the PC's value from the stack. Note that this won't work if the stack has been modified and not correctly restored.

It's often a good idea to save and restore the values of registers that are modified during the subroutine. If the last instruction before the RTS is a PULS we can optimise by combining the two:

  pshs x,y       ; save registers
  leax 2,x
  leay -4,y
  ...
  puls x,y,pc    ; restore registers and return

What is a stack and how do I (safely) use it?

A stack is a crucial concept of machine language programming. Basically it's a way of storing information and retrieving it later without too much worry about the fine details. You can PuSH registers onto the stack and PulL (sometimes known as POP) them off later. As long as pushing and pulling is done in matching order (like matching the brackets in an arithmetical expression) all we need to know is the name of the register (S or U in the case of the 6809).

People like to use physical analogies. For example, you're at your desk scribbling your memoirs when you need to check a detail in those photos you took on holiday in Ibiza. So you put your papers aside (onto the “stack”) to make room. Then the phone rings and you need to get out your appointment book, so the photos go onto the “stack”. And then there's someone at the door… this could go on indefinitely, though there is ultimately a physical limit to the size of the stack.

The point is that you can easily imagine retrieving your papers from the “stack” and smoothly carrying on with the last job after every interruption. It only needs a little order in your working methods. And if you have a whole program written down beside you, and the current line number is just another detail that can be jotted down and pushed on the stack, nothing could possibly go wrong.

The system stack in a 6809 system is pointed to by the register S. Typically it will be initialised by the operating system at the highest address in RAM (the user program doesn't usually set it). Pushing a byte means first decrementing S by 1, then storing the byte into address S. Doing this twice pushes a 16-bit word. To pull a byte, read its value from address S then increment S by 1. Note that this is identical to using the 6809's autoincrement/decrement modes.
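
As a sketch of that equivalence, a byte can be pushed and pulled by hand with the autodecrement and autoincrement modes. The only difference is in the condition codes, which ST and LD update and PSHS/PULS leave alone:

  sta ,-s       ; decrement S, then store A: same effect as PSHS A
  ...
  lda ,s+       ; load A, then increment S: same effect as PULS A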

There are two stacks on the 6809, U (User) and S (System); normally we don't bother much with U. Any combination of registers can be pushed and pulled in a single instruction, with one exception: we can't stack a stack register onto itself, which would have little point anyway.

  pshs a,b,x,y,u
  ldd #loopCount
loop:  
  pshs d
  lda ,x+
  cmpa ,y+
  ...
  puls d
  subd #1
  bne loop  
  puls a,b,x,y,u,pc

If we PSHS A, modify A, then PULS A we get the same value back, and we don't have to care what was done to the stack before the push. Equally, our “modify A” part might have included PSHS X,Y followed by PULS X,Y, and we would not notice. But we must do our stacking in matching pairs with no overlap; PSHS A then PSHS X,Y then PULS A will not restore A.

Calling subroutines relies on the system stack. A JSR pushes the Program Counter (PC) onto the stack, and a RTS pulls it. Thus the program can continue where it left off when the subroutine is over. Instead of using RTS, a subroutine might add the PC to a list of pulls at the end of a subroutine.

Stacking order is always the same, regardless of how the assembly language instruction is written. For example PSHS Y,B,A,X will be retrieved correctly by PULS D,Y,X. When all the registers are stacked on S it will look like this:

      CC   A    B    DP   X         Y         U         PC
S+    0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12

The S register can be used as an index register just like any other to access these values individually. But don't use negative offsets as you can't trust anything below the stack; the system may overwrite these bytes at any time, literally between instructions.

You can use the LEA instruction to reserve space on the stack; LEAS -12,S is like a push, LEAS 12,S a pull. As before, you must do this in matching pairs.

  pshs d,x,y,u
  leas -4,s         ; reserve 4 bytes to do what we like with
  clr 2,s
  ...
  lda #4
  pshs a
  lda 3,s           ; the same address we cleared above
  ...
  puls a            ; A equals 4
  leas 4,s
  puls d,x,y,u,pc

The dangers of using the stack:

There's nothing to stop it expanding to overwrite programs and data, but this shouldn't happen in practice.
Using the S register for our own purposes when a system interrupt may occur, overwriting our data seemingly at random.
Generally not doing pushing and pulling in matching pairs and mixing up the data, with particular reference to return addresses.

How do I do something a hundred times over?

In a high-level language, typically you would set a variable to zero, then count up to a hundred:

n:=0;
REPEAT
  writeln;
  n:=n+1
UNTIL n=100

Assembly language prefers an idiom where you start at a hundred and count down to zero:

  ldb #100
repeatLoop:
  lbsr writeln
  decb
  bne repeatLoop

Assuming the value of the loop counter isn't used anywhere, it's a little simpler this way.

How do I write an IF / THEN statement?

The easy way is to reverse the logic. So if you had:

IF xPos>100 THEN xPos:=100

Imagine instead it was:

IF xPos<=100 SKIP xPos:=100

In assembly language that could be:

  lda xPos
  cmpa #100      ; IF
  bls noLimit    ; SKIP 
  lda #100       ; optional
  sta xPos       ; statements 
noLimit:
  lda yPos       ; continue with program

How do I write a REPEAT / UNTIL loop?

Simply write the body, then a comparison, and a conditional branch back to the start of the body.

REPEAT
  plotPixel (left, top)
  left:=left+1
UNTIL left>right

Becomes:

  lda left
  ldb top 
plotLoop:          ; REPEAT
  jsr plotPixel
  inca
  cmpa right        
  bls plotLoop     ; UNTIL
  ldb bottom       ; continue

How do I write a WHILE loop?

This is very similar to REPEAT, except the comparison has to be made at the start of the loop. This is done simply by preceding the loop with a branch to the conditional test at the end:

WHILE left<=right DO
  plotPixel (left, top)
  left:=left+1

Becomes:

  lda left    
  ldb top
  bra plotTest
plotLoop:          ; DO    
  jsr plotPixel
  inca
plotTest:    
  cmpa right
  bls plotLoop     ; WHILE
  ldb bottom       ; continue

How do I access an array?

Generally you won't be using arrays in quite the same way as with high-level languages. But if you need to, you must do all the work of reserving memory and calculating offset addresses yourself. For an array of bytes this is simple; point to its start address with an index register and use an offset:

numList rmb 100
  leax numList,pcr    ; point to array
  ldb #99             ; last element
  lda b,x             ; A:=numList[99]

For an array of addresses (two bytes each) the offset has to be doubled by bit-shifting. Note that the range of offsets is limited to -128 to 127 with an 8-bit offset so D may have to be used.

adrList rmb 400
  leax adrList,pcr
  ldb #199
  clra
  lslb
  rola
  ldy d,x             ; Y=adrList[199]

Two-dimensional arrays can be accessed using the MUL (unsigned MULtiply) instruction. A 32×16 character text screen could be treated as an array of 16 lines of 32 bytes:

cursorFlash:
  ; A=column, B=line
  pshs a,b,x
  cmpa #32
  bhs out
  cmpb #16 
  bhs out
  ldx #textScreen
  leax a,x        ; X:=X+A
  lda #32
  mul             ; D:=B*32
  leax d,x        ; X:=X+D  
  lda ,x          ; A:=textScreen[A,B]
  eora #$40
  sta ,x
out:    
  puls a,b,x,pc

How do I use strings?

Like any other complex data structure you have to do all the work yourself. There is however a convention of the “null-terminated string” which is widely used and may take little work. It's simply a list of bytes terminated by a zero. Thus a string can be easily defined and as long as you like, the only limit being it can't include any zeroes.

strDelay   fcb "Please wait.",0                 ; easy static string
strWelcome fcb "Hello %USER% and welcome.",0    ; substitution expected, so a problem
  leax strDelay,pcr
  bsr strLen
  jmp romPrint16              ; print length of strDelay
strLen:
; return length of string at X in D
  pshs x
  ldd #0
  bra strL2
strL1:  
  addd #1                      ; count a valid character
strL2:  
  tst ,x+                      ; null terminator? 
  bne strL1
  puls x,pc

This works well for sending messages to the screen, or making simple modifications such as changing to upper case. But anything more, such as simply catenating two strings, requires the programmer to ensure there is enough space available. Getting this wrong may crash the program.

An alternative structure might be to precede the string by two bytes specifying its length and bytes available:

strWelcome    fcb strWelcomeEnd-strWelcome-2, 100  ; length followed by max length
              fcb "Hello %USER% and welcome."      ; string characters (25 bytes)     
strWelcomeEnd rmb 100-strWelcomeEnd+strWelcome+2   ; spare space (75 bytes)
strSpaceEnd:

These strings are awkward to define in assembly language. The programmer must then write a library of reliable routines to manipulate this structure.

How do I define data records and arbitrary structures?

As with arrays and strings, you point an index register where you hope the structure will be, and then do all the housekeeping yourself. Assembly language can help a little by allowing us to define constants to label offsets within the structure. Clearly this allows for variant records (fields having dual meanings).

; sprite structure (8 bytes total)
spr_type EQU 0    ; 8-bit type
spr_xPos EQU 1    ; 16-bit X co-ord
spr_yPos EQU 3    ; 16-bit Y co-ord
spr_mask EQU 5    ; 8-bit collision mask 
spr_bmap EQU 6    ; 16-bit bitmap address 
spr_size EQU 8
spr_maxN EQU 20   ; room for 20 sprites
spr_table rmb spr_maxN*spr_size

sprEraseAll:
; erase all active sprites
  leau spr_table,pcr
  ldb #spr_maxN
sprEraseLoop:  
  lda ,u
  beq sprNext         ; zero type means not active
  ldx spr_xPos,u      ; X co-ord
  ldy spr_yPos,u      ; Y co-ord
  lbsr sprErase  
sprNext:
  leau spr_size,u     ; point to next sprite  
  decb                ; loop count
  bne sprEraseLoop
  rts

How do I use local variables?

The simplest way to use a local variable is to push a register onto the stack (remembering to remove it later).

shift: 
; shift register A right B times
; return result in D
  pshs b             ; B is now a local variable with address S
  clrb               ; destroy B to hold result
  tst ,s             ; was B zero?
  beq shiftOut
shiftLoop:  
  lsra               ; 16-bit shift right
  rorb
  dec ,s             ; decrease local variable
  bne shiftLoop      ; loop (original) B times 
shiftOut:  
  leas 1,s           ; remove local variable
  rts

The problem with this is that when we push more temporary values on the stack, the offset to our local variable changes. In the example above, if we chose to PSHS X then our offset now becomes 2,S . We can deal with this, but it's awkward.

A solution is to reserve space on the stack, then point to it with another register, and use this one as the offset base. The principle is the same as with the structures described above. It can be extended to allow for the passing of arbitrary parameters on the stack. For example:

PROCEDURE drawBox (left, top, right, bottom:integer);
VAR xPos, yPos:integer;
BEGIN
  xPos:=left;
  yPos:=top;
  REPEAT
     ...
  UNTIL yPos>=bottom
END

Becomes:

xPos   EQU -4
yPos   EQU -2
left   EQU 8
top    EQU 10
right  EQU 4
bottom EQU 6          
; 2 would be the return address
; 0 would be the saved value of U

; draw a 10x6 box at (0,2)
  ldx #0            ; left:=0  
  ldy #2            ; top:=2
  pshs x,y
  ldx #10           ; right:=10 
  ldy #8            ; bottom:=8 
  pshs x,y         
  lbsr drawBox
  leas 8,s          ; remove 4 16-bit parameters 
  rts
drawBox:  
  pshs u            ; save value of U
  leau ,s           ; U is an address between the locals and the parameters
  leas -4,s         ; reserve space for 2 16-bit integers on stack
  ldd left,u        ; U+8
  std xPos,u        ; U-4 
  ldd top,u         
  std yPos,u
drawBoxLoopY:  
  ...
  ldd yPos,u
  cmpd bottom,u
  blt drawBoxLoopY
  leas 4,s         ; release reserved space
  puls u,pc        ; restore U and return

Can I do recursion?

Yes, there's no reason why an assembly language subroutine shouldn't call itself. But the same caveats apply as in any other language. You must use local variables for instance, and not alter global memory in unexpected ways.

Beware of very deep recursion; a pixel flood-fill routine might call itself thousands of times, overwriting more than the available RAM with the system stack.
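
As a sketch of a recursive routine that keeps its state on the stack (the name sumTo is made up for illustration), this adds up the numbers from 1 to B. Each level of the recursion uses three bytes of stack: one for the saved B and two for the return address.

sumTo:
; return X = 1+2+...+B (B unsigned); B is preserved
  pshs b           ; our own copy of n, safe from the inner calls
  ldx #0           ; result for the base case
  tstb
  beq sumDone      ; n=0: nothing to add
  decb
  bsr sumTo        ; X := 1+2+...+(n-1)
  ldb ,s           ; recover our own n
  abx              ; X := X+n (B is added unsigned)
sumDone:
  puls b,pc        ; restore B and return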

What is position independence and when do I use it?

Ideally a machine language program can be loaded at any address in memory and still work correctly; the only thing that has to change is the program start address. This is Position Independent Code (PIC). It is more important for smaller routines as several may be loaded at the same time.

What prevents position independence? Addresses being hard-coded rather than relative. Direct jumps to subroutines are a common example. All jumps and branches (within the program, not eg. external ROM routines) can be as easily coded using relative offsets; instead of JMP and JSR use BRA/LBRA and BSR/LBSR.

The same applies to data references. Variables stored on the stack are fine. As are pointers returned by the operating system. Other structures need to be referenced using PC relative addressing.

The PC (Program Counter) always contains the address of the next instruction. The same mechanism used for branching can be used for data addressing. Giving a relative offset from the PC means the actual address at which the program is loaded is irrelevant. We rarely give the actual offset, just as we don't give the number of bytes to skip in a relative branch. Instead we specify the target address, and indicate this by following it with “,PCR”.

frameCount rmb 1
frameList rmb 20
  lda frameCount         ; extended (absolute) addressing
  lda frameCount,pcr     ; same address calculated via PC relative
                         ; the assembler calculates the offset
                         ; NOT the same as PC+frameCount
  inca
  ldb -27,pc             ; PC-27 - this form is rarely useful
  ...
  ldx #frameList         ; initialise index using immediate mode
  leax frameList,pcr     ; X has the same value as above
  ldy ,x++               ; and the routine can proceed just as before

Not all addresses should be accessed via PCR; only the ones that move with the program. This means NOT I/O ports, screen buffers, ROM routines etc.

There are downsides to PCR, mainly that it is less efficient, taking more bytes and machine cycles to do the same task. Sometimes it can be more complex to program:

bufSize EQU 100
buffer rmb bufSize     
bufEnd:

bufClear:
  leax buffer,pcr
bufCloop:  
  clr ,x+
  cmpx #???            ; er, how to compare X with end of buffer space?
  blo bufCloop
  
bufClear:  
  leax bufEnd,pcr      ;
  pshs x               ; stack address of end of buffer space
  leax buffer,pcr
bufCloop:  
  clr ,x+
  cmpx ,s              ; compare X with end of buffer space
  blo bufCloop
  leas 2,s             ; tidy stack

An alternative to PCR addressing is direct paged addressing; copying PC to DP takes just a few steps. This is both time and space efficient, but limits the program to start on 256 byte pages. It's a good solution for large programs such as arcade games.

org $4000
start:
  lbsr main         ; stack PC
counter rmb 1
variables rmb 100  
main: 
  puls d            ; retrieve old PC (points to 'counter')
  tfr a,dp          ; high byte of PC goes to DP
  setdp $40         ; directive to help the assembler
  clr counter       ; most assemblers will now use DP automatically

In general, you should always use position independent code for programs intended to have any sort of longevity. The exceptions are special cases such as embedded systems, ROM cartridges, and plain quick-and-dirty mash-ups.

What are interrupts?

An interrupt is a physical signal that can occur at any time, triggering the CPU into stacking its registers, executing a subroutine, then retrieving its registers to continue as normal. This is a way of providing a software response to a hardware event. It may be as simple as updating a counter in memory in sync with an external timer.

The details of interrupts are intricate and a relatively advanced topic, but most home micros have a simple use for them: vertical blanking. On an old-fashioned CRT, every time the display of a graphics frame is complete the circuitry issues an interrupt. Syncing with this allows for smooth animation in games.

gameLoop:
  sync                  ; wait for interrupt
  lbsr eraseSprite
  lbsr moveSprite
  lbsr drawSprite       ; most of the time is spent with the sprite drawn, not erased
  bra gameLoop

The 6809 has four hardware interrupts (IRQ, FIRQ, NMI, and RESET), and three software interrupts (SWI, SWI2, SWI3), each with its own vector pointing to a service routine. Software interrupts are machine language instructions rather than hardware events, and typically are used for debugging breakpoints or calling operating system routines.

An interrupt service routine must end with the instruction RTI (Return from Interrupt) so that the registers are unstacked correctly. This first pulls the CC and checks the E (Entire) bit. If set it then pulls all the registers from the stack, if clear it just pulls the PC. The fast interrupt (FIRQ) doesn't stack the working registers, and the E bit is the way of flagging the difference.

Two other CC bits are involved with interrupts. Setting I disables (or masks) the IRQ response, and setting F disables the FIRQ response.
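
A minimal sketch of a service routine and of masking, assuming the system has already pointed the IRQ vector at irqService. The names are hypothetical, and any hardware-specific step needed to acknowledge the interrupt source is left out:

frameTick rmb 1

irqService:
  inc frameTick     ; count one more frame
  rti               ; IRQ stacked everything with E set, so all registers are pulled

  orcc #$50         ; set F (bit 6) and I (bit 4): ignore FIRQ and IRQ
  ...               ; work that must not be interrupted
  andcc #$AF        ; clear F and I: respond to them again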

What is reentrant code and how do I write it?

A reentrant machine language program can be very roughly treated; you can stop it (saving its registers), restart it from the beginning, stop it again, and still have it continue correctly where it left off (restoring its registers of course). This is exactly what is needed for a multi-tasking operating system: the same code used by multiple processes.

The concept is related to interrupts and recursion. Whereas a recursive routine secures its variables so it's safe to call itself from just one point, a reentrant routine must try harder. Basically, it must allocate itself a new set of variables whenever it runs.

There's no reason why a short routine shouldn't be reentrant. Simply allocate variables on the stack and never use globals. And avoid eccentricities like self-modifying code. For a larger program things become complex. Fresh memory might be allocated from a heap by the operating system for example. It's an advanced topic, but still doable on the 6809.
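
As a small sketch of the difference (both routines are invented for illustration, and EXG A,B would do the same job in one instruction), here are two ways to swap A and B. The first keeps its temporary in a global, so a second call arriving between the STA and the LDB, from an interrupt say, would corrupt the result. The second keeps it on the stack and is reentrant:

temp rmb 1

swapBad:
  sta temp          ; every caller shares this one byte
  tfr b,a
  ldb temp
  rts

swapGood:
  pshs a            ; each call gets its own byte on the stack
  tfr b,a
  puls b,pc         ; B := old A, then return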

What is indirection and how do I use it?

The standard way of accessing a variable in memory is via extended addressing. A 16-bit address is given, and an 8- or 16-bit operand is read from this 'effective address' (EA). Indirect addressing adds an extra step: the 16 bits at the initial EA are read and used as the true EA of the operand. Put square brackets around the memory expression to use indirection (some assemblers allow the more usual round brackets).

CURSPOS rmb 2

  ldx #$0400    ; initialise cursor position
  ldd #$4849    ; two character message
  std ,x        ; print it 
  stx CURSPOS   ; save cursor position

  ldb [CURSPOS] ; what character is at the cursor position?
  rts

As well as extended mode, indirection can be used on PC relative and indexed addressing modes too. But note that because we are always fetching a 16-bit address, auto-increment/decrement by just one (rather than two) cannot be used.

Another example: suppose we are using strings with a data structure of a length byte followed by the ASCII codes of <length> characters. We can now iterate through an array of the addresses of such strings to access the length value of each one in turn, perhaps to find the longest for example.

strCount EQU 3    ; 3 strings
strTable fdb str1, str2, str3

start:   
  leax strTable,pcr   ; point to table
  ldb #strCount
  bsr longest         ; find length of longest string 
  rts
  
longest:
  pshs b,x
  clr ,-s             ; reserve and zero value of longest string so far
loop:
  lda [,x++]          ; read EA from X, then inc X by 2, then load A from EA
  cmpa ,s             ; is A higher than longest?
  bls short           ; skip if not
  sta ,s              ; up-date longest
short:        
  decb                ; for B strings
  bne loop
  puls a,b,x,pc       ; return longest value in A   

str1 fcb 11,"Hello World"
str2 fcb 13,"Hows it going"
str3 fcb 12,"Wotcha folks"

How useful is this? Not very. We can save the use of an index register, but it's still slow, and very limited in terms of what we can do with a single address. Indirection seems powerful but has fallen out of fashion; it's a hangover from the early days of micros. The 6502 for example had no 16-bit index registers, instead relying on 16-bit addresses stored in zero page (8-bit addressable) memory.

Where indirection is truly useful is with jump tables. For example we might have a library of subroutines for system basics such as input/output and memory allocation. We can compile a list of addresses of these routines into a table. Our program can now access library routines only through the table. This makes things more flexible as the routines in the library can be changed as long as the table is updated accordingly; our program need not change at all.

sprLibraryTable fdb sprInit, sprErase, sprDraw, sprMove, sprKill

; call third routine (+0, +2, +4 bytes from start) in table
  jsr [sprLibraryTable+4]     ; draw a sprite

  ldb #2                      ; use an index (0, 1, 2) into the table
  leay sprLibraryTable,pcr  
  aslb                        ; double B to 4 
  jsr [b,y]                   ; draw a sprite
  
sprInit rts
sprErase rts
sprDraw rts
sprMove rts
sprKill rts

This can be turned into a powerful mechanism.

How do I use virtual methods to create polymorphic objects?

Importing high-level concepts into assembly language is something one can easily overdo. But there's no reason why we can't cherry pick some nice features.

Suppose we have a sprite library that deals with simple movement and animation, and want to expand it to include features such as collision detection, keeping in bounds, and more. To our sprite object we can add a field sprMethods to point to a table of subroutines. Thus every sprite instance can have its own dynamic type specification.

We can create a hierarchical structure where one class inherits its methods from its parent. So for a game we might start with:

Basic: A movable image with just a single frame that interacts with nothing.
--Animated: As Basic, but with a framelist displaying animated images.
----Morpher: As Animated, but morphing when a counter runs down, to turn into a bonus drop for example.

; Structure offsets

sprStatus EQU 0
sprMask EQU 1
sprXPos EQU 2
sprYPos EQU 4
sprGraphic EQU 6
sprMethods EQU 8
sprCounter EQU 10  
sprMorphTo EQU 12

; Subroutine library

sprInitBasic rts
sprEraseBasic rts
sprDrawBasic rts
sprMoveBasic rts
sprDoneBasic rts
sprDrawAnimated rts
sprDoneMorpher rts

; Class definitions:  Basic - - > Animated - - > Morpher
 
sprMethodsBasic     fdb sprInitBasic, sprEraseBasic, sprDrawBasic, sprMoveBasic, sprDoneBasic
sprMethodsAnimated  fdb sprInitBasic, sprEraseBasic, sprDrawAnimated, sprMoveBasic, sprDoneBasic
sprMethodsMorpher   fdb sprInitBasic, sprEraseBasic, sprDrawAnimated, sprMoveBasic, sprDoneMorpher

; Instance templates

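; field order follows the structure offsets:
; status+mask, xPos, yPos, graphic, methods, counter, morphTo
cloud fdb 0,0,0,bmCloud,sprMethodsBasic,0
explosion fdb 0,0,0,flExplosion,sprMethodsAnimated,60
bonus fdb 0,0,0,bmBonus,sprMethodsBasic,0
bonusDrop fdb 0,0,0,flExplosion,sprMethodsMorpher,30,bonus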

Note that all the organisational work is down to the programmer; the assembler won't give any help keeping the structures in order.
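
Calling a method is then a matter of fetching the table address from the instance and jumping indirectly through the right entry. A minimal sketch, assuming U points to a sprite instance (the mth... offset names are made up for this example):

mthInit  EQU 0
mthErase EQU 2
mthDraw  EQU 4
mthMove  EQU 6
mthDone  EQU 8

  ldx sprMethods,u    ; X := address of this sprite's method table
  jsr [mthDraw,x]     ; call whichever draw routine the table names

The caller doesn't need to know whether the sprite is Basic, Animated or Morpher; the table decides which routine runs.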

How do negative numbers work?

Think of an old-fashioned clock face as an analogy with an 8-bit register. An hour has 60 minutes, marked as 0 to 59. The minute hand goes only one way, and when it comes round to zero the hour hand advances.

An 8-bit register has 256 values, marked as 0 to 255, and when it overflows round to zero the carry flag is set.

How do negative numbers work on a clock? It's entirely down to how we choose to read them. We can say “to” the hour or “past” the hour. The mechanism of the clock doesn't have to change at all. Ten to eight in the morning is 07:50. So “minus ten” in clockface terms is calculated by “60-10”. To turn the clock back another 10 minutes, we can instead turn it forward 50 to get the correct answer of 20 to the hour.

In binary the principle works just the same, only (for an 8-bit value) we subtract from 256 instead of 60, ie. “minus ten” equals “256-10”, being 246, or in hex $f6. Then if we see 246 and want to see the “minus” variant, we do the same, subtract it from 256.

Which signed numbers are positive and which negative? Highest bit set means negative, so unsigned values 128 to 255 represent -128 to -1. Positive signed values range from 0 to 127. This is called “two's-complement” format, because to negate we can flip (or COMplement) all the bits and add 1 to the result.
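
A minimal worked example of that rule, using 10 as an arbitrary value:

  lda #10       ; A = %00001010 ($0A)
  coma          ; flip all the bits: A = %11110101 ($F5)
  adda #1       ; add 1: A = %11110110 ($F6), ie. 246 or 256-10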

This format is chosen because the underlying circuitry doesn't need to change; it's simply down to the programmer to keep track of which variables are seen as signed and unsigned. There are two flags in the condition codes to help with signed numbers: the N flag is set whenever bit 7 of a result is set, and V is set when a signed result overflows. An overflow is an unintended sign change; to use our clock analogy it's when the minute hand passes through the half-hour without going through the hour first.

Which instructions deal with signed numbers? There's SEX (Sign EXtend) which changes a signed value in B to a signed value in D by simply filling A with 8 copies of bit 7 of B. To negate an 8-bit value use NEG. All the addition and subtraction instructions work with either interpretation, but note that MUL is always unsigned. The conditional branches (also having long variants) are:

Signed branches    Meaning             Unsigned equivalent

     BPL             PLus                    
     BMI             MInus
     BLT             Less-Than                BLO
     BLE             Less-than or Equal       BLS
     BGT             Greater-Than             BHI
     BGE             Greater-than or Equal    BHS
     
Miscellaneous instructions
  
NEG (8-bit only)     NEGate                   COM
SEX                  Sign EXtend B to D
COMA; COMB; ADDD #1  negate D

Note that other than the simple BPL/BMI, the signed variants are not commonly used. If you do choose one, think twice whether you meant the unsigned variant. Signed comparisons should not be used with addresses for example.
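
The following sketch pulls these instructions together. The value -10 and the labels isHigher and isGreater are only for illustration:

  lda #10
  nega            ; A = $F6: 246 unsigned, -10 signed
  tfr a,b
  sex             ; D = $FFF6, still -10 as a 16-bit value
  coma
  comb
  addd #1         ; negate D: D = $000A, ie. +10

  lda #$F6
  cmpa #10
  bhi isHigher    ; taken, as 246 is higher than 10 (unsigned)
  ...
  lda #$F6
  cmpa #10
  bgt isGreater   ; not taken, as -10 is not greater than 10 (signed)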

What do all the instructions do?

[Will get round to this one]
