Welsh Centre for Printing and Coatings
Tuesday November 12, 2024
So far in this course a number of topics have been covered including
In this lecture we will study assembly language, a low-level language that provides a one-to-one mapping between mnemonic instructions and the machine code that is executed on the microcontroller.
This will allow you to see how complex high-level instructions and functions in C are written in assembly language, and what the final program that gets transferred to the microcontroller looks like.
We will present an introduction to assembly language, including program structure and syntax as well as operation classifications.
We will also revisit the Direct Port Manipulation in C example from Digital I/O Example Program and translate it into assembly language, looking at some of the key instructions involved.
Programs are stored on a microcontroller as a series of binary codes located within sequential memory addresses …
… this is known as machine code1.
The program in our microcontroller looks like this:
1000101010110001 1000001101111111 1000101010111001
1000010010110001 1000001101100000 1000010010111001
1000010110110001 1000110001111111 1000010110111001
Well nothing really … if you are a computer!
Otherwise … if you’re a human…3
… machine code is difficult to:
and most importantly
Instead, instructions can be written in a mnemonic form termed assembly language and then translated into machine code by an assembler.
Every CPU (or family of CPUs) has an instruction set where each operation that can be performed is represented by a certain binary combination.
The next step up in language levels is to represent each of these binary patterns with a short mnemonic.
Programs written using these mnemonics are known as assembly language programs4.
Assembly languages were first developed in the 1950s and were referred to as second-generation programming languages.
Assembly language is a low-level language that uses mnemonic codes (symbols) to represent machine code instructions, rather than using the instructions’ numeric (binary) values.
Essentially, assembly languages are a much more readable but directly translatable representation of machine code.
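To make the one-to-one mapping concrete, here is a minimal sketch in C of a lookup table that "assembles" a mnemonic into its 16-bit machine code and "disassembles" it again. It covers only five operand-free AVR instructions; the function and table names (`assemble`, `disassemble`, `OP_TABLE`) are made up for this illustration, and a real assembler also has to encode operands.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A few operand-free AVR instructions and their 16-bit opcodes. */
struct op_entry {
    const char *mnemonic;
    uint16_t    opcode;
};

static const struct op_entry OP_TABLE[] = {
    { "NOP",   0x0000 },
    { "RET",   0x9508 },
    { "SLEEP", 0x9588 },
    { "BREAK", 0x9598 },
    { "WDR",   0x95A8 },
};

#define OP_COUNT (sizeof OP_TABLE / sizeof OP_TABLE[0])

/* "Assemble": mnemonic -> opcode. Returns 1 on success, 0 if unknown. */
int assemble(const char *mnemonic, uint16_t *opcode_out)
{
    for (size_t i = 0; i < OP_COUNT; i++) {
        if (strcmp(OP_TABLE[i].mnemonic, mnemonic) == 0) {
            *opcode_out = OP_TABLE[i].opcode;
            return 1;
        }
    }
    return 0;
}

/* "Disassemble": opcode -> mnemonic, or NULL if it is not in the table. */
const char *disassemble(uint16_t opcode)
{
    for (size_t i = 0; i < OP_COUNT; i++) {
        if (OP_TABLE[i].opcode == opcode) {
            return OP_TABLE[i].mnemonic;
        }
    }
    return NULL;
}
```

Because each mnemonic corresponds to exactly one bit pattern, the same table serves both directions.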
Assembly language is commonly called just assembly, ASM, or symbolic machine code.
Despite the giant leap from machine code to assembly language, by the 1980s its use had largely been overtaken by higher-level languages, such as Fortran and C, and more recently Python, for many applications.
In short:
An assembly language program consists of a series of instructions to an assembler which will then produce the machine code program that is loaded to the microcontroller.
A program is written as a sequence of statements - one statement per line:
Each statement contains up to four fields each separated by one or more space or tab characters as shown below:
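As a rough illustration of the four fields, the C sketch below splits one statement into label, operator, operand and comment. It assumes that a label is terminated by `:` and that a comment starts with `;`; these markers, the `struct statement` type and the `parse_statement` function are assumptions made for this example, so check the rules of the assembler you are using.

```c
#include <ctype.h>
#include <string.h>

#define FIELD_MAX 64

struct statement {
    char label[FIELD_MAX];
    char op[FIELD_MAX];
    char operand[FIELD_MAX];
    char comment[FIELD_MAX];
};

/* Copy [start, end) into dst, trimming surrounding whitespace. */
static void copy_trimmed(char *dst, const char *start, const char *end)
{
    while (start < end && isspace((unsigned char)*start)) start++;
    while (end > start && isspace((unsigned char)end[-1])) end--;
    size_t n = (size_t)(end - start);
    if (n >= FIELD_MAX) n = FIELD_MAX - 1;
    memcpy(dst, start, n);
    dst[n] = '\0';
}

/* Split one line into its (up to) four fields; absent fields are "". */
void parse_statement(const char *line, struct statement *out)
{
    const char *end = line + strlen(line);
    memset(out, 0, sizeof *out);

    /* Comment: everything after the first ';' (assumed marker). */
    const char *semi = strchr(line, ';');
    if (semi) {
        copy_trimmed(out->comment, semi + 1, end);
        end = semi;
    }

    /* Label: text before a ':' in the remaining part (assumed marker). */
    const char *p = line;
    const char *colon = memchr(line, ':', (size_t)(end - line));
    if (colon) {
        copy_trimmed(out->label, line, colon);
        p = colon + 1;
    }

    /* Operator: the first whitespace-delimited word. */
    while (p < end && isspace((unsigned char)*p)) p++;
    const char *word_end = p;
    while (word_end < end && !isspace((unsigned char)*word_end)) word_end++;
    copy_trimmed(out->op, p, word_end);

    /* Operand: whatever is left before the comment. */
    copy_trimmed(out->operand, word_end, end);
}
```

For the line `loop:  RJMP  loop   ; jump back`, this gives the label `loop`, the operator `RJMP`, the operand `loop` and the comment `jump back`.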
The label field is used to create a reference point in the program that can be used to identify/locate a collection of instructions.
The operator field contains either an assembly directive or a mnemonic/instruction.
Assembly directives, sometimes termed pseudo-operations, are directives to the assembler that will not be translated to machine code but provide information that is critical to the program’s function or is required by the assembler.
A mnemonic is an instruction that will be directly translated into machine code and is used to manipulate data in some way.
The list of allowed mnemonics/instructions is called the instruction set and is specific to a particular microcontroller architecture.
However in general, the mnemonics can be classified into one of six groups:
IN, LD, LDI, LDS, MOV, OUT, ST, STS
ADD, ADC, ADIW, SUB, SUBI, SBC, INC, DEC, MUL, MULS, FMUL
AND, ANDI, EOR, OR, ORI
BREQ, BRGE, BRNE, BRLO, BRMI, BRPL, CALL, JMP, RET, RJMP
LSL, LSR, ROL, ROR, ASR, SBI, CBI, BSET, BCLR
BREAK, NOP, SLEEP, WDR
Some common directives include:
.CSEG / .DSEG / .ESEG
.ORG / .EXIT
.EQU / .SET / .DEF / .INCLUDE
.DB / .DW / .BYTE
Directives are specific to a particular microcontroller family (and are distinct from the instruction set). A list of supported directives for AVR-based microcontrollers can be found here.
NOP (no-operation) requires no operand.
As with the C language, the comment field is there to allow the programmer to include any comments which may make the program easier to understand at a later time or by another reader.
When the assembler is reading the line of text, the comment field is ignored.
Comments also follow a set of rules and a particular format dependent on the assembler being used6:
The assembler processes the assembly language file and generates an object file and listing file(s).
The linker combines multiple object files, as well as any library files, and generates an executable which can be loaded onto the microcontroller (this file is often a *.hex file).
int main(void)
We can use the IN and OUT operations for reading from and writing to ports respectively, and the ANDI and ORI operations for setting up bitmasks.
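The read, mask and write-back pattern can be sketched in C as follows, with each statement annotated with the assembly operation it roughly corresponds to. The function name `update_port`, the register choice `r16` and the mask parameters are assumptions for this illustration; the port is passed as a pointer so that the sketch can also be tried on a PC with an ordinary variable.

```c
#include <stdint.h>

/* Read a port, clear the bits not in keep_mask, set the bits in set_mask,
   and write the result back. Returns the value written. */
uint8_t update_port(volatile uint8_t *port, uint8_t keep_mask, uint8_t set_mask)
{
    uint8_t r16 = *port;   /* IN   r16, PORT      : read the port          */
    r16 &= keep_mask;      /* ANDI r16, keep_mask : force unwanted bits to 0 */
    r16 |= set_mask;       /* ORI  r16, set_mask  : force wanted bits to 1   */
    *port = r16;           /* OUT  PORT, r16      : write the result back    */
    return r16;
}
```

Starting from a port value of 0xF0, calling `update_port` with a keep mask of 0x3C and a set mask of 0x01 leaves 0x31 in the port.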
We include scans of the documentation for these operators in the following images.
Using the C language, we wrote:
to ensure bits 2 and 3 of port D are configured as inputs.
Similar lines were written to set up the output bits in Port B, the starting condition of these bits and then to enable the pull up resistors on Port D.
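The set-up described above might look something like the sketch below. It is not the code from the earlier lecture: the registers are simulated in a hypothetical `struct io_regs` so that the example runs on a PC, and the choice of bits 0 and 1 of Port B as outputs is an assumption made purely for illustration.

```c
#include <stdint.h>

/* Simulated registers (hypothetical stand-ins; on an AVR the real
   DDRB, PORTB, DDRD and PORTD registers would be used instead). */
struct io_regs {
    uint8_t ddrb, portb;
    uint8_t ddrd, portd;
};

#define INPUT_MASK_D   0x0C  /* bits 2 and 3 of port D             */
#define OUTPUT_MASK_B  0x03  /* assumed: bits 0 and 1 of port B    */

void configure_io(struct io_regs *io)
{
    io->ddrd  &= (uint8_t)~INPUT_MASK_D;   /* direction bits = 0 -> inputs   */
    io->ddrb  |= OUTPUT_MASK_B;            /* direction bits = 1 -> outputs  */
    io->portb &= (uint8_t)~OUTPUT_MASK_B;  /* outputs start low              */
    io->portd |= INPUT_MASK_D;             /* enable pull-ups on the inputs  */
}
```

Each line changes only the bits selected by its mask and leaves the other bits of the register untouched.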
Using the C language, we created an infinite loop as follows:
This essentially “traps” the program to ensure it continuously loops executing the program code within the code block.
In assembly language we can produce the same result by creating a “Label” and using the operation RJMP (relative jump):
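C itself has labels and a `goto` statement, so the label-plus-jump idea can be sketched without leaving C. In the example below the label `loop` and the `goto` play the roles of the assembly label and RJMP. The function name `run_loop` and the iteration limit are assumptions added only so that the sketch terminates when tried on a PC; on the microcontroller the jump would be unconditional.

```c
/* Runs the loop body until max_iterations passes have completed,
   then returns the number of passes made. */
unsigned run_loop(unsigned max_iterations)
{
    unsigned count = 0;

loop:                           /* the label: a named point to jump back to */
    if (count == max_iterations)
        return count;           /* escape hatch for PC testing only         */
    count++;                    /* stands in for the program code in the loop */
    goto loop;                  /* plays the role of RJMP loop              */
}
```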
The documentation for RJMP is shown here:
In assembly we can use the compare (CP) and branch if equal (BREQ) instructions to achieve this same implementation.
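The way CP and BREQ work together can be mimicked in C. CP subtracts one register from another without storing the result and updates the status flags; BREQ then branches if the zero flag is set. The sketch below models only the zero flag, and the function names `cp_sets_zero` and `compare_and_branch` are made up for this illustration.

```c
#include <stdint.h>

/* Mimics CP Rd, Rr: subtract without keeping the result and report
   whether the zero flag (Z) would be set. */
int cp_sets_zero(uint8_t rd, uint8_t rr)
{
    uint8_t diff = (uint8_t)(rd - rr);
    return diff == 0;
}

/* C-level picture of:
       CP   r16, r17
       BREQ equal
       ...            ; not-equal path
   equal:
       ...            ; equal path
   Returns 1 if the branch is taken, 0 otherwise. */
int compare_and_branch(uint8_t r16, uint8_t r17)
{
    if (cp_sets_zero(r16, r17))
        goto equal;     /* BREQ: branch taken when Z = 1 */
    return 0;           /* fall through: values differ   */
equal:
    return 1;           /* values are equal              */
}
```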
SBI, CBI, SBIC, SBIS and a handful of others.
These can only be used on certain registers, as identified in the documentation for the I/O memory map:
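The register restriction can be modelled with a small C sketch. On AVR devices, SBI, CBI, SBIC and SBIS can address only the lowest 32 I/O registers (addresses 0x00 to 0x1F), so the simulated versions below reject anything outside that range. The array `io_space` and the helper names are assumptions made for this illustration.

```c
#include <stdint.h>

#define IO_SPACE_SIZE   64   /* simulated I/O addresses 0x00..0x3F        */
#define BIT_ADDRESSABLE 32   /* SBI/CBI/SBIC/SBIS reach only 0x00..0x1F   */

static uint8_t io_space[IO_SPACE_SIZE];  /* simulated I/O registers */

/* Read a simulated I/O register (helper for inspecting the result). */
uint8_t io_read(uint8_t addr)
{
    return io_space[addr % IO_SPACE_SIZE];
}

/* SBI A, b : set bit b of I/O register A. Returns 0 if out of range. */
int sbi(uint8_t addr, uint8_t bit)
{
    if (addr >= BIT_ADDRESSABLE || bit > 7) return 0;
    io_space[addr] |= (uint8_t)(1u << bit);
    return 1;
}

/* CBI A, b : clear bit b of I/O register A. Returns 0 if out of range. */
int cbi(uint8_t addr, uint8_t bit)
{
    if (addr >= BIT_ADDRESSABLE || bit > 7) return 0;
    io_space[addr] &= (uint8_t)~(1u << bit);
    return 1;
}

/* SBIS A, b : "skip if bit set". Returns 1 if the next instruction
   would be skipped, 0 if not, and -1 if A or b is out of range. */
int sbis(uint8_t addr, uint8_t bit)
{
    if (addr >= BIT_ADDRESSABLE || bit > 7) return -1;
    return (io_space[addr] >> bit) & 1;
}
```

A register above address 0x1F has to be changed with the longer read, mask and write-back sequence instead.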
In this section:
This week on the Canvas course pages, you will find the sample program from today’s lecture. Look through it and ensure you are confident in how it works, how the masks are defined and how the registers are set.
There is also a short quiz to test your knowledge on these topics.
Programs are stored on a microcontroller as a series of binary codes located within sequential memory addresses. These instructions are executed in order, as dictated by the program counter, unless an instruction modifies the program counter and changes the program flow, for example for a function call.
Image source: www.shutterstock.com/image-vector/binary-code-digital-numbers-green-background-1724376772.
Image source: www.shutterstock.com/search/confused+person.
Because each mnemonic is associated with a single machine code instruction, it is also easy to convert machine code to assembly language. This is known as disassembly and is sometimes useful for debugging programs.
Please don’t get carried away! There are few tasks for which the cost of not starting in a HLL is going to be paid back by writing complete programs in assembly code.
As assembly codes are generally less readable than higher level languages, like the C language, it is good practice to be very liberal with comments in assembly code programs.
The purpose of these assembly directives is to assign a meaningful name to a label, constant, data value (variable), or memory location.