 ARM Assembly Language Programming - Chapter 1 - First Concepts
1. First Concepts
Like most interesting subjects, assembly language programming requires a little
background knowledge before you can start to appreciate it. In this chapter, we explore
these basics. If terms such as two's complement, hexadecimal, index register and byte are
familiar to you, the chances are you can skip to the next chapter, or skim through this one
for revision. Otherwise, most of the important concepts you will need to understand to
start programming in assembler are explained below.
One prerequisite, even for the assembly language beginner, is a familiarity with some
high-level language such as BASIC or Pascal. In explaining some of the important
concepts, we make comparisons to similar ideas in BASIC, C or Pascal. If you don't have
this fundamental requirement, you may as well stop reading now and have a bash at
BASIC first.
1.1 Machine code and up...
The first question we need to answer is, of course, 'What is assembly language?' As you
know, any programming language is a medium through which humans may give
instructions to a computer. Languages such as BASIC, Pascal and C, which we call
high-level languages, bear some relationship to English, and this enables humans to represent
ideas in a fairly natural way. For example, the idea of performing an operation a number
of times is expressed using the BASIC FOR construct:
FOR i=1 TO 10 : PRINT i : NEXT i
Although these high-level constructs enable us humans to write programs in a relatively
painless way, they in fact bear little relationship to the way in which the computer
performs the operations. All a computer can do is manipulate patterns of 'on' and 'off',
which are usually represented by the presence or absence of an electrical signal.
To explain this seemingly unbridgeable gap between electrical signals and our familiar
FOR...NEXT loops, we use several levels of representation. At the lowest level we have our
electrical signals. In a digital computer of the type we're interested in, a circuit may be at
one of two levels, say 0 volts ('off') or 5 volts ('on').
Now we can't tell very easily just by looking what voltage a circuit is at, so we choose to
write patterns of on/off voltages using some visual representation. The digits 0 and 1 are
used. These digits are used because, in addition to neatly representing the idea of an
absence or presence of a signal, 0 and 1 are the digits of the binary number system, which
is central to the understanding of how a computer works. The term binary digit is usually
abbreviated to bit. Here is a bit: 1. Here are eight bits in a row: 11011011
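As a small illustration of the link between a row of bits and the binary number system, the following sketch reads the eight bits above as a binary number. Python is used purely for illustration here; it is not one of the languages this book deals with, and the variable names are invented for the example.

```python
# The eight bits from the text, written as a string of 0s and 1s.
bits = "11011011"

# Work from left to right: each new digit doubles the running total
# and then adds the digit itself (0 or 1).
value = 0
for digit in bits:
    value = value * 2 + int(digit)

print(len(bits))  # how many bits the pattern contains
print(value)      # the same pattern read as an ordinary number
```

The same pattern of 'on' and 'off' can therefore be treated either as eight separate signals or as a single number, whichever is more convenient.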
Machine code
Suppose we have some way of storing groups of binary digits and feeding them into the
computer. On reading a particular pattern of bits, the computer will react in some way.
This is absolutely deterministic; that is, every time the computer sees that pattern its
response will be the same. Let's say we have a mythical computer which reads in groups
of bits eight at a time, and according to the pattern of 1s and 0s in the group, performs
some task. On reading this pattern, for example
10100111
the computer might produce a voltage on a wire, and on reading the pattern
10100110
it might switch off that voltage. The two patterns may then be regarded as instructions to
the computer, the first meaning 'voltage on', the second 'voltage off'. Every time the
instruction 10100111 is read, the voltage will come on, and whenever the pattern 10100110
is encountered, the computer turns the voltage off. Such patterns of bits are called the
machine code of a computer; they are the codes which the raw machinery reacts to.
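The behaviour of this mythical computer can be imitated in a few lines of Python. This is only a sketch: the two bit patterns come from the text, but the function name and the decision to ignore any other pattern are assumptions made for the example.

```python
# The two 8-bit patterns described in the text.
VOLTAGE_ON = 0b10100111
VOLTAGE_OFF = 0b10100110

def run(patterns):
    """Feed patterns to the mythical computer, one at a time.

    Returns the state of the wire (True = voltage present) after
    each pattern has been read.
    """
    wire_on = False
    history = []
    for pattern in patterns:
        if pattern == VOLTAGE_ON:
            wire_on = True
        elif pattern == VOLTAGE_OFF:
            wire_on = False
        # Assumption: any other pattern leaves the wire unchanged.
        history.append(wire_on)
    return history

print(run([0b10100111, 0b10100110, 0b10100111]))
```

Running the same list of patterns again always gives the same result, which is what is meant by the computer's response being deterministic.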
Assembly language and assemblers
There are 256 combinations of eight 1s and 0s, from 00000000 to 11111111, with 254 others
in between. Remembering what each of these means is asking too much of a human: we
are only good at remembering groups of at most six or seven items. To make the task of
remembering the instructions a little easier, we resort to the next step in the progression
towards the high-level instructions found in BASIC. Each machine code instruction is
given a name, or mnemonic. Mnemonics often consist of three letters, but this is by no
means obligatory. We could make up mnemonics for our two machine codes:
ON means 10100111
OFF means 10100110
So whenever we write ON in a program, we really mean 10100111, but ON is easier to
remember. A program written using these textual names for instructions is called an
assembly language program, and the set of mnemonics that is used to represent a
computer's machine code is called the assembly language of that computer. Assembly
language is the lowest level used by humans to program a computer; only an incurable
masochist would program using pure machine code.
It is usual for machine codes to come in groups which perform similar functions. For
example, whereas 10100111 might mean switch on the voltage at the signal called 'output
0', the very similar pattern 10101111 could mean switch on the signal called 'output 1'.
Both instructions are 'ON' ones, but they affect different signals. Now we could define two
mnemonics, say ON0 and ON1, but it is much more usual in assembly language to use the
simple mnemonic ON and follow this with extra information saying which signal we want
to switch on. For example, the assembly language instruction
ON 1
would be translated into 10101111, whereas:
ON 0
is 10100111 in machine code. The items of information which come after the mnemonic
(there might be more than one) are called the operands of the instruction.
How does an assembly program, which is made up of textual information, get converted
into the machine code for the computer? We write a program to do it, of course! Well, we
don't write it. Whoever supplies the computer writes it for us. The program is called an
assembler. The process of using an assembler to convert from mnemonics to machine code
is called assembling. We shall have more to say about one particular assembler - which
converts from ARM assembly language into ARM machine code - in Chapter Four.
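To make the idea of an assembler concrete, here is a toy one for the mythical ON instruction, written in Python. It is a sketch under stated assumptions, not a description of any real assembler: the two machine codes come from the text, but the rule for where the operand goes (bit 3, the only bit in which the two codes differ) and all the function names are invented for the example.

```python
# Machine code for 'ON 0', taken from the text.
ON_BASE = 0b10100111
# Assumption: the operand is placed in bit 3, since 10100111 and
# 10101111 differ only in that bit.
OPERAND_SHIFT = 3

def assemble_line(line):
    """Translate one line such as 'ON 1' into an 8-bit machine code."""
    parts = line.split()
    mnemonic, operands = parts[0], parts[1:]
    if mnemonic != "ON" or len(operands) != 1:
        raise ValueError("this sketch only knows 'ON <output>'")
    output = int(operands[0])
    if output not in (0, 1):
        raise ValueError("only outputs 0 and 1 are described in the text")
    return ON_BASE | (output << OPERAND_SHIFT)

def assemble(source):
    """Assemble a multi-line program into a list of 8-bit patterns."""
    return [format(assemble_line(line), "08b")
            for line in source.splitlines() if line.strip()]

print(assemble("ON 0\nON 1"))
```

Each line of text produces exactly one machine code, which is the defining property of an assembler.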
Compilers and interpreters
As the subject of this book is ARM assembly language programming, we could halt the
discussion of the various levels of instructing the computer here. However, for
completeness we will briefly discuss the missing link between assembly language and,
say, Pascal. The Pascal assignment
a := a+12
looks like a simple operation to us, and so it should. However, the computer knows
nothing of variables called a or decimal numbers such as 12. Before the computer can do
what we've asked, the assignment must be translated into a suitable sequence of
instructions. Such a sequence (for some mythical computer) might be:
LOAD a
ADD 12
STORE a
Here we see three mnemonics, LOAD, ADD and STORE. LOAD obtains the value from the place
we've called a, ADD adds 12 to this loaded value, and STORE saves it away again. Of course,
this assembly language sequence must be converted into machine code before it can be
obeyed. The three mnemonics above might convert into these instructions:
00010011
00111100
00100011
Once this machine code has been programmed into the computer, it may be obeyed, and
the initial assignment carried out.
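One way to picture the computer obeying these three codes is the Python sketch below. The text does not say how the mythical machine decodes its instructions, so the encoding used here is purely an assumption: the top four bits are taken as the operation and the bottom four bits as the operand, which happens to fit the three patterns (the bottom four bits of the second one are 1100, or 12). The single working value called the accumulator, the memory of sixteen places and the choice of place 3 for a are likewise invented for the example.

```python
# Assumed encoding (NOT given in the text): top four bits select the
# operation, bottom four bits hold the operand.
LOAD, STORE, ADD = 0b0001, 0b0010, 0b0011

def run(program, memory):
    """Obey each 8-bit instruction in turn, using one working value."""
    accumulator = 0
    for code in program:
        opcode, operand = code >> 4, code & 0b1111
        if opcode == LOAD:
            accumulator = memory[operand]    # fetch from a place in memory
        elif opcode == ADD:
            accumulator += operand           # add the number in the instruction
        elif opcode == STORE:
            memory[operand] = accumulator    # save back to a place in memory
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
    return memory

program = [0b00010011, 0b00111100, 0b00100011]
memory = [0] * 16
memory[3] = 30      # pretend that a lives in place 3 and holds 30
run(program, memory)
print(memory[3])    # a after the assignment a := a+12
```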
To get from Pascal to the machine code, we use another program. This is called a compiler.
It is similar to an assembler in that it converts from a human-readable program into
something a computer can understand. There is one important difference though: whereas
there is a one-to-one relationship between an assembly language instruction and the
machine code it represents, there is no such relationship between a high-level language
instruction such as
PRINT "HELLO"
and the machine code a compiler produces which has the same effect. Therein lies one of
the advantages of programming in assembler: you know at all times exactly what the
computer is up to and have very intimate control over it. Additionally, because a compiler
is only a program, the machine code it produces can rarely be as 'good' as that which a
human could write.
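The difference from an assembler can be seen in a toy 'compiler' for a single kind of Pascal statement. This Python sketch handles only assignments of the form x := x+N and produces the LOAD, ADD, STORE sequence for the mythical computer used earlier; the function name and the restriction to this one form are assumptions made for the example, and a real compiler is vastly more elaborate.

```python
import re

def compile_assignment(statement):
    """Translate 'x := x+N' into a LOAD/ADD/STORE sequence."""
    match = re.fullmatch(r"\s*(\w+)\s*:=\s*(\w+)\s*\+\s*(\d+)\s*", statement)
    if match is None or match.group(1) != match.group(2):
        raise ValueError("this sketch only handles 'x := x+N'")
    name, _, number = match.groups()
    # One high-level statement becomes several low-level instructions.
    return [f"LOAD {name}", f"ADD {number}", f"STORE {name}"]

print(compile_assignment("a := a+12"))
```

A single line of Pascal yields three assembly language instructions here, and a different compiler could quite legitimately choose a different sequence with the same effect.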
A compiler has to produce working machine code for the infinite number of programs that
can be written in the language it compiles. It is impossible to ensure that all possible
high-level instructions are translated in the optimum way; faster and smaller human-written
assembly language programs will always be possible. Against these advantages of using
assembler must be weighed the fact that high-level languages are, by definition, easier for
humans to write, read and debug (remove the errors).
The process of writing a program in a high-level language, running the compiler on it,
correcting the mistakes, re-compiling it and so on is often time-consuming, especially for
large programs which may take several minutes (or even hours) to compile. An alternative
approach is provided by another technique used to make the transition from high-level
language to machine code. This technique is known as interpreting. The most popular
interpreted language is BASIC.
An interpreted program is not converted from, say, BASIC text into machine code. Instead,
a program (the interpreter) examines the BASIC program and decides which operations to
perform to produce the desired effect. For example, to interpret the assignment
LET a=a+12
in BASIC, the interpreter would do something like the following: