Supports base + displacement, and base + register
ldr r3, [r4, #1000]
or
ldr r3, [r4, r1]
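The two addressing modes differ only in where the second operand of the address sum comes from. A minimal sketch in C, where the function names are hypothetical and registers are modeled as plain 32-bit integers:

```c
#include <stdint.h>

/* Base + displacement: the address is a register plus a constant,
   as in ldr r3, [r4, #1000]. */
static uint32_t ea_base_disp(uint32_t base, int32_t disp) {
    return base + (uint32_t)disp;
}

/* Base + register: the address is the sum of two registers,
   as in ldr r3, [r4, r1]. */
static uint32_t ea_base_reg(uint32_t base, uint32_t index) {
    return base + index;
}
```

For example, with r4 holding 2000, the first form reads from address 3000.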
In LC2k, when you load something, it has to be a word (32 bits). However, in ARM, this isn't the case:
ldr loads a full word (32 bits).
ldrh loads half of a word, or a halfword (16 bits).
ldrb loads a byte (8 bits).
However, since the registers are 32 bits, how do you pad the numbers loaded from ldrh and ldrb?
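By default, ldrh and ldrb fill the upper bits of the register with zeros (zero extension). To sign-extend the loaded value instead, use the signed variants:
ldrsb: Load signed byte
ldrsh: Load signed halfword
Note: you don't care about sign extension for stores. Sign extension only matters when you move something from a smaller location to a larger one. This doesn't happen in store commands.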
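The difference between the two kinds of padding can be sketched in C. The function names below are hypothetical; each one models what ends up in a 32-bit register (or, for the store, in memory), not the instruction itself:

```c
#include <stdint.h>

/* ldrb-style load: the byte lands in bits 0-7, the upper 24 bits are zero. */
static uint32_t zero_extend_byte(uint8_t b) {
    return (uint32_t)b;
}

/* ldrsb-style load: bit 7 (the sign bit) is copied into the upper 24 bits. */
static uint32_t sign_extend_byte(uint8_t b) {
    uint32_t v = b;
    if (v & 0x80u) v |= 0xFFFFFF00u;
    return v;
}

/* ldrsh-style load: bit 15 is copied into the upper 16 bits. */
static uint32_t sign_extend_half(uint16_t h) {
    uint32_t v = h;
    if (v & 0x8000u) v |= 0xFFFF0000u;
    return v;
}

/* strb-style store: only the least significant 8 bits are kept,
   so there is nothing to extend. */
static uint8_t low_byte(uint32_t reg) {
    return (uint8_t)(reg & 0xFFu);
}
```

The byte 0xF0 becomes 0x000000F0 when zero-extended and 0xFFFFFFF0 when sign-extended; a byte with bit 7 clear, such as 0x7F, is the same either way.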
strb: Stores the least significant 8 bits of a register.
strh: Stores the least significant 16 bits of a register.
Endian-ness is the ordering of bytes.
It answers the question of how the bytes of a word are laid out at sequential addresses in memory. Should you order them from MSB to LSB (big-endian), or the other way around (little-endian)?
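To make the two orderings concrete, here is a small sketch in C that writes the same 32-bit word into a 4-byte buffer both ways. The function names are hypothetical, and the buffer stands in for four consecutive memory addresses:

```c
#include <stdint.h>

/* MSB first: the most significant byte goes at the lowest address. */
static void store_msb_first(uint8_t mem[4], uint32_t word) {
    mem[0] = (uint8_t)(word >> 24);
    mem[1] = (uint8_t)(word >> 16);
    mem[2] = (uint8_t)(word >> 8);
    mem[3] = (uint8_t)word;
}

/* LSB first: the least significant byte goes at the lowest address. */
static void store_lsb_first(uint8_t mem[4], uint32_t word) {
    mem[0] = (uint8_t)word;
    mem[1] = (uint8_t)(word >> 8);
    mem[2] = (uint8_t)(word >> 16);
    mem[3] = (uint8_t)(word >> 24);
}
```

Storing 0x0A0B0C0D MSB first gives the bytes 0A 0B 0C 0D at increasing addresses; LSB first gives 0D 0C 0B 0A.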
Questions to answer:
Assembly language must have the ability to store data structures and access them.
Take this C code for example:
a = b + names[i];
Assume that a is in r1, b is in r2, i is in r3, and the array names starts at the address 1000 and holds 32-bit integers.
For all the memory knows, the memory is just a sea of data. Only the programmer makes the distinction between which data is important or not.
The natural way to store an array in memory is to store one element after another. For example, if names[0] is at the base address 1000, names[1] should be at address 1004.
names[i] // is actually the value stored at the byte address (names + i*4)
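The address arithmetic can be written out explicitly. This is a sketch with hypothetical helper names, assuming 4-byte elements as in the example; the second function checks the same rule against what a C compiler does for a real int32_t array:

```c
#include <stddef.h>
#include <stdint.h>

/* Byte address of element i in an array of 32-bit integers at `base`. */
static uint32_t element_address(uint32_t base, uint32_t i) {
    return base + i * 4u;  /* each element occupies 4 bytes */
}

/* Distance in bytes from the start of a real array to its element i. */
static size_t byte_offset(const int32_t *arr, size_t i) {
    return (size_t)((const char *)&arr[i] - (const char *)arr);
}
```

With the base address 1000, element 1 is at 1004 and element 5 is at 1020. This is why the index in r3 has to be multiplied by 4 before it is added to the base address.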
Take this C code:
struct {
int a;
unsigned char b, c;
} y;
y.a = y.b + y.c;
Assume that a pointer to y is in r1, and the address of y is zero.
The offset of a is zero, since it is the beginning of the structure.
The offset of b is 4, since the amount of bytes taken up by a is 4.
The offset of c is 5, since the amount of bytes taken up by b is 1.
Here the fields happen to pack tightly one after another, but can you always do that?
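The field offsets can be checked with offsetof from <stddef.h>. The sketch below gives the struct a tag (Y, a hypothetical name) so it can be referred to, and assumes a platform where int is 4 bytes:

```c
#include <stddef.h>

struct Y {
    int a;
    unsigned char b, c;
};

/* Byte offsets from the start of the struct, as the compiler lays it out. */
static const size_t OFFSET_A = offsetof(struct Y, a);
static const size_t OFFSET_B = offsetof(struct Y, b);
static const size_t OFFSET_C = offsetof(struct Y, c);

/* y.a = y.b + y.c through a pointer, mirroring "a pointer to y is in r1". */
static void sum_fields(struct Y *y) {
    y->a = y->b + y->c;
}
```

With a 4-byte int the offsets come out as 0, 4 and 5, so the assembly version would load single bytes from r1 + 4 and r1 + 5 and store the word result at r1 + 0.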
For ARM, you cannot always arbitrarily pack variables into memory, because you need to worry about alignment. This is because memory is often divided into blocks, such as cache blocks and pages. The size of these blocks is \(2^x\) bytes. In a cache block, \(x\) can be 4 or 5. In pages, \(x\) is often 10 or 12. It is important that a variable does not span multiple blocks. If it did, it would significantly increase the complexity of the hardware.
char is byte aligned, so (address % 1 byte) == 0. You can store a char anywhere.
short is half-word aligned, so (address % 2 bytes) == 0. You can only store a short at even addresses. The LSB of the address must be 0.
int is word aligned, so (address % 4 bytes) == 0. You can only store an int at addresses whose two LSBs are 0.
Each field of a struct is placed in the order it is declared, using the Golden Rule for alignment.
The start of a struct is aligned based on its largest field.
The size of a struct is a multiple of its largest field.
This allows us to have an array of structs.
For example:
struct {
char c;
int b;
} A;
Therefore, since the largest field is an int, the address of c must start at an address that is divisible by 4. If this weren't the case, c could be at address 1, then b would be stored at address 5. This would clearly not work.
The fields themselves take up only 5 bytes, but b must be word aligned, so you pad with three useless bytes after c and get a total of 8 bytes. The size is a multiple of 4 so that, in an array, each struct doesn't interfere with the one next to it.
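A compiler applies the same rules, which can be observed with offsetof and sizeof. The sketch below tags the struct as A and adds a hypothetical helper for array elements; it assumes a typical platform where int is 4 bytes and word aligned:

```c
#include <stddef.h>

struct A {
    char c;
    int b;
};

/* Where b ends up, and how big the whole struct is, padding included. */
static const size_t A_OFFSET_B = offsetof(struct A, b);
static const size_t A_SIZE = sizeof(struct A);

/* Start address of element i in an array of struct A beginning at base.
   Because the size is a multiple of the alignment, every element stays
   aligned whenever base is aligned. */
static size_t element_start(size_t base, size_t i) {
    return base + i * sizeof(struct A);
}
```

Under that assumption b sits at offset 4 and the struct is 8 bytes, so consecutive array elements start 8 bytes apart.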
How much memory is required for the following data, assuming that the data starts at the address 200?
int a;
struct {
double b;
char c;
int d;
} e;
char *f;
short g[20];
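A question like this can be worked mechanically: round the next free address up to each variable's alignment, place the variable, and move on. The sketch below does that in C. All function names are hypothetical, and the sizes are assumptions for a 32-bit machine: 4-byte int and pointer, 8-byte double aligned to 8, 2-byte short.

```c
#include <stddef.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t addr, size_t align) {
    return (addr + align - 1) & ~(align - 1);
}

/* Place an object of the given size and alignment at or after *next.
   Returns its start address and advances *next past its last byte. */
static size_t place(size_t *next, size_t size, size_t align) {
    size_t start = align_up(*next, align);
    *next = start + size;
    return start;
}

/* Size of struct { double; char; int; } under the assumed sizes: fields
   are placed in order, then the total is rounded up to the alignment of
   the largest field (8, for the double). */
static size_t struct_e_size(void) {
    size_t offset = 0;
    place(&offset, 8, 8);  /* double b */
    place(&offset, 1, 1);  /* char c   */
    place(&offset, 4, 4);  /* int d    */
    return align_up(offset, 8);
}

/* Total bytes from base to the end of g for the declarations above. */
static size_t total_bytes(size_t base) {
    size_t next = base;
    place(&next, 4, 4);                /* int a       */
    place(&next, struct_e_size(), 8);  /* struct e    */
    place(&next, 4, 4);                /* char *f     */
    place(&next, 20 * 2, 2);           /* short g[20] */
    return next - base;
}
```

Calling total_bytes(200) walks the variables in declaration order, and the padding falls out of align_up rather than being counted by hand.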
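a: 4 bytes (200-203)
e: 16 bytes (208-223), after 4 bytes of padding (204-207) so that the double is 8-byte aligned
b (208-215)
c (216)
d (220-223), after 3 bytes of padding (217-219)
f: 4 bytes (224-227)
g: 40 bytes (228-267)
In total, 68 bytes are required (200-267).
Sequencing changes the flow of instructions that are executed. This is done by modifying the program counter (r15).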
if (condition_test) goto target_address
condition_test examines flags from the processor status word (PSR).
target_address is a 24-bit word displacement relative to the current PC + 8.
cmp 41, 42
beq
You want something like this:
if (cond)
PC = PC + some offset
else
PC = PC + 4
Determine:
$$\text{Target} = PC + 8 + 4*\text{24_bit_signed_displacement}$$
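The formula translates directly into C. This is a sketch with hypothetical function names; the first helper turns the raw 24-bit field into a signed displacement, and the second applies the formula using 32-bit wraparound arithmetic:

```c
#include <stdint.h>

/* Interpret the low 24 bits of `field` as a signed two's complement number. */
static int32_t sign_extend_24(uint32_t field) {
    field &= 0x00FFFFFFu;
    if (field & 0x00800000u)
        return (int32_t)field - 0x01000000;  /* subtract 2^24 for negatives */
    return (int32_t)field;
}

/* Target = PC + 8 + 4 * displacement, where pc is the address of the branch. */
static uint32_t branch_target(uint32_t pc, int32_t displacement) {
    return pc + 8u + 4u * (uint32_t)displacement;
}
```

Because of the built-in + 8, a displacement of 0 lands two instructions past the branch, and a displacement of -2 is needed to land on the branch itself.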
Therefore:
beq 1 branches 3 instructions ahead if flag Z == 1.
beq -3 branches 1 instruction back if flag Z == 1.
beq -2 branches to the same place, an infinite loop if Z == 1.
bne offset: branch not equal
blt offset: branch less than
bge offset: branch greater than or equal
b offset: an unconditional jump
mov r15, r3: jump to address in r3. Useful for function pointers and switch statements.
bl offset