Introduction to Computer Organization

ISA: From C to Assembly

ARM memory instructions

ARM memory instructions support two addressing modes: base + displacement and base + register.

  • Offsets can be either positive or negative
  • Format: two registers (the data register and the base register) plus an immediate or register offset
ldr r3, [r4, #1000]

or

ldr r3, [r4, r1]

Load instruction sizes

In LC2k, when you load something, it has to be a word (32 bits). However, in ARM, this isn't the case:

  • ldr loads a full word (32 bits).
  • ldrh loads half of a word, or a halfword (16 bits).
  • ldrb loads a byte (8 bits).

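However, since the registers are 32 bits wide, how do you fill the upper bits when ldrh or ldrb loads a smaller value?

  • One option: sign extension. Fill the upper bits with copies of the most significant bit of the loaded value. This preserves the value of a signed number.
  • Another option: zero extension. Fill the upper bits with zeros. This only gives the right value if the loaded data is unsigned. Plain ldrh and ldrb zero-extend.

Signed Values

  • ldrsb: Load signed byte (sign-extended to 32 bits)
  • ldrsh: Load signed halfword (sign-extended to 32 bits)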
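
To make the two padding options concrete, here is a minimal C sketch of what a byte load does to the upper 24 bits of a register. The function names are hypothetical and only model the behavior of ldrb and ldrsb; they are not ARM APIs.

```c
#include <stdint.h>

/* Models ldrb: the upper 24 bits of the result are filled with zeros. */
uint32_t load_byte_zero_extended(const uint8_t *addr) {
    return (uint32_t)*addr;
}

/* Models ldrsb: the upper 24 bits are copies of bit 7 of the loaded byte. */
uint32_t load_byte_sign_extended(const uint8_t *addr) {
    uint32_t value = *addr;
    if (value & 0x80u) {          /* most significant bit of the byte is set */
        value |= 0xFFFFFF00u;     /* fill the upper bits with ones */
    }
    return value;
}
```

For the byte 0xF0, the zero-extended result is 0x000000F0 (240) while the sign-extended result is 0xFFFFFFF0 (-16 as a signed word). For a byte whose top bit is clear, such as 0x7F, both loads give the same result.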

Note: you don't care about sign extension for stores. Sign extension only matters when you move something from a smaller location to a larger one, and a store never does that.

  • strb: Stores the least significant 8 bits of a register.
  • strh: Stores the least significant 16 bits of a register.

Endian-ness

Endian-ness is the order in which the bytes of a multi-byte value are stored in memory.

It answers the question of how to store the sequential bytes of a word: should you order them from the most significant byte (MSB) to the least significant byte (LSB), or the other way around?

  • Little-endian: Least significant byte first. Increasing numeric significance with increasing memory addresses.
    • x86 follows this convention, and ARM does in its usual configuration
  • Big-endian: Most significant byte first. Decreasing numeric significance with increasing memory address.
    • Internet protocols follow this convention (network byte order)
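
As an illustration, the following C sketch writes the same 32-bit word into a byte array under each convention. The function names are made up for this example, and the byte order is spelled out explicitly so that the result does not depend on the machine running the code.

```c
#include <stdint.h>

/* Little-endian: least significant byte goes to the lowest address. */
void store_word_little_endian(uint8_t *mem, uint32_t word) {
    mem[0] = (uint8_t)(word & 0xFFu);
    mem[1] = (uint8_t)((word >> 8) & 0xFFu);
    mem[2] = (uint8_t)((word >> 16) & 0xFFu);
    mem[3] = (uint8_t)((word >> 24) & 0xFFu);
}

/* Big-endian: most significant byte goes to the lowest address. */
void store_word_big_endian(uint8_t *mem, uint32_t word) {
    mem[0] = (uint8_t)((word >> 24) & 0xFFu);
    mem[1] = (uint8_t)((word >> 16) & 0xFFu);
    mem[2] = (uint8_t)((word >> 8) & 0xFFu);
    mem[3] = (uint8_t)(word & 0xFFu);
}
```

Storing 0x11223344 gives the bytes 44 33 22 11 at increasing addresses in little-endian order and 11 22 33 44 in big-endian order.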

Converting C to Assembly

Questions to answer:

  • How should we represent data structures in memory?
  • How should we represent branches?

Assembly language must have the ability to store data structures and access them.

Arrays

Take this C code for example:

a = b + names[i];

Assume that a is in r1, b is in r2, i is in r3, and the array names starts at the address 1000 and holds 32 bit integers.

As far as the hardware is concerned, memory is just a sea of data. Only the programmer decides which bytes belong to an array and which do not.

The natural way to store an array in memory is to store one element after another. For example, if names[0] is at the base address 1000, names[1] should be at address 1004, because each element takes up 4 bytes.

names[i] // is actually the value stored at the address (names + i*4)
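
The address arithmetic can be sketched in C as follows. Both function names are hypothetical, and the second one does the byte-level pointer math by hand only to mirror what the assembly has to do; ordinary C would simply write b + names[i].

```c
#include <stddef.h>
#include <stdint.h>

/* Address of element `index` in an array that starts at `base`. */
uint32_t element_address(uint32_t base, uint32_t index, uint32_t element_size) {
    return base + index * element_size;
}

/* Computes b + names[i] using explicit byte offsets instead of indexing. */
int32_t add_element(int32_t b, const int32_t *names, size_t i) {
    const unsigned char *base = (const unsigned char *)names;
    /* Step i * 4 bytes past the base, then read a 32-bit word from there. */
    const int32_t *element = (const int32_t *)(base + i * sizeof(int32_t));
    return b + *element;
}
```

With the base address 1000 and 4-byte elements, element_address gives 1000 for index 0, 1004 for index 1, and 1020 for index 5.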

Structures

Take this C code:

struct {
    int a;
    unsigned char b, c;
} y;

y.a = y.b + y.c;

Assume that a pointer to y is in r1, and the address of y is zero.

  • The address of a is zero, since it is the beginning of the structure.
  • The address of b is 4, since a takes up 4 bytes.
  • The address of c is 5, since b takes up 1 byte.

Chars take up one byte each, so b and c sit right next to each other. The fields pack together with no gaps here, but that is not always possible.
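
The field offsets can be checked with offsetof. This is a sketch that assumes a 4-byte int, which is typical but not guaranteed by C; the struct tag y_layout and the function add_fields are names invented for the example.

```c
#include <stddef.h>

/* Same fields as y, with a tag so the layout can be inspected. */
struct y_layout {
    int a;
    unsigned char b, c;
};

/* y.a = y.b + y.c: two byte loads at offsets 4 and 5, an add,
   and a word store at offset 0. The unsigned chars are zero-extended. */
void add_fields(struct y_layout *y) {
    y->a = y->b + y->c;
}
```

Under the stated assumption, offsetof(struct y_layout, a) is 0, offsetof(struct y_layout, b) is 4, offsetof(struct y_layout, c) is 5, and sizeof(struct y_layout) is 8.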

Memory layout of variables

On ARM, you cannot always pack variables into memory arbitrarily, because you need to worry about alignment. Memory is often divided into fixed-size blocks, such as cache blocks and pages, and the block size is \(2^x\) bytes. In a cache block, \(x\) can be 4 or 5. In pages, \(x\) is often 10 or 12. It is important that a variable does not span multiple blocks, because handling that case would significantly increase the complexity of the hardware.

  • Golden rule – the address of a variable is aligned based on the size of the variable.
    • The char is byte aligned, so the (address % 1 byte) == 0. You can store a char anywhere.
    • The short is half-word aligned, so the (address % 2 bytes) == 0. You can only store a short in even addresses. The LSB of the address must be 0.
    • The int is word aligned, so the (address % 4 bytes) == 0. You can only store an int at addresses whose two LSBs are 0.
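
The Golden Rule translates directly into a pair of small helpers. This is a sketch with hypothetical names; align_up assumes the size is a power of two, which holds for char, short, int, and double.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if `address` is a legal location for a variable of `size` bytes. */
int is_aligned(uint32_t address, uint32_t size) {
    assert(size != 0);
    return address % size == 0;
}

/* Rounds `address` up to the next multiple of `size` (a power of two). */
uint32_t align_up(uint32_t address, uint32_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    return (address + size - 1) & ~(size - 1);
}
```

For example, address 1001 is fine for a char but not for a short, and align_up(217, 4) gives 220, the next address where an int may be stored.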

Structure and Class Alignment

Each field is laid out in the order it is declared, using the Golden Rule for alignment.

  • Starting address of the overall struct is aligned based on the largest field
  • Size of overall struct is a multiple of the largest field

This allows us to have an array of structs.

For example:

struct {
    char c;
    int b;
} A;

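Since the largest field is an int, A must start at an address that is divisible by 4. If this weren't the case, c could be at address 1, and b, which sits 4 bytes past the start of the struct, would be stored at address 5, which is not word aligned.

Within the struct, c takes offset 0, three bytes of padding follow so that b is word aligned at offset 4, and the total size is 8 bytes, which is already a multiple of 4. Padding can also be needed at the end: the earlier struct y (an int followed by two unsigned chars) holds 6 bytes of fields, so two unused bytes are added to bring its size to 8. This is so that each struct in an array doesn't interfere with the one next to it.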
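
The array-of-structs argument can be checked in C. This sketch assumes a 4-byte int; the tag A_layout and the helper name are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as A, with a tag so it can be used as an array element type. */
struct A_layout {
    char c;
    int b;
};

/* Returns 1 if the int field of every array element is word aligned. */
int all_b_fields_aligned(const struct A_layout *arr, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if ((uintptr_t)&arr[i].b % sizeof(int) != 0) {
            return 0;
        }
    }
    return 1;
}
```

Because the struct size is a multiple of 4 and the array itself starts on a 4-byte boundary, every element's b field lands on a word boundary no matter how long the array is.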

Exercise

How much memory is required for the following data, assuming that the data starts at the address 200?

int a;

struct {
    double b;
    char c;
    int d;
} e;

char *f;
short g[20];

  • Memory for a: 4 bytes (200-203)
  • Padding: 4 bytes (204-207), since e contains a double and must start at a multiple of 8
  • Memory for e: 16 bytes (208-223)
    • 8 for b (208-215)
    • 1 for c (216)
    • 3 padding bytes (217-219)
    • 4 for d (220-223)
  • Memory for f: 4 bytes (224-227)
    • Pointers are always of size 4 bytes in a 32-bit architecture.
  • Memory for g: 40 bytes (228-267)
    • 20 shorts of 2 bytes each
  • Total: 68 bytes (200-267)
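
The exercise can be worked mechanically by placing each top-level item at the next address that satisfies its alignment. The sketch below uses hypothetical names and takes sizes and alignments as explicit inputs, assuming 4-byte pointers, an 8-byte aligned double, and 2-byte shorts, so that it does not depend on the machine running it.

```c
#include <stdint.h>

/* Size and required alignment of one top-level variable, in bytes. */
struct item {
    uint32_t size;
    uint32_t alignment;
};

/* Places items one after another starting at `start`, padding as needed.
   Writes each item's start address into starts[] and returns the first
   address past the last item. */
uint32_t lay_out(uint32_t start, const struct item *items,
                 uint32_t count, uint32_t *starts) {
    uint32_t address = start;
    for (uint32_t i = 0; i < count; i++) {
        uint32_t a = items[i].alignment;
        address = (address + a - 1) / a * a;  /* round up to a multiple of a */
        starts[i] = address;
        address += items[i].size;
    }
    return address;
}
```

Feeding in a (4 bytes, aligned to 4), e (16 bytes, aligned to 8), f (4 bytes, aligned to 4), and g (40 bytes, aligned to 2) with a start address of 200 places them at 200, 208, 224, and 228, and returns 268, for a total of 68 bytes.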

ARM Sequencing Instructions

Sequencing instructions change the order in which instructions are executed. They do this by modifying the program counter (r15).

if (condition_test) goto target_address

  • condition_test examines flags from the program status register (PSR)
  • target_address is encoded as a 24-bit signed word displacement relative to the current PC + 8.

cmp r1, r2
beq target_address

Condition Codes

  • Determine whether a conditional branch is taken
  • Four primary flags are evaluated:
    • N: set if the result is negative
    • Z: set if the result is zero
    • C: set if the operation produced a carry out of the most significant bit
    • V: set if the operation caused a signed overflow
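
As a rough model of how the four flags are derived, the following C sketch computes them for a 32-bit addition. The struct and function names are hypothetical, and the sketch only covers addition; a compare sets the flags from a subtraction instead.

```c
#include <stdint.h>

struct flags {
    int n;  /* result is negative (bit 31 set) */
    int z;  /* result is zero */
    int c;  /* carry out of bit 31 */
    int v;  /* signed overflow */
};

/* Flags produced by the 32-bit addition a + b. */
struct flags add_flags(uint32_t a, uint32_t b) {
    uint32_t result = a + b;  /* wraps modulo 2^32 */
    struct flags f;
    f.n = (int)((result >> 31) & 1u);
    f.z = (result == 0);
    f.c = (result < a);       /* the sum wrapped, so a carry left bit 31 */
    /* Overflow: operands share a sign, but the result's sign differs. */
    f.v = (int)(((~(a ^ b) & (a ^ result)) >> 31) & 1u);
    return f;
}
```

Adding 1 to 0x7FFFFFFF sets N and V (the largest positive value overflows into a negative one) but not C, while adding 1 to 0xFFFFFFFF sets Z and C but not V.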

Setting the Branch Displacement Field

You want something like this:

if (cond)
    PC = PC + some offset
else
    PC = PC + 4

Determine:

$$\text{Target} = PC + 8 + 4*\text{24_bit_signed_displacement}$$

Therefore:

  • beq 1 branches 3 instructions ahead if flag Z == 1.
  • beq -3 branches 1 instruction back if flag Z == 1.
  • beq -2 branches to the same place – an infinite loop if Z == 1.
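
The target formula and the three cases above can be reproduced with a short C sketch. The function names are hypothetical; the displacement is passed as the raw 24-bit field from the instruction, so it is sign-extended before use.

```c
#include <stdint.h>

/* Interprets the low 24 bits of `field` as a signed two's complement value. */
int32_t sign_extend_24(uint32_t field) {
    field &= 0x00FFFFFFu;
    if (field & 0x00800000u) {                /* bit 23 is the sign bit */
        return (int32_t)field - 0x01000000;   /* subtract 2^24 */
    }
    return (int32_t)field;
}

/* Target = PC + 8 + 4 * displacement, with PC the address of the branch. */
uint32_t branch_target(uint32_t pc, uint32_t displacement_field) {
    int32_t displacement = sign_extend_24(displacement_field);
    return pc + 8u + (uint32_t)(displacement * 4);
}
```

For a branch at address 100, a displacement of 1 gives 112 (three instructions ahead), a displacement of -3 (field 0xFFFFFD) gives 96 (one instruction back), and a displacement of -2 (field 0xFFFFFE) gives 100, the branch itself.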

Other Branching Instructions

  • bne offset: branch if not equal
  • blt offset: branch if less than
  • bge offset: branch if greater than or equal
  • b offset: an unconditional jump
  • mov r15, r3: jump to address in r3. Useful for function pointers and switch statements.
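  • bl offset: branch and link. Saves the return address in the link register (r14) before jumping, which makes it useful for function calls.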