Introduction to Computer Organization

ISA: The LC2k and ARM architectures

Two's Complement

A processor needs some way to represent negative numbers. A good representation should meet a few goals:

Keep the existing representation of positive numbers unchanged
Single representation of 0 (no separate negative zero and positive zero)
Roughly equal numbers of positive and negative values
Easy detection of sign and easy negation

Two's complement achieves this.

How to negate binary numbers

Flip all of the bits, then add one. For example, 0011 (3) becomes 1100 after flipping, and adding one gives 1101 (-3).
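The flip-and-add-one rule can be sketched in C. This is a minimal sketch, assuming an 8-bit width chosen only for illustration; `negate8` is a hypothetical helper name.

```c
#include <stdint.h>

/* Negate an 8-bit two's complement value: flip every bit, then add one.
   The 8-bit width is an assumption made for this example. */
uint8_t negate8(uint8_t x) {
    return (uint8_t)(~x + 1u);
}
```

For instance, `negate8(0x05)` gives `0xFB`, the 8-bit pattern for -5. The most negative value, `0x80`, negates to itself because +128 does not fit in 8 bits.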

Qualities of Two's Complement

All negative numbers have a most significant bit of 1; all non-negative numbers have a most significant bit of 0.
Range of a $b$-bit number:

$$[-2^{b-1}, 2^{b-1} -1]$$
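To make the range concrete, here is a small sketch that evaluates both endpoints for a given bit width. The names `twos_min` and `twos_max` are hypothetical, and the sketch assumes widths from 1 to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value in b-bit two's complement: -2^(b-1). Assumes 1 <= b <= 32. */
int64_t twos_min(unsigned b) {
    assert(b >= 1 && b <= 32);
    return -((int64_t)1 << (b - 1));
}

/* Largest value in b-bit two's complement: 2^(b-1) - 1. Assumes 1 <= b <= 32. */
int64_t twos_max(unsigned b) {
    assert(b >= 1 && b <= 32);
    return ((int64_t)1 << (b - 1)) - 1;
}
```

With `b = 4` the range is -8 to 7, and with `b = 8` it is -128 to 127. There is always one more negative value than positive value, because zero takes one of the patterns whose most significant bit is 0.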

Sign Extension

Often, a processor needs to move a value from a smaller register (say 4 bits) to a larger register (say 8 bits) without changing the value. To do this, you pad the number on the left with copies of its most significant bit.

With a positive number, you pad with zeroes.
- Ex. 0011 goes to 00000011.
With a negative number, you pad with ones.
- Ex. 1101 goes to 11111101.
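The padding rule can be sketched in C as follows. This is an illustrative sketch, assuming the target width is 8 bits; `sign_extend_to_8` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `from_bits` bits of `value` to 8 bits.
   Assumes 1 <= from_bits <= 8. */
uint8_t sign_extend_to_8(uint8_t value, unsigned from_bits) {
    assert(from_bits >= 1 && from_bits <= 8);
    uint8_t sign_bit = (uint8_t)(1u << (from_bits - 1));
    uint8_t low_mask = (uint8_t)((1u << from_bits) - 1u);

    value &= low_mask;                 /* keep only the original bits */
    if (value & sign_bit) {
        value |= (uint8_t)~low_mask;   /* negative: pad the upper bits with ones */
    }
    return value;                      /* positive: upper bits are already zero */
}
```

Using the examples above, `sign_extend_to_8(0x3, 4)` yields `0x03` (00000011) and `sign_extend_to_8(0xD, 4)` yields `0xFD` (11111101).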

Assembly and Machine Code

As a recap, an ISA defines a set of instructions that programmers can use to make the processor do what they want. Computers store instructions the same way they store data: as strings of ones and zeros.

Each instruction is encoded as a number and stored in memory.

Storage locations:
- Registers:
  - Small array of memory in the processor
  - Fast
  - Direct addressing only
  - Special register: PC register
    - Program counter
    - Register that is inside the CPU
    - Stores the address for where you can get the next instruction in memory.
    - Who initializes the program counter? On reset, the hardware sets it to a fixed starting address, where firmware such as the BIOS begins executing. That first address is how the processor finds its very first instruction.
- Memory (DRAM):
  - Large array of storage locations
  - Slow access
  - Many addressing modes
    - Direct, indirect, base + displacement
- Data is moved between registers and memory using load and store instructions
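The role of the program counter can be sketched as a fetch step. This is a minimal sketch, assuming a hypothetical word-addressed machine with 16 words of memory; `struct machine`, `MEM_WORDS`, and `fetch` are illustrative names, not part of any real ISA.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 16  /* tiny memory size, chosen only for illustration */

struct machine {
    uint32_t pc;              /* address of the next instruction */
    uint32_t mem[MEM_WORDS];  /* instructions and data share this memory */
};

/* Fetch the word the PC points at, then advance the PC to the next word. */
uint32_t fetch(struct machine *m) {
    assert(m->pc < MEM_WORDS);
    uint32_t instruction = m->mem[m->pc];
    m->pc += 1;  /* word-addressed: the next instruction is one word later */
    return instruction;
}
```

Calling `fetch` repeatedly walks through memory in order, which is exactly what happens until a branch instruction writes a different address into the PC.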

Addressing Modes

There are many addressing modes, because they make it easier for the programmer to navigate through gigabytes of memory.

Direct addressing

The address is specified directly in the instruction, much like an immediate operand.

To move the value in memory at 1500 to r1:

load r1, M[1500]

Branch instructions also need to specify which instruction to move to: jump M[3000]. This is basically a goto command.
Useful for addressing locations that don't change during execution.
- Global/static variables
- Branch target addresses
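As a rough model of direct addressing, memory can be treated as an array indexed by a constant. This is a sketch under that assumption; `load_direct` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models "load r1, M[addr]": the address is a constant that would be
   stored in the instruction itself. `memory` and `size` stand in for the
   machine's memory and are assumptions of this sketch. */
int32_t load_direct(const int32_t *memory, size_t size, size_t addr) {
    assert(addr < size);
    return memory[addr];  /* one memory access */
}
```

A call such as `load_direct(memory, 2048, 1500)` corresponds to `load r1, M[1500]`, with the return value playing the part of r1.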

Indirect addressing

Specify an immediate address, load the reference address stored there, and then load the value at that reference address.

load r1, M[ M[1900] ]

Very similar to pointers in C.
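The two memory accesses can be made explicit with the same array model of memory. This is an illustrative sketch; `load_indirect` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models "load r1, M[ M[addr] ]" on a hypothetical word-addressed memory. */
uint32_t load_indirect(const uint32_t *memory, size_t size, size_t addr) {
    assert(addr < size);
    uint32_t reference = memory[addr];  /* first access: fetch the reference address */
    assert(reference < size);
    return memory[reference];           /* second access: fetch the value itself */
}
```

If `memory[1900]` holds 2500 and `memory[2500]` holds 7, then `load_indirect(memory, 4096, 1900)` returns 7. In C terms, location 1900 acts as a pointer variable that lives in memory.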

Register indirect

Specify which register has the reference address.

load r1, M[r2]

Very similar to stepping through an array with a pointer in C. It is also faster than indirect addressing, because it needs only one memory access.
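In C, register indirect addressing roughly corresponds to dereferencing a pointer that the compiler keeps in a register. The sketch below assumes that mapping; `sum_array` is a hypothetical example function.

```c
#include <stddef.h>

/* Sum an array by stepping a pointer through it. On a real machine `p`
   would typically live in a register, so `*p` plays the role of
   "load r1, M[r2]". */
int sum_array(const int *p, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += *p;  /* address comes from the pointer, not the instruction */
        p++;          /* advance the pointer to the next element */
    }
    return total;
}
```

Because the address lives in a register, the same load instruction can touch a different element on every iteration.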

Base + displacement

Similar to register indirect, but with an offset: an immediate constant stored in the instruction.

load r1, M[r2 + 1000]

Good for accessing fields of class objects/structures. Say you want to address the second field of the following structure:

struct ints {
    int a;
    int b;
};

If r1 holds the address of the struct, then r1 + 4 is the address of the b field, since a takes up 4 bytes. To skip past the whole struct, use r1 + 8, since each int takes up 4 bytes in memory.
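The offsets can be checked in C with `offsetof`. This sketch uses `int32_t` instead of `int` so that each field is guaranteed to be 4 bytes, which is an assumption added for the example; `load_b` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as the struct above, with fixed 4-byte fields. */
struct ints {
    int32_t a;
    int32_t b;
};

/* Read field b from a base address plus a constant displacement,
   as "load r, M[base + 4]" would. */
int32_t load_b(const struct ints *base) {
    const unsigned char *addr =
        (const unsigned char *)base + offsetof(struct ints, b);
    int32_t value;
    memcpy(&value, addr, sizeof value);  /* copy 4 bytes from base + displacement */
    return value;
}
```

Here `offsetof(struct ints, b)` is the displacement for b and `sizeof(struct ints)` is the displacement that skips the whole struct.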

PC-relative

A variant on base + displacement. Addressing relative to your program counter (PC). For example, consider the following instruction:

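jump [8]

With 4-byte instructions, this moves the program counter ahead two instructions; a negative displacement such as -8 would move it back two. Jumping ahead is useful in the following if statement, where the branch skips over the two instructions in the if body: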

if (x) {
    // one instruction
    // another
} else {
    // the rest that you actually want
}
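The target of a PC-relative jump is just the PC plus a signed displacement. The sketch below assumes byte addressing, 4-byte instructions, and a displacement measured from the current PC value; real ISAs differ on whether the base is the current or the next instruction. `pc_relative_target` is a hypothetical name.

```c
#include <stdint.h>

/* Compute a PC-relative target address. Unsigned wraparound makes a
   negative displacement subtract from the PC. */
uint32_t pc_relative_target(uint32_t pc, int32_t displacement) {
    return pc + (uint32_t)displacement;
}
```

Under these assumptions, a displacement of 8 from address 100 lands on 108 (two instructions ahead), and a displacement of -8 lands on 92 (two instructions back).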

LC2k Architecture

LC2k is a simplified architecture.

32-bit processor, so instructions are 32 bits
8 registers, each 32 bits wide
Supports 65536 words of memory (addressable space)
8 instructions
- add (add)
- nand (not and)
- lw (load word)
- sw (store word)
- beq (branch equal)
- jalr (jump and link register)
- halt (end program)
- noop (no operation)

Instruction Encoding

Each instruction has fields for the opcode, source registers, and destination register, plus many unused bits.
Positional organization of bits.
Two templates:
- R-type instructions (add, nand)
- I-type instructions (lw, sw, beq)
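Positional encoding can be sketched with shifts and ORs. The field positions below are not given in these notes; they are assumed from the commonly published LC-2K layout (opcode in bits 24-22, regA in bits 21-19, regB in bits 18-16, destReg in bits 2-0 for R-type, and a 16-bit two's complement offset in bits 15-0 for I-type). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* R-type: opcode | regA | regB | (unused) | destReg. Layout is an assumption. */
uint32_t encode_r_type(uint32_t opcode, uint32_t reg_a, uint32_t reg_b,
                       uint32_t dest) {
    assert(opcode < 8 && reg_a < 8 && reg_b < 8 && dest < 8);
    return (opcode << 22) | (reg_a << 19) | (reg_b << 16) | dest;
}

/* I-type: opcode | regA | regB | 16-bit offset. Layout is an assumption. */
uint32_t encode_i_type(uint32_t opcode, uint32_t reg_a, uint32_t reg_b,
                       int32_t offset) {
    assert(opcode < 8 && reg_a < 8 && reg_b < 8);
    assert(offset >= -32768 && offset <= 32767);
    /* Keep only the low 16 bits so a negative offset stays in its field. */
    return (opcode << 22) | (reg_a << 19) | (reg_b << 16) |
           ((uint32_t)offset & 0xFFFFu);
}
```

Assuming lw has opcode 2, `encode_i_type(2, 0, 1, 7)` produces 8454151, which is the single number stored in memory for that instruction. Bits 31-25 are never set, which accounts for some of the unused bits.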

ARM Architecture

Three main types of instructions:

Arithmetic and logic
Memory access
Sequencing

A limitation of assembly language: an instruction has at most 2 source operands. This means you have to break a complex expression down into simpler ones.

f = (g + h) - (i + j)

Where g = r3, h = r4, i = r5, j = r6. One way to compute f (placed in r7), using r1 and r2 as temporaries:

add r1, r3, r4
add r2, r5, r6
sub r7, r1, r2
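The same decomposition can be written in C with one operation per statement, mirroring the three instructions. This is only an illustrative sketch; `compute_f`, `t1`, and `t2` are hypothetical names standing in for r7, r1, and r2.

```c
/* Each statement has at most two source operands, like the assembly. */
int compute_f(int g, int h, int i, int j) {
    int t1 = g + h;   /* add r1, r3, r4 */
    int t2 = i + j;   /* add r2, r5, r6 */
    return t1 - t2;   /* sub r7, r1, r2 */
}
```

For example, with g = 5, h = 3, i = 2, j = 1, the temporaries are 8 and 3, so the result is 5.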

In many instructions, such as add, the second source operand can be either a register or an immediate.

Some interesting commands:

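LSR: logical shift right (LSL is the left-shift counterpart). Vacated bits are filled with zeros.
ROR: rotate right. Bits shifted off the right end wrap around to the left end of the 32-bit register. ARM also encodes an immediate operand as an 8-bit value rotated right within a 32-bit word, which lets an instruction express some constants that do not fit in 8 bits.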
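The difference between a logical shift and a rotate can be sketched in C on 32-bit values. C has no rotate operator, so the rotate is built from two shifts; `lsr32` and `ror32` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right: bits shifted out are lost, zeros come in on the left.
   Assumes n < 32. */
uint32_t lsr32(uint32_t x, unsigned n) {
    assert(n < 32);
    return x >> n;
}

/* Rotate right: bits shifted out on the right re-enter on the left. */
uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31u;              /* rotating by 32 is the same as rotating by 0 */
    if (n == 0) {
        return x;          /* avoid an undefined shift by 32 below */
    }
    return (x >> n) | (x << (32u - n));
}
```

For example, `ror32(0xFF, 8)` gives `0xFF000000`: the 8-bit value now occupies the top byte, which is how a rotated 8-bit value can stand for a much larger constant. By contrast, `lsr32(0xFF, 8)` gives 0 because the bits are simply discarded.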