Data Structures and Algorithms

Recursion and the Master Theorem

As an introduction to recursion, consider the factorial function:

int factorial(int n) {
    if (n == 0)
        return 1;
    return n * factorial(n - 1);
}

Its running time satisfies the recurrence:

$\begin{equation} T(n)=\begin{cases} c_0, & \text{if $n = 0$}.\\ T(n-1) + c_1, & \text{if $n > 0$}. \end{cases} \end{equation}$

In this case,

$T(n) = T(n - 1) + c_1$

$T(n) = (T(n - 2) + c_1) + c_1$

$T(n) = ((T(n - 3) + c_1) + c_1) + c_1$

$\dots$

$T(n) = T(n - k) + kc_1$

Setting $k = n$ gives $T(n) = T(0) + nc_1 = nc_1 + c_0$.

This growth function is on the order $O(n)$.

Tail Recursion

When a function gets called, it gets a stack frame, which stores its local variables. Each recursive call generates another stack frame!

A function is tail recursive if there is no pending computation at the end of the recursive step.

Example of a normal recursive function:

int factorial(int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial(n - 1);
}

Contrast this with the tail recursive implementation:


int factorial(int n, int result = 1) {
    if (n == 0)
        return result;
    else
        return factorial(n - 1, result * n);
}
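Because nothing remains to be done after the tail call, a compiler that performs tail-call elimination can reuse the current stack frame, which makes the function behave like a loop. C++ does not guarantee this optimization, so the following is only a hand-written sketch of the transformation for $n \geq 0$; the name factorial_loop is hypothetical.

```cpp
// Hand-written loop equivalent of the tail-recursive factorial.
// Each iteration plays the role of one tail call:
// (n, result) becomes (n - 1, result * n).
int factorial_loop(int n) {
    int result = 1;
    while (n > 0) {
        result = result * n;
        n = n - 1;
    }
    return result;
}
```

The loop uses a single stack frame no matter how large n is.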

Counting Steps (Declarations and Calls)

int myFunc(                // 1 step
            const char *p, // 1 step
            int aN,        // 1 step
            int ar[]) {    // 1 step

    return   // 1 step
        zzz; // 1 step
}

const char *course = "EECS281";     // 7 steps (copy chars at run time)
int HWKs[4] = {100, 110, 120, 140}; // 4 steps (copy at run time)
int retCode =                       // 1 step
    myFunc(                         // 2 steps (call and return)
        course, 4, HWKs);           // 4 steps (each var and return var)

The Program Stack

Function Call Internal Operations

When a function call is made, all local variables are saved in a special storage called the stack. Then, argument values are passed onto the stack.

When a function call is received, the function's arguments are popped off the stack.

When a function issues a return, the return value is pushed onto the stack.

When return is received, the return value is popped off the stack, and saved local variables are restored.

Stack Properties

The stack supports nested function calls, and each has its own set of local variables and arguments.

There is only one program stack (per thread). This is different from the program heap, where dynamic memory is allocated.

Program stack size is limited in practice, so the depth of nested function calls is limited by the size of that stack. This means that "plain" recursion over every element of a large input is a bad idea, because it needs one stack frame per element. Use tail recursion (where the compiler eliminates tail calls) or an iterative algorithm instead. In general, problems solvable with $O(1)$ additional memory do not favor "plain" recursive algorithms.
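To make the memory difference concrete, here is a small sketch that sums a vector both ways. The function names sum_plain and sum_iterative are hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// "Plain" recursion: one stack frame per element, so O(n) extra memory.
// The addition happens after the recursive call returns, so this is not
// a tail call.
long long sum_plain(const std::vector<int>& v, std::size_t i = 0) {
    if (i == v.size())
        return 0;
    return v[i] + sum_plain(v, i + 1);
}

// Iterative version: a single stack frame and O(1) extra memory.
long long sum_iterative(const std::vector<int>& v) {
    long long total = 0;
    for (int x : v)
        total += x;
    return total;
}
```

Both return the same value, but only the recursive version can run out of stack space on a very large input.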

Exercise: Step Counting

int factorial(int n) {           // 2 steps
    if (n == 0)                  // 1 step
        return 1;                // 2 steps
    return n * factorial(n - 1); // 6 steps
}

The last line is 6 steps because there is subtraction, function call, argument passing, internal return, external return, and multiplication.

Exercise: Tail Recursive Power Function

This implementation works, but it could be better. Runtime: $O(n)$, where $n$ is the exponent y.

int power_recursive(int x, unsigned int y) {
    if (y == 0)
        return 1;
    return x * power_recursive(x, y - 1);
}

This is better: the recursive call is a tail call, so a compiler that performs tail-call optimization can reuse the stack frame instead of creating a new one per call. Runtime: $O(n)$

int power_tail(int x, unsigned int y, int result = 1) {
    if (y == 0)
        return result;
    return power_tail(x, y - 1, result * x);
}

This is the best implementation here, because it has $O(\log n)$ time.

int power_iterative(int x, unsigned int y) {
    int result = 1;

    while (y > 0) {
        if (y % 2)
            result *= x;
        x *= x;
        y /= 2;
    }

    return result;
}

A recursive version of this logarithmic power function helps us write the runtime as a recurrence:

int power(int x, unsigned int y, int result = 1) {
    if (y == 0)
        return result;
    else if (y % 2)
        return power(x * x, y / 2, result * x);
    else
        return power(x * x, y / 2, result);
}

The runtime is:

$\begin{equation} T(n)=\begin{cases} c_0, & \text{if $n = 0$}.\\ T(n/2) + c_1, & \text{if $n > 0$}. \end{cases} \end{equation}$

So:

$T(n) = T(n/2) + c_1$

$T(n) = (T(n/4) + c_1) + c_1$

$\dots$

$T(n) = T\left(\frac{n}{2^k}\right) + kc_1$

Taking $n = 2^k$, so that $k = \log_2(n)$, the recursion reaches $T(1) = T(0) + c_1 = c_0 + c_1$.

Therefore,

$T(n) = c_0 + c_1 + c_1\log_2(n) = O(\log(n))$
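The logarithmic growth can be checked empirically by counting calls. The sketch below repeats the recursion of power with an added counter; the name power_counted and the calls parameter are assumptions introduced for this illustration.

```cpp
// Same recursion as power(), with a hypothetical counter added so the
// number of calls can be observed.
int power_counted(int x, unsigned int y, int& calls, int result = 1) {
    ++calls;
    if (y == 0)
        return result;
    else if (y % 2)
        return power_counted(x * x, y / 2, calls, result * x);
    else
        return power_counted(x * x, y / 2, calls, result);
}
```

For an exponent of the form $y = 2^k$ the counter ends at $k + 2$: one call for each halving from $2^k$ down to 1, plus the base case. With $y = 1024$ that is 12 calls, compared with 1025 for the linear versions.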

Common Recurrence Equations

Binary search: $T(n) = T(n/2) + c$
Sequential search: $T(n) = T(n - 1) + c$
Tree traversal: $T(n) = 2T(n/2) + c$
Insertion sort: $T(n) = T(n-1) + c_1*n + c_2$
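As an illustration of the first recurrence, here is a minimal recursive binary search. It is a sketch rather than a reference implementation, and the name binary_search_rec is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Recursive binary search over the sorted half-open range v[lo, hi).
// Each call does constant work and recurses on half of the range,
// which gives T(n) = T(n/2) + c.
bool binary_search_rec(const std::vector<int>& v, int target,
                       std::size_t lo, std::size_t hi) {
    if (lo >= hi)
        return false;                      // empty range
    std::size_t mid = lo + (hi - lo) / 2;
    if (v[mid] == target)
        return true;
    if (v[mid] < target)
        return binary_search_rec(v, target, mid + 1, hi);
    return binary_search_rec(v, target, lo, mid);
}
```

A search of the whole vector is started with lo = 0 and hi = v.size().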

Solving Recurrences with the Master Theorem

One way to solve a recurrence is the telescoping method, which can be difficult at times. This is where the Master Theorem helps.

Let $T(n)$ be a monotonically increasing function that satisfies:

$T(n) = aT\left(\frac{n}{b}\right) + f(n)$

$T(1) = 1$

Where $a \geq 1, b \geq 2$. If $f(n) \in \Theta(n^c)$, then:

$\begin{equation} T(n)=\begin{cases} \Theta(n^{\log_b a}), & \text{if $a > b^c$}.\\ \Theta(n^c \log_2 n), & \text{if $a = b^c$}.\\ \Theta(n^c), & \text{if $a < b^c$}. \end{cases} \end{equation}$

Note that the theorem does not apply in the following circumstances:

$T(n)$ is not monotonic, such as $T(n) = \sin(n)$.
$f(n)$ is not polynomial, such as $f(n) = 2^n$.
$b$ is not a constant, such as when it is a function of $n$.
$n$ is not divided at each step, such as $T(n) = T(n - 1) + c$.

There is another case which allows polylogarithmic functions for $f(n)$.
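The three polynomial cases are mechanical enough to express in code. The sketch below is only an illustration: the function name master_theorem and its string results are assumptions, and it compares $\log_b a$ with $c$, which is equivalent to comparing $a$ with $b^c$ when $b > 1$.

```cpp
#include <cmath>
#include <string>

// Classifies T(n) = a*T(n/b) + Theta(n^c) into one of the three cases.
// Assumes a >= 1 and b > 1. Returns a description of the bound.
std::string master_theorem(double a, double b, double c) {
    const double critical = std::log(a) / std::log(b); // log_b a
    const double eps = 1e-9;                            // tolerance for equality
    if (std::fabs(critical - c) < eps)
        return "Theta(n^c log n)";    // a == b^c
    if (critical > c)
        return "Theta(n^(log_b a))";  // a > b^c
    return "Theta(n^c)";              // a < b^c
}
```

For example, binary search ($a = 1$, $b = 2$, $c = 0$) falls in the middle case and gives $\Theta(\log n)$, while tree traversal ($a = 2$, $b = 2$, $c = 0$) falls in the first case and gives $\Theta(n)$.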

Job Interview Question

Write an efficient algorithm that searches for a value in an $n \times m$ array that is sorted along rows and columns. That is,

table[i][j] <= table[i][j+1]
table[i][j] <= table[i+1][j]

Obvious way: linear or binary search in every row. Better way: start in the middle.
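For reference, the obvious approach with a binary search in every row could look like the sketch below. The function name search_by_rows and the use of a vector of vectors are assumptions made for this example. It runs in $O(n \log m)$ time and uses only the row ordering, not the column ordering.

```cpp
#include <algorithm>
#include <vector>

// Baseline: binary search each of the n rows of length m.
bool search_by_rows(const std::vector<std::vector<int>>& table, int target) {
    for (const auto& row : table) {
        if (std::binary_search(row.begin(), row.end(), target))
            return true;
    }
    return false;
}
```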

Solution 1: Quad Partition

Split the region into four quadrants around the middle element; one quadrant can always be eliminated. If the middle element is too small, the target can't be in the upper-left quadrant. If it is too big, the target can't be in the bottom-right quadrant.

$T(n) = 3T(n/2) + c$

By the master theorem,

$T(n) = \Theta(n^{\log_2 3}) \approx \Theta(n^{1.58})$
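A possible implementation of the quad partition is sketched below. The name quad_search, the Table alias, and the inclusive index bounds are assumptions for this illustration, not the course's reference code.

```cpp
#include <vector>

using Table = std::vector<std::vector<int>>;

// Searches rows [r0, r1] and columns [c0, c1] (inclusive bounds) of a
// table sorted along rows and columns. Three recursive calls per level.
bool quad_search(const Table& t, int target, int r0, int r1, int c0, int c1) {
    if (r0 > r1 || c0 > c1)
        return false;                                           // empty region
    int mr = r0 + (r1 - r0) / 2;
    int mc = c0 + (c1 - c0) / 2;
    int v = t[mr][mc];
    if (v == target)
        return true;
    if (v < target) {
        // Too small: everything up and to the left of (mr, mc) is <= v.
        return quad_search(t, target, r0, mr, mc + 1, c1)       // upper right
            || quad_search(t, target, mr + 1, r1, c0, mc)       // lower left
            || quad_search(t, target, mr + 1, r1, mc + 1, c1);  // lower right
    }
    // Too big: everything down and to the right of (mr, mc) is >= v.
    return quad_search(t, target, r0, mr - 1, c0, mc - 1)       // upper left
        || quad_search(t, target, r0, mr - 1, mc, c1)           // upper right
        || quad_search(t, target, mr, r1, c0, mc - 1);          // lower left
}
```

For a table with n rows and m columns, the initial call is quad_search(t, target, 0, n - 1, 0, m - 1).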

Solution 2: Binary Partition

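Scan a middle row, column, or diagonal for the target element, looking for where the target should be.
If it is not found, split the region into four sub-regions at the position where it would have been.
Eliminate 2 of the 4 sub-regions and recurse on the other two.

If the scan is linear:

$T(n) = 2T(n/2) + cn$

By the master theorem ($a = 2$, $b = 2$, and $f(n) \in \Theta(n^1)$, so $a = b^1$),

$T(n) = \Theta(n \log n)$

If the scan is cheap enough that the recursive calls dominate (for example, a binary search of the middle row):

$T(n) = 2T(n/2)$

$T(n) = \Theta(n)$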
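The binary partition can be sketched as follows. This version scans the middle row with std::upper_bound, which is one possible choice of scan; the name binary_partition, the Table alias, and the inclusive bounds are assumptions for this example.

```cpp
#include <algorithm>
#include <vector>

using Table = std::vector<std::vector<int>>;

// Searches rows [r0, r1] and columns [c0, c1] (inclusive bounds) of a
// table sorted along rows and columns. Two recursive calls per level.
bool binary_partition(const Table& t, int target,
                      int r0, int r1, int c0, int c1) {
    if (r0 > r1 || c0 > c1)
        return false;                                  // empty region
    int mr = r0 + (r1 - r0) / 2;
    // Find the first element of the middle row that is greater than target.
    auto first = t[mr].begin() + c0;
    auto last = t[mr].begin() + c1 + 1;
    auto it = std::upper_bound(first, last, target);
    if (it != first && *(it - 1) == target)
        return true;                                   // found in middle row
    int s = static_cast<int>(it - t[mr].begin());      // split column
    // Rows <= mr, columns < s are all too small; rows >= mr, columns >= s
    // are all too big. Only two sub-regions remain.
    return binary_partition(t, target, r0, mr - 1, s, c1)       // upper right
        || binary_partition(t, target, mr + 1, r1, c0, s - 1);  // lower left
}
```

The split column s may fall at either edge of the region, in which case one of the two remaining sub-regions is empty and returns immediately.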

Solution 3: Stepwise Linear Search

Start from the top-right corner, and only move down or left: move down when the number at the current index is less than the target, and left when it is greater. Each step discards one row or one column, so at most $2N$ steps are needed.

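const int N_Max = 100; // assumed upper bound on the matrix dimension

// Searches an N x N matrix sorted along rows and columns.
// On success, row and col hold the position of target.
bool stepwise(int mat[][N_Max], int N, int target, int &row, int &col) {
    // Reject empty input and targets outside the range of stored values.
    if (N <= 0 || target < mat[0][0] || target > mat[N-1][N-1]) {
        return false;
    }
    row = 0;
    col = N - 1;

    while (row < N && col >= 0) {
        if (mat[row][col] < target) {
            row++; // everything left of here in this row is too small
        } else if (mat[row][col] > target) {
            col--; // everything below here in this column is too big
        } else {
            return true;
        }
    }

    return false;
}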

Linear Recurrences

Linear recurrences are sequences of the form:

$F_n = c_1 F_{n-1} + c_2 F_{n-2}$

These appear frequently in many contexts:

Stock-trading strategies
Nice-looking architectural proportions
Nature
Interview questions

These can be calculated recursively (which ends up being terrible), linearly, and with some nifty tricks.
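The linear-time approach can be sketched as a simple loop that keeps only the last two terms. The function name linear_recurrence and the seed parameters f0 and f1 are assumptions for this illustration, since the notes do not fix initial values.

```cpp
// Computes F_n for F_n = c1*F_{n-1} + c2*F_{n-2} in O(n) time and O(1)
// extra memory, given the seed values F_0 = f0 and F_1 = f1.
long long linear_recurrence(int n, long long c1, long long c2,
                            long long f0, long long f1) {
    if (n == 0)
        return f0;
    long long prev = f0;
    long long curr = f1;
    for (int i = 2; i <= n; ++i) {
        long long next = c1 * curr + c2 * prev;
        prev = curr;
        curr = next;
    }
    return curr;
}
```

With $c_1 = c_2 = 1$, $F_0 = 0$, and $F_1 = 1$ this produces the Fibonacci numbers, so linear_recurrence(10, 1, 1, 0, 1) returns 55. The naive recursive version recomputes the same terms many times and takes exponential time for this case.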