Data Structures and Algorithms

Sorting Methods

Computational tasks and solutions

Sorting arranges the records of a sequence in order of their keys, where the order is defined by a comparison operator or a functor.
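As a minimal sketch of sorting records by key, the example below passes a comparison functor to std::sort. The Record type, its fields, the ByKey functor, and sortByKey() are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record type: `key` is what we sort on, `name` is the payload.
struct Record {
    int key;
    std::string name;
};

// Functor that orders records by ascending key.
struct ByKey {
    bool operator()(const Record &a, const Record &b) const {
        return a.key < b.key;
    }
};

// Sorts the records in place by key, using the functor above.
void sortByKey(std::vector<Record> &records) {
    std::sort(records.begin(), records.end(), ByKey());
}
```

Swapping ByKey for a different functor changes the order without touching the sorting code.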

Simple ≠ Useless

  • Simple sorting algorithms give us useful illustrations of how to approach a problem
  • Easy to experiment with
  • Easy to compare multiple solutions
  • Easy to implement
  • Sometimes "good enough"

Accessing Containers

Sorting code reaches its elements in one of two ways: through iterators, as the STL does for containers, or through indices into an array.
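A short sketch of the two access styles follows. The function names sortContainer() and sortArrayRange() are made up for this example; the second one assumes a half-open range [left, right).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Iterator style: sort a whole container through its begin/end iterators.
void sortContainer(std::vector<int> &v) {
    std::sort(v.begin(), v.end());
}

// Index style: sort the half-open range [left, right) of a plain array.
// Pointers into an array act as random-access iterators.
void sortArrayRange(int a[], std::size_t left, std::size_t right) {
    std::sort(a + left, a + right);
}
```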

How Size Affects Sorting

  • Internal sort: the data is small enough to fit in memory
    • Constant-time random access
  • Indirect sort: for large items that are expensive to move, when the indices still fit in memory
    • Reorder indices rather than items
  • External sort: the data to be sorted cannot fit in memory
    • Items to be sorted are on disk
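The indirect approach can be sketched as follows: sort a vector of indices and leave the items where they are. The helper name sortedIndices() is hypothetical, and strings stand in for "large items".

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns indices into `items`, ordered so that items[result[0]] is the smallest.
// The items themselves are never moved or copied.
std::vector<std::size_t> sortedIndices(const std::vector<std::string> &items) {
    std::vector<std::size_t> idx(items.size());
    std::iota(idx.begin(), idx.end(), 0);  // fill with 0, 1, 2, ...
    std::sort(idx.begin(), idx.end(),
              [&items](std::size_t a, std::size_t b) { return items[a] < items[b]; });
    return idx;
}
```

Only the small index values are swapped during the sort, which is the point when each item is costly to copy.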

Building Blocks

operator<   // compare two items
operator[]  // access k-th element
swap()      // swap A and B
compswap()  // compare two items, and if B is smaller call swap()
mySort()    // algorithm/implementation
main()      // testing it out

swap() implementation

Custom implementation:

template <typename T>
void swap(T &a, T &b) {
    T tmp = a;
    a = b;
    b = tmp;
}

Don't do this! Just #include <utility> and call std::swap. Since C++11, std::swap moves its arguments instead of copying them, which is much faster for some data types; the copy-based implementation above is the worst case.

template <typename T>
void compswap(T &a, T &b) {
    if (b < a) {
        swap(a, b);
    }
}
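For comparison, here is a sketch of the same compare-and-swap helper built on the standard swap. The name compswapStd() is made up to avoid clashing with the version above.

```cpp
#include <utility>

// Compare-and-swap: after the call, a is not greater than b.
// `using std::swap` lets a type-specific swap be found by argument-dependent
// lookup, and falls back to std::swap otherwise.
template <typename T>
void compswapStd(T &a, T &b) {
    using std::swap;
    if (b < a) {
        swap(a, b);
    }
}
```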

Desirable Quality of Algorithms

Stability: a stable sort preserves the relative order of items with equal keys. Simple sorts tend to be stable, while many of the more complex sorts are not.
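To make stability concrete, the sketch below sorts (key, label) pairs by key only, using std::stable_sort. The Entry alias and stableSortByKey() are hypothetical names for this example.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Each entry is (key, label). Only the key takes part in the comparison.
using Entry = std::pair<int, std::string>;

// Sorts by key; entries with equal keys keep their original relative order.
void stableSortByKey(std::vector<Entry> &entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry &a, const Entry &b) { return a.first < b.first; });
}
```

Given (2,x), (1,y), (2,z), (1,w), the result is (1,y), (1,w), (2,x), (2,z): within each key, the labels stay in input order.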

Non-adaptive sort: The sequence of operations is independent of the order of the data. The algorithm does the same amount of work on 10 items whether only 1 is out of place or all of them are.

Adaptive sort: The sequence of operations depends on the order of the data, so the algorithm does less work when the input is mostly sorted. Worst case complexity is not equal to best case complexity.

Bubble Sort

void bubble(item a[], int left, int right) {
    for (int i = left; i < right; i++)
        for (int j = right - 1; j > i; j--)
            compswap(a[j - 1], a[j]);
}

On the first pass, finds the minimum item in the list by comparing adjacent items, and moves it all the way to the left.

How to make it adaptive?

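You just need some more information: track whether a pass made any swaps. If a pass made none, the array is already sorted and the loop can stop.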

void bubble_adaptive(item a[], int left, int right) {
    for (int i = left; i < right; i++) {
        bool swapped = false;
        for (int j = right - 1; j > i; j--) {
            if (a[j] < a[j - 1]) {          // adjacent pair out of order
                swapped = true;
                swap(a[j - 1], a[j]);
            }
        }
        if (!swapped) {                     // no swaps: already sorted
            break;
        }
    }
}

Bubble Sort Analysis

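Non-adaptive

  • Around \(n^2/2\) comparisons in all cases
  • Up to around \(n^2/2\) swaps (worst case); none if the input is already sorted

Adaptive

  • Around \(n\) comparisons in the best case, \(n^2/2\) in the average and worst cases
  • No swaps in the best case, around \(n^2/2\) in the worst case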
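These counts can be measured directly. The sketch below is an instrumented adaptive bubble sort over a whole std::vector<int>; the Counts struct and bubbleCount() are hypothetical names added for the experiment.

```cpp
#include <utility>
#include <vector>

// Hypothetical counters for one run of the sort.
struct Counts {
    long comparisons = 0;
    long swaps = 0;
};

// Adaptive bubble sort on the whole vector; stops after a pass with no swaps.
Counts bubbleCount(std::vector<int> &a) {
    Counts c;
    const int n = static_cast<int>(a.size());
    for (int i = 0; i < n; i++) {
        bool swapped = false;
        for (int j = n - 1; j > i; j--) {
            c.comparisons++;
            if (a[j] < a[j - 1]) {
                std::swap(a[j - 1], a[j]);
                c.swaps++;
                swapped = true;
            }
        }
        if (!swapped) break;  // a pass without swaps means the data is sorted
    }
    return c;
}
```

On the already sorted input 1 2 3 4 5 this performs 4 comparisons and 0 swaps; on the reversed input 5 4 3 2 1 it performs 10 comparisons and 10 swaps.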

Why Bubble Sort?

  • Simple to implement and understand
  • Completes some "pre-sorting" while searching for the smallest key
  • Adaptive version may finish quickly if the input array is almost sorted

Selection Sort

void selection(item a[], int left, int right) {
    for (int i = left; i < right; i++) {
        int min = i;
        for (int j = i + 1; j < right; j++) {
            if (a[j] < a[min]) {
                min = j;
            }
        }
        swap(a[i], a[min]);
    }
}

Complexity

  • Around \(n^2/2\) comparisons in every case
  • \(n - 1\) swaps in the best, average, and worst cases
  • Non-adaptive: runtime is insensitive to input

Making it adaptive

void selection(item a[], int left, int right) {
    for (int i = left; i < right; i++) {
        int min = i;
        for (int j = i + 1; j < right; j++) {
            if (a[j] < a[min]) {
                min = j;
            }
        }
        if (i != min) swap(a[i], a[min]); // changed
    }
}

Complexity of Adaptive

  • Same as above, but \(0\) swaps in the best case (input already sorted)
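A small sketch for checking the swap count of the version that skips self-swaps. It works on a whole std::vector<int>, and the name selectionSwapCount() is made up for this example.

```cpp
#include <utility>
#include <vector>

// Selection sort that skips the swap when the minimum is already in place.
// Returns the number of swaps actually performed.
int selectionSwapCount(std::vector<int> &a) {
    const int n = static_cast<int>(a.size());
    int swaps = 0;
    for (int i = 0; i < n; i++) {
        int minIdx = i;
        for (int j = i + 1; j < n; j++) {
            if (a[j] < a[minIdx]) {
                minIdx = j;
            }
        }
        if (i != minIdx) {
            std::swap(a[i], a[minIdx]);
            swaps++;
        }
    }
    return swaps;
}
```

A sorted input needs no swaps, while an input such as 2 3 4 5 1 needs the full n - 1 = 4.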

Why Selection Sort?

  • Minimal copying of items
  • In practice, hard to beat for small arrays

Why not?

  • Minimal opportunities for tuning

Insertion Sort

Typically much faster than bubble sort and selection sort, especially when the input is already partly sorted.

void insertion(item a[], int left, int right) {
    for (int i = left + 1; i < right; i++)
        for (int j = i; j > left; j--)
            compswap(a[j - 1], a[j]);
}

Faster implementation:

void insertion(Item a[], int left, int right) {
    for (int i = left + 1; i < right; i++) {
        Item v = a[i];                  // item to insert
        int j = i;
        for (; j > left; j--) {
            if (v < a[j - 1])
                a[j] = a[j - 1];        // shift larger item right
            else
                break;                  // found the spot for v
        }
        a[j] = v;
    }
}

Adaptive:

void insertion(Item a[], int left, int right) {
    for (int i = right - 1; i > left; i--)  // find min item
        compswap(a[i - 1], a[i]);           // put in a[left]
                                            // this is sentinel
    for (int i = left + 2; i < right; i++) {
        Item v = a[i];                      // v is new item
        int j = i;
        while (j > left && v < a[j - 1]) {  // v in wrong spot
            a[j] = a[j - 1];                // half swap (move)
            j--;
        }
        a[j] = v;
    }
}
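One way to "test it out" is to compare the result against std::sort. The sketch below assumes int items in a std::vector and a half-open range [left, right); insertionMove() and agreesWithStdSort() are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// Insertion sort on the half-open range [left, right), using half swaps (moves).
void insertionMove(std::vector<int> &a, int left, int right) {
    for (int i = left + 1; i < right; i++) {
        int v = a[i];                         // item to insert
        int j = i;
        while (j > left && v < a[j - 1]) {
            a[j] = a[j - 1];                  // shift larger item right
            j--;
        }
        a[j] = v;
    }
}

// Returns true if insertionMove gives the same result as std::sort.
bool agreesWithStdSort(std::vector<int> input) {
    std::vector<int> expected = input;
    std::sort(expected.begin(), expected.end());
    insertionMove(input, 0, static_cast<int>(input.size()));
    return input == expected;
}
```

Checking empty, single-item, reversed, and duplicate-heavy inputs this way catches most off-by-one mistakes in the loop bounds.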