Understanding Mergesort

Mergesort is a sorting algorithm that is interesting to compare with quicksort.

The basic structure of quicksort was this:

The basic structure of mergesort is this:

From this description you can see already what quicksort does different: by first looking at the data and using partial sorting information gained from looking at the input data it can perform better - in some cases - than mergesort can.

Mergesort has one additional drawback: it needs additional memory to store the data. Let's look at two slightly different implementations of Mergesort in Python, and then check the performance compared with Quicksort.

The Unsurprising Implementation

The first implementation works "without surprises", meaning it doesn't use python-specific list manipulation functions. The merge process looks difficult, but it really isn't:

This is the code:

def merge(A, start, mid, stop):

    # create temporary array
    width = (stop+1-start)
    temp = [None] * width

    index_in_lower_half = start
    index_in_upper_half = mid+1
    write_index = 0

    # insert the contents of the already sorted array halfs into the temporary array
    while (index_in_lower_half <= mid) and (index_in_upper_half <= stop):

        a = A[index_in_lower_half]
        b = A[index_in_upper_half]

        if a < b:
            temp[write_index] = a
            index_in_lower_half += 1
        else:
            temp[write_index] = b
            index_in_upper_half += 1

        write_index += 1

    # insert any trailing elements from the lower halg
    if index_in_lower_half <= mid:

        while index_in_lower_half <= mid:
            temp[write_index] = A[index_in_lower_half]
            index_in_lower_half += 1
            write_index += 1

    # insert any trailing elements from the upper half
    if index_in_upper_half <= stop:

        while index_in_upper_half <= stop:
            temp[write_index] = A[index_in_upper_half]
            index_in_upper_half += 1
            write_index += 1

    # write back temp array to the original input array
    write_index -= 1
    while write_index >= 0:
        A[start+write_index] = temp[write_index]
        write_index -= 1

def mergesort_range1(A, start, stop):
    if start < stop:
        mid = start + (stop-start)/2
        mergesort_range1(A, start, mid)
        mergesort_range1(A, mid+1, stop)
        merge1(A, start, mid, stop)

def mergesort1(A):
    if not A:
        return

    mergesort_range1(A, 0, len(A)-1)
    return A

The surprising implementation

Python has lots of builtin list manipulation functions. In Python, allocating temporary lists from builtin functions can be faster than fine-tuned code that doesn't allocate additional memory! The following example is a more pythonesque way of doing mergesort, where the merge-part creates a list dynamically, on the fly.

def mergesort2b(array):
    size = len(array)
    if size < 2:
        return array
    else:
        middle = size / 2
        left = mergesort2(array[:middle])
        right = mergesort2(array[middle:])

        result = []
        i, j = 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                result.append(left[i])
                i += 1
            else:
                result.append(right[j])
                j += 1
        result += left[i:]
        result += right[j:]
        return result 

This code has the same basic structure, but it doesn't pre-allocate an array of n items, instead it appends the results as it goes along, thereby skipping the need for a write index. And, surprisingly, it turns out to be faster than the code above. There is a further trick that can be applied: list.pop() is typically the fastest way to get an element from the end of a list, and reduce the list size. The only problem is: we're taking items from the front of the two halfs. Well, we can change that: by first reversing the two lists to be merged, we can then compare and pop items from the end of those two lists, and finally reverse them back for the rest. This is how the code looks like, and it was the fastest one I could come up with in Python:

def mergesort2(array):
    size = len(array)
    if size < 2:
        return array

    middle = size / 2
    left = mergesort2(array[:middle])
    right = mergesort2(array[middle:])

    # reverse both lists, so that we can take items from the back of the list
    left.reverse()
    right.reverse()

    result = []

    while left and right:

        # note: no read indices, and we're popping items from the back of the list
        if left[-1] <= right[-1]:
            result.append(left.pop())
        else:
            result.append(right.pop())

    # reverse lists back, so that they can be added to the result list
    left.reverse()
    right.reverse()
    if left:
        result.extend(left)
    if right:
        result.extend(right)
    return result 

But how does it perform compared with Quicksort? Well, actually disappointingly so. For the standard test of 1 million floats, these were the results:

5.83 5.69 5.69 5.68 5.71
Method Run 1 Run 2 Run 3 Run 4 Run 5
Quicksort Hetland
Quicksort Pyalda 6.386.316.306.376.33
Mergesort8.108.088.088.098.01

Looking at this, you may wonder why bother with Mergesort at all? Well, do a simple example: try sorting an already sorted list of 1 million floats. You will find that while mergesort runs as expected (actually, a bit faster), both Quicksort implementations will break and throw a RuntimeError: maximum recursion depth exceeded in cmp, because their recursion depth will exceed the maximum. You cannot sort a million floats in (our recursive implementation of) quicksort, because it'll essentially require a million-deep stack, and unless you are really rich your machine won't support that.
This is what was mentioned at the top of this page: mergesort is boring, but reliable. quicksort is a genius, but sometimes it'll disappoint you badly.

And now, for a real surprise, turn to Heapsort next...