Understanding Heapsort

Heapsort follows a very simple idea:

Subsequently, a heapsort implementation typically is trivial. The following code uses the MinHeap class developed in the heaps article:

def heapsort(A):
    h = MinHeap()

    # push-em-all
    for item in A:
        h.Insert(item)

    # pop-em-all
    for i in range(len(A)):
        A[i] = h.ExtractMin()

The same code, using the heapq module from the Python Standard Library:

def heapsort(A):
    h = []

    # push-em-all
    for value in A:
        heappush(h, value)

    # pop-em-all
    for i in range(len(A)):
        A[i] = heappop(h)

The real fun starts if you compare this to the quicksort implementations developed in the quicksort article. It turns out that heapsort is dramatically faster than quicksort, in Python. Probably because of a combination of simplicity of algorithm and lack of recursion (and Python doesn't perform particularily well on recursion).

Method Run 1 Run 2 Run 3 Run 4 Run 5
Quicksort Hetland 6.10 6.22 6.39 6.47 6.68
Quicksort Pyalda 6.596.916.866.876.98
Heapsort with heapq3.263.163.143.173.08

At the end of mergesort we pointed out a situation where both quicksort implementations break badly: if asked to sort an already sorted array. Well, here are the results of comparing 1 million sorted float numbers in both heapsort and mergesort (quicksort won't sort either, because it'll exceed its maximum recursion depth):

Method Run 1 Run 2 Run 3 Run 4 Run 5
Heapsort with heapq 2.19 2.26 2.64 2.19 2.19
Mergesort with reverse()5.365.435.385.445.36

I, for one, welcome our new sorting overlord!