Understanding Quicksort

This article is a little bit different. The other articles on this page typically work like this:

  1. Describe the problem
  2. Set up a test environment
  3. Work out a couple of solutions and compare each

Here I want to show you a different approach to Quicksort, and some intricacies of doing Quicksort with Python.

Revisiting the partitioning problem

If you haven't read the article on partitioning, please do so now. Go ahead, I'll wait.

OK, let's revisit the function to partition an array in place.

def partition_inplace_value(A, start, stop, pel_value):

    # enumerate each item
    read_index = start
    while read_index <= stop:
        item = A[read_index]

        # if the current item is less than the pivot value
        if item < pel_value:

            # swap it into the insertion position
            A[start], A[read_index] = A[read_index], A[start]

            # and move the insertion position one up
            start += 1

        read_index += 1

    return start

What we are doing here is splitting the array into two halves: everything smaller than the pivot value ends up in the lower half, everything else in the upper half.
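
A quick sanity check makes this concrete (the sample values are made up, of course):

A = [7, 2, 9, 1, 8, 3]
split = partition_inplace_value(A, 0, len(A) - 1, 5)
print(A[:split], A[split:])     # prints [2, 1, 3] [7, 8, 9]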

In a way, this is already some order imposed on the array, right? The basic idea of quicksort is this: using divide and conquer, impose order on subarrays around pivots. By subsequently making those subarrays smaller, we'll end up with arrays that are sorted. Example: split the array into two halves, then split each half again around its own pivot, and so on.

Does that make sense? It sure made a lot of sense to me when I first realized it. It also explains why the literature often treats Quicksort as a randomized algorithm: the partitioning splits the array into two halves, but you cannot say in advance how long each half will be. So the two recursive calls (for the lower and upper half) will have to work on different sizes - random sizes, from the outside world's point of view.

OK, so can we start doing Quicksort with our partition function now? Not quite: there is a small but important detail we must change in this function - we must ensure that the pivot element starts the upper half. Why?

One easy-to-see reason is that it actually increases the order in the upper half. There are other, more subtle reasons (for one, the pivot ends up in its final sorted position, so each recursive call is guaranteed to shrink), but for now let's suspend disbelief and look at some good old-fashioned code:

def partition(A, start, stop):
    # see explanation in the text below
    pel_index = partition_policy(start, stop)
    
    # move pel to the rightmost position
    pel_value = A[pel_index]
    A[stop], A[pel_index] = pel_value, A[stop]
    
    # separate items < pel_value from items > pel_value
    insert_pos = start
    index = start
    while index < stop:
        if A[index] < pel_value:
            A[insert_pos], A[index] = A[index], A[insert_pos]
            insert_pos += 1
        index += 1
    
    # move pel into the position that splits the array in two halves
    A[insert_pos], A[index] = A[index], A[insert_pos]
    
    return insert_pos

Some notes on this function: the call to partition_policy decides which element becomes the pivot (we'll look at different policies below); moving the pivot to the rightmost position keeps it out of the way during the scan; and the final swap puts it into the position that splits the array in two halves, which is also the return value.
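
To see it in action, here is a quick check (using the simple rightmost-element policy that we will define in a moment):

partition_policy = lambda start, stop: stop     # the "simple" policy, see below

A = [7, 2, 9, 1, 8, 3]
p = partition(A, 0, len(A) - 1)
print(p, A)     # prints 2 [2, 1, 3, 7, 8, 9] - the pivot 3 sits at index 2, its final sorted position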

Now, given the partition function, the rest of quicksort is an anti-climax:

def quicksort_recursive(A, start, stop):
    if start < stop:
        pel_index = partition(A, start, stop)
        quicksort_recursive(A, start, pel_index-1)
        quicksort_recursive(A, pel_index+1, stop)    

def quicksort(A):
    quicksort_recursive(A, 0, len(A)-1)
    return A
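
And a quick smoke test with random data:

import random

data = [random.random() for _ in range(1000)]
assert quicksort(list(data)) == sorted(data)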

Policies for choosing a pivot element

Given our generalized approach with a partition policy, it is easy to implement and test different pivot choice strategies:

import random

def partition_policy_simple(start, stop):
    return stop

def partition_policy_median(start, stop):
    # note: in spite of the name, this picks the middle index, not the true median
    return (start + stop) // 2

def partition_policy_random_element(start, stop):
    return random.randint(start, stop)

partition_policy = partition_policy_random_element

Note that the literature recommends the random choice, so we're going to set that as the default.
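
A little experiment makes the reason tangible (this sketch is my own addition; it assumes Python 3.5+, where a blown stack raises RecursionError):

A = list(range(2000))       # already sorted: the worst case for the simple policy

partition_policy = partition_policy_simple
try:
    quicksort(list(A))
except RecursionError:
    print("simple policy: every split is maximally lopsided, recursion depth grows linearly")

partition_policy = partition_policy_random_element
print(quicksort(list(A)) == A)      # True - random pivots keep the expected depth logarithmic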

Alternative Python Implementations

If you google for Python Quicksort, you'll quickly find a reference implementation by Magnus Lie Hetland. The main body of the code is identical to ours (the code is so simple and straightforward, it is hard to see how it could be much different, anyhow). But it is instructive to compare that partition function with ours, because it uses a different approach: rather than visiting each element consecutively, it scans from both ends, switching search directions.
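
The idea looks roughly like this (my own sketch of such a two-pointer scheme, not Hetland's actual code):

def partition_two_pointer(A, start, stop):
    pel_value = A[stop]                 # pivot parked at the rightmost position, as before
    left, right = start, stop - 1
    while True:
        # scan from the left until we hit an item that belongs in the upper half
        while left <= right and A[left] < pel_value:
            left += 1
        # scan from the right until we hit an item that belongs in the lower half
        while left <= right and A[right] >= pel_value:
            right -= 1
        if left > right:
            break
        # both scans stopped: a single swap fixes two misplaced items at once
        A[left], A[right] = A[right], A[left]
    # put the pivot between the two halves
    A[left], A[stop] = A[stop], A[left]
    return left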

Another alternative that pops up on stackoverflow.com uses list comprehensions, somewhat like this:

def qsort_using_list_comprehension(L):
    if len(L) < 2: 
        return L
        
    pivot_element = random.choice(L)
    
    small  = [i for i in L if i<  pivot_element]
    medium = [i for i in L if i== pivot_element]
    large  = [i for i in L if i>  pivot_element]
    return qsort_using_list_comprehension(small) + medium + qsort_using_list_comprehension(large)

If you remember the partitioning article, the very first brute-force solution looked very similar. We should expect it not to perform as well as we would like, and it obviously allocates a lot more memory.

A simple reformulation of this idea uses the filter() function rather than list comprehension, like this:

def qsort_using_filter(L):
    if len(L) < 2: 
        return L
        
    pivot_element = random.choice(L)
    
    # list() is needed in Python 3, where filter() returns a lazy iterator
    small  = list(filter(lambda x: x <  pivot_element, L))
    medium = list(filter(lambda x: x == pivot_element, L))
    large  = list(filter(lambda x: x >  pivot_element, L))
    return qsort_using_filter(small) + medium + qsort_using_filter(large)

So, given our four candidates, how do they perform on an array with a million random floats? (All times are in seconds.)

                  quicksort hetland:  5.82
                   quicksort Pyalda:  7.51
           using list comprehension:  8.21
                       using filter: 13.39
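
For the record, numbers like these can be produced with a harness along the following lines (a minimal sketch of my own; the original measurement setup is not shown here):

import random
import time

def benchmark(sort_fn, n=1000000):
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    sort_fn(data)
    return time.perf_counter() - start

print(benchmark(quicksort))
print(benchmark(qsort_using_list_comprehension))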

OK, quite obviously the list comprehension and filter() variants are ruled out, both because they are too slow and because they take up too much memory. Can we improve the runtime of our function?

Well, there is a small improvement we can make that brings the numbers closer together: hardcode the pivot choice, which saves a function call (and a relatively expensive random.randint call) per partition:

def partition(A, start, stop):
    pel_index = stop    # was: partition_policy(start, stop)
    # ... the rest of the function stays unchanged

Here are 5 test runs, with one million random floats each:

Method              Run 1   Run 2   Run 3   Run 4   Run 5
Quicksort Hetland    5.83    5.69    5.69    5.68    5.71
Quicksort Pyalda     6.38    6.31    6.30    6.37    6.33

So this little change has narrowed the margin - but why is the other partition function still consistently better?

Well, if you study the code, there must be a benefit that outweighs even the fact that the function has a lot more lines of code (which, in interpreted Python, makes things slower than they should be). It turns out the main benefit of the Hetland solution is that it swaps less. It doesn't swap as soon as it finds one misplaced item; it swaps when it has found a good position for both sides of the swap pair. So it is a reference implementation for a reason :)
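
If you want to verify the swap-counting claim yourself, here is a rough sketch (the instrumented functions are my own additions, built on the two partition schemes shown above):

import random

def swaps_consecutive_scan(A):
    # the consecutive-scan partition from above, pivot at the rightmost position
    swaps = 0
    pel_value = A[-1]
    insert_pos = 0
    for index in range(len(A) - 1):
        if A[index] < pel_value:
            A[insert_pos], A[index] = A[index], A[insert_pos]
            insert_pos += 1
            swaps += 1
    return swaps + 1        # +1 for moving the pivot into place

def swaps_two_pointer(A):
    # the two-pointer partition sketched earlier, instrumented the same way
    swaps = 0
    pel_value = A[-1]
    left, right = 0, len(A) - 2
    while True:
        while left <= right and A[left] < pel_value:
            left += 1
        while left <= right and A[right] >= pel_value:
            right -= 1
        if left > right:
            break
        A[left], A[right] = A[right], A[left]
        swaps += 1
    return swaps + 1

data = [random.random() for _ in range(100000)]
print(swaps_consecutive_scan(list(data)))   # swaps once per item below the pivot
print(swaps_two_pointer(list(data)))        # typically far fewer swaps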