Linked Lists
Linked Lists are a staple data type. In C/C++, they are a quintessential means of keeping track of dynamically allocated objects in a space-efficient manner. And since linked-list related algorithms typically require lots of intricate pointer manipulation, they often do come up in programming interview questions.
Now, in Python, you have a builtin list
datatype, so you could ask: why bother?
Well, for one you'd want to be prepared for the same type of questions, and secondly
it is a nice exercise. So let's get cracking
Single Linked Lists vs. Double Linked Lists
Single Linked Lists operate with a single next
pointer, Double Linked Lists have both
a next
and a previous
. Since operations with a single next
pointer tend to be more complicated,
we're going to focus on single linked lists below.
Core Implementation
How do you implement a pointer in a programming language that ostensively doesn't have pointers? By finding out that of course Python has pointers, it just doesn't call them pointers and thus hopes to fool you into believing that it is any better than plain old C because of that. Actually, having learned programming in assembler I've never quite understood what the fuzz was about pointer algorithms, but well, it is a time honoured tradition to call pointer manipulation difficult.
So, how does a non-pointer pointer in Python look like? It is simply an object reference. That means, the quintessential Python Single Linked Node / List datatypes look like this:
class SingleLinkedNode(object):
def __init__(self, value, next = None):
self.value = value
self.next = next
class SingleLinkedList(object):
def __init__(self):
self.first = None
That wasn't too bad, was it? Now, let's get cracking by adding some functionality to it.
Inserting a node at the head
In a Single Linked List, you typically insert nodes at the head. The code borders on the trivial:
class SingleLinkedList(object):
...
def AddValueAtHead(self, value):
self.first = SingleLinkedNode(value, self.first)
Printing list objects
To be able to visualize a list, it makes sense to overwrite __repr__
. And the easiest way to visualize
a Single Linked List is to print it as a Python list
. So:
class SingleLinkedList(object):
...
def __repr__(self):
temp = []
node = self.first
while node:
temp.append(node.value)
node = node.next
return repr(temp)
Getting the length of a list
To find the length of the list, you must enumerate all items in the list, and keep track of the count as you go along.
class SingleLinkedList(object):
...
def GetLength(self):
size = 0
node = self.first
while node:
size += 1
node = node.next
return size
Accessing the N-th Element in a list
To access the n-th element in a Single Linked List, you basically must do the same: enumerate all items, keep track of the count = index as you go along, and return the list if the count is n.
class SingleLinkedList(object):
...
def GetNthNode(self, n):
if n == 0:
return self.first
if self.first:
m = 0
node = self.first
node = node.next
while node:
m += 1
if m == n:
return node
node = node.next
return None
Note that this code handles a couple of special cases:
- If you ask for the 0-th node, you'll get
self.first
even if the list is empty: because if it is empty,self.first
will beNone
. - If the list is empty and you ask for the n-th node with n > 0, then you'll always get
None
. - If the list has n items and you ask for the k-th node with k > n, then you'll always get
None
.
So far, everything has been pretty gentle. Now let's ask some more difficult questions.
Write a function that merges two sorted lists
This is problem 4.1 in the excellent Elements of Programming Interviews. The first question should be: how can I test this?
- We need to create two sorted lists of random elements
- We merge those two lists with our algorithm
- We verify that the combined list is still sorted
This translates to the following code:
#! -*- Encoding: Latin-1 -*-
def selftest():
SL1 = SingleLinkedList.CreateRandomSortedList(10)
SL2 = SingleLinkedList.CreateRandomSortedList(7)
SL1.MergeSortedList(SL2)
assert SL1.IsSortedAscending()
if __name__ == "__main__":
selftest()
Of course, this code leaves most of the work undone ;) There are three functions in this test code that we need to implement.
The easiest one is the function to create a list of random, ascending values: create a list of n random values, sort them, reverse them,
and then insert them into the list. You may ask why it is necessary to reverse the list: because the Single Linked List mostly uses the AddValueAtHead
method, not a AddValueAtTail
which would work without the need for reversing the input. Given all that, our code looks like this:
class SingleLinkedList(object):
...
@staticmethod
def CreateRandomSortedList(number_of_elements):
values = [random.randrange(1860) for k in range(number_of_elements)]
values.sort()
result = SingleLinkedList()
for value in reversed(values):
result.AddValueAtHead( value )
return result
Note that the method is decorated with @staticmethod
, so that you can call it to create a new instance of SingleLinkedList, rather
than the need to invoke it on an existing instance.
IsSortedAscending
is quite easy as well: enumerate all items in a list, and ensure that each consecutive item is >= the last seen item.
class SingleLinkedList(object):
...
def IsSortedAscending(self):
node = self.first
if node:
last_known_value = node.value
node = node.next
while node:
if node.value < last_known_value:
return False
last_known_value = node.value
node = node.next
return True
That leaves us with the function to actually merge two sorted lists. And I'm going to warn you: the code is quite ugly (and long, by Python standards). Let's try to see the main logic. Assume we want to merge the two lists A and B into A', both of which are sorted.
- special case: A is empty -> return A' = a clone of B
- special case: B is empty -> return A' = A
- for each B[i] < A[0]: insert B[i] before A[0]
- if there are nodes left in B (it could be that all B[k] < A[0]):
------ while A[i] < B[0]: move i one up
------ while B[k] <= A[i]: insert B[k] before A[i]
------ if no more A[k]: append remainder of B, done.
------ if no more B[k]: done
With all the special cases, you'll note that this algorithm has to take care of a lot of rough edges, making it quite long and ugly (I did warn you, didn't I?)
class SingleLinkedList(object):
...
def MergeSortedList(self, other_list):
# ensure that the algorithm preconditions are met
assert other_list.IsSortedAscending()
assert self.IsSortedAscending()
# special case 1: A is empty -> return A' = a clone of B
if self.first is None:
self.Clone(other_list)
# special case 1: B is empty -> return A' = A
elif other_list.first is None:
pass # nothing to see here, move along
else:
# for each B[i] < A[0]: insert B[i] before A[0]
this_insert_pos = None
this_node = self.first
other_node = other_list.first
while (other_node is not None) and (other_node.value <= this_node.value):
if this_insert_pos is None:
self.first = SingleLinkedNode(other_node.value, self.first)
this_insert_pos = self.first
else:
this_insert_pos.next = SingleLinkedNode(other_node.value, this_insert_pos.next)
this_insert_pos = this_insert_pos.next
other_node = other_node.next
if other_node is not None:
# now, start moving our position until we've reached the first node in our list that is higher
# than we want it to be
this_insert_pos = self.first
this_node = self.first.next
assert this_insert_pos.next == this_node
# if no more B[k]: done
while other_node is not None:
assert this_insert_pos.next == this_node
# while A[i] < B[0]: move i one up
while (this_node is not None) and (this_node.value < other_node.value):
this_node = this_node.next
this_insert_pos = this_insert_pos.next
assert this_insert_pos is not None
assert this_insert_pos.next == this_node
if this_node is None:
# if no more A[k]: append remainder of B, done.
assert this_insert_pos is not None
assert this_insert_pos.next is None
while other_node is not None:
this_insert_pos.next = SingleLinkedNode(other_node.value, this_insert_pos.next)
this_insert_pos = this_insert_pos.next
other_node = other_node.next
break
# while B[k] <= A[i]: insert B[k] before A[i]
while (other_node is not None) and (other_node.value <= this_node.value):
this_insert_pos.next = SingleLinkedNode(other_node.value, this_insert_pos.next)
other_node = other_node.next
this_insert_pos = this_insert_pos.next
Note: if you look at the solution in EPI, you'll find that it is one of the (really few) cases where the C++ code is actually cleaner than the Python code. This is so because the C++ code can indulge on direct pointer manipulation, whereas Python code cannot. So Goal 1:0 for C++ here.
Find out if two lists overlap, and where
This is problem 4.4 in EPI, and it is interesting because there is a trivial solution that uses memory, and another solution that is trivial to understand, but it took me a while to figure it out.. Did I tell you that Elements of Programming Interviews is really rather good? OK, as usual, our first question should be: how can I test this?
- We need to create two lists of random elements
- We need to overlap them at some random point
- We verify that the combined list is still sorted
This translates to the following code:
#! -*- Encoding: Latin-1 -*-
def selftest():
SL1 = SingleLinkedList.CreateRandomSortedList(10)
SL2 = SingleLinkedList.CreateRandomSortedList(7)
SL1.OverlapAtRandomNode(SL2)
assert SL1.IsOverlapped(SL2)
assert SL2.IsOverlapped(SL1)
if __name__ == "__main__":
selftest()
Note: I am reusing CreateRandomSortedList
from above. It is not necessary that the lists be sorted, I am just being lazy.
So we are missing two methods here: OverlapAtRandomNode
and IsOverlapped
. Let's first look at the logic for creating
a overlapped list out of two existing lists A and B:
- Move to the end of list B.
- Find a random node in A.
- Add the remainder of A to the end of B
In code:
class SingleLinkedList(object):
...
def OverlapAtRandomNode(self, other_list):
assert self.first # cannot work if this is an empty list
assert other_list.first # cannot work if the other list is empty
# Move to the end of list B
node = self.first
while node and node.next:
node = node.next
assert node.next is None
# Find a random node in A
N = other_list.GetLength()
K = random.randrange(N)
# Add the remainder of A to the end of B
node.next = other_list.GetNthNode(K)
OK, now for the fun part: how to find out if the two nodes A and B do overlap. The simple solution is to keep track
of nodes visited, for example by hashing their id
s. In Python, we can use the set
datatype for this. The logic
will look somewhat like this:
- Add the IDs of all items in B to a set
- Enumerate A: for each item, check if it is contained in the lookup set.
- If it is, we've found the overlap.
In code:
class SingleLinkedList(object):
...
def IsOverlapped(self, other_list):
# Add the IDs of all items in B to a set
h = set()
node = other_list.first
while node:
h.add(id(node))
node = node.next
# Enumerate A: for each item, check if it is contained in the lookup set.
node = self.first
while node:
# If it is, we've found the overlap.
if id(node) in h:
return node
node = node.next
Note: I have omitted an explicit return None
at the end of the list, because Python will add this implicitly.
Now, this was the trivial implementation - but it uses O(M) additional space, where M = len(B). Can we do it without additional memory? Think about what you know about each item in a list. You know:
- The item value
- If the item has a successor or not
- The value of the item successor
Given two nodes from two distinct lists, comparing either condition 1 or 3 would require comparing it to all other nodes in the other list. But condition 2 can be checked without comparing it to the other list. And if you think about it: when you have a merged list, the end node must be the same - by definition. So what you can do to find out if the two lists are merged is this:
- Find the last node in A, Alast.
- Find the last node in B, Blast.
- If Alast == Blast, the list is merged.
- Now, you only need to find exactly where the two lists merge
How do you solve that problem?
- List A has
n
items - List B has
m
items - The combined list has either
m
orn
items, and to be more precise: it hasmax(m,n)
items. - Assume for now that A is the longer list, so
m
<n
. How many items of B are in A, at max? Well, all of them. That means the first possible merge position if n-m.
This implies the following logic:- Set i = position in A = m-n
- Set k = position in B = 0
- while Ai is not Bk: increase both i and k
If you think about it, this algorithm is easy to understand, and probably faster because it doesn't need either memory allocation nor set lookup. The only drawback: the code is a bit longer.
class SingleLinkedList(object):
...
def IsOverlapped(self, B):
A = self
# find last node in A, element count in A
items_in_A = 0
last_in_A = self.first
while last_in_A:
items_in_A += 1
if not last_in_A.next:
break
last_in_A = last_in_A.next
# find last node in B, element count in B
items_in_B = 0
last_in_B = B.first
while last_in_B:
items_in_B += 1
if not last_in_B.next:
break
last_in_B = last_in_B.next
# last items are different - lists cannot be merged
if last_in_A != last_in_B:
return None
# if necessary, swap the lists, so that len(A) > len(B)
if items_in_B > items_in_A:
A, B = B, A
items_in_A, items_in_B = items_in_B, items_in_A
i = items_in_A - items_in_B
# move node_in_A to first possible insert position
node_in_A = A.first
while i > 0:
assert node_in_A is not None
node_in_A = node_in_A.next
i -= 1
node_in_B = B.first
while node_in_A != node_in_B:
assert node_in_A is not None
assert node_in_B is not None
# move both nodes one up:
node_in_A = node_in_A.next
node_in_B = node_in_B.next
return node_in_B
Zip a list
OK, one final problem from EPI: 4.11. You have a list: returned a zipped list. What this means is the following:
- Input L = [ L0, L1, ..., Ln ]
- Output L' = [ L0, Ln-1, L1, Ln-2... ]
And of course you should do this with constant additional memory. Test code is trivial: just create a list of random items, zip it, print both lists and visually compare them.
#! -*- Encoding: Latin-1 -*-
def selftest():
SL1 = SingleLinkedList.CreateRandomSortedList(20)
print "BEFORE: %r" % (SL1, )
SL1.ZipSelf()
print " AFTER: %r" % (SL1, )
if __name__ == "__main__":
selftest()
So let's think about the algorithm:
- The zipped list must have the same number of items as the input list
- Given that the list has
n
items, we have to removen/2
items from the end and interleave them at the front. - This translates into the following:
- Find
n
by enumerating all items and counting them - Start again, and move to
n/2
. - Create a temporary list that contains all items from
n/2
up ton
, in reversed order. This is actually trivial, since consecutive insertion at the front will result in a reversed list. - You now have two lists:
L'
, which has onlyn/2
items, andtemp
, which also hasn/2
items (well, maybe one less or so). - Now you can enumerate both lists and remove items from the top of
temp
, inserting them inL'
.
- Find
In python code:
class SingleLinkedList(object):
...
def ZipSelf(self):
# find n
n = self.GetLength()
# find the n/2th node
mid = n/2-1
m = self.GetNthNode(mid)
# split at m, and create a reversed list in temp
temp = SingleLinkedList()
k = m.next
m.next = None
while k:
next_k = k.next
k.next = temp.first
temp.first = k
k = next_k
# merge the two lists, interleaving items as we go along
node = self.first
insert = True
while node:
if insert:
k = node.next
assert temp.first is not None
node.next = temp.first
temp.first = temp.first.next
node.next.next = k
insert = False
else:
insert = True
# first node in this list is kept as-is
if not node.next:
# last node in this list reached: insert remainder of other list (if exists)
node.next = temp.first
break
# skip this node
node = node.next