Lab 5: Merge Sort And Brute Force Kendall Tau (6 Points)
Chris Tralie
Overview / Logistics
The purpose of this lab is to get you practice with sorting algorithms and recursion. It will also warm you up for the next assignment on fair voting and rank aggregation
Click here to download the starter code for this lab. When you are finished, upload your sort.py
and kendalltau.py
files to canvas.
Part 1: Merge Sort (3 Points)
Fill in the mergesort_rec
and merge
methods in sort.py
to complete a merge sort implementation Click here to review how this works. A recursive call to merge sort puts the elements between indices i1
and i2
in order by breaking this range into half, sorting each half individually, and merging the sorted versions together. Pictorially, the sorted region looks like this:
The recursive pseudocode for merge sort is as follows:
- mid = (i1 + i2) // 2
- recursively sort [i1, mid)
- recursively sort [mid, i2]
- Merge these two together
You will also have to consider a base case. Then, if you implement the merge method and put this all together, you should end up with a sorted array.
Part 2: Brute Force Kendall Tau Distance (3 Points)
Read the notes here on the Kendall-Tau distance. On the next homework, you will implement a clever algorithm that takes only O(N log N) time to compute this, but for now, you will implement a simple algorithm that takes O(N2) time, but which follows directly from the definition of Kendall tau.
To see how the algorithm works, let's consider the following two permutations:
0, 4, 3, 1, 2
1, 4, 2, 3, 0
The first thing we should do is convert these into rankings, or arrays that record the index that each element shows up in.. If, ranking[i] > ranking[j]
, that means that i
comes after j
. So, for example, in the first permutation, element 4 has a ranking of 1, and element 2 has a ranking of 4, so we see that element 4 actually comes before and beats element 1. All of the rankings for each permutation look as follows:
0 | 1 | 2 | 3 | 4 |
0 | 3 | 4 | 2 | 1 |
4 | 0 | 2 | 3 | 1 |
Now we just need to check every pair of rankings.
Number 1 | Number 2 | Rank #1 > Rank #2 | Rank #1 > Rank #2 | Disagreement? |
0 | 1 | No | Yes | * |
0 | 2 | No | Yes | * |
0 | 3 | No | Yes | * |
0 | 4 | No | Yes | * |
1 | 2 | No | No | |
1 | 3 | Yes | No | * |
1 | 4 | Yes | No | * |
2 | 3 | Yes | No | * |
2 | 4 | Yes | Yes | |
3 | 4 | Yes | Yes |
For a total of 7 discordant pairs if we count the stars where they disagree