Sorting Algorithms and Decorators in Python

  • |
  • 01 September 2022
Post image

So I needed to practice using decorators in Python. I tried to come up with something within my current knowledge and since I was studying sorting algorithms, I decided to use decorators to measure how long each algorithm would take to complete and compare them. It is kind of obvious which one is better in the proposed scenario, but the purpose was learning and improving coding skills.

Decorators can be very useful when you need to add functionalities to an existing function/method/class, without having to change its code. Let’s say you need to add an additional check, time the function or log but you are using a module instead of writing your own function, or you want to reuse the code more efficiently, then using decorators might be the way to go.

I found a good post at https://kleiber.me/blog/2017/08/10/tutorial-decorator-primer/ , where the author uses the same example I used (comparing sorting algorithms), but he used the pygorithm.sorting module, while I used my own functions.

decorator_tutorial_code.png

There are of course other sorting algorithms, some very known on the Python community such as Timsort, very well explained here. It is not my intention to explore every algorithm, but instead to tinker with a few for practicing.

1. Timing Function

The function we are gonna use as a decorator simply wraps the original function and measures the time before and after it runs, and then prints the elapsed time.

import time
import random

def timing(f):
    def wrap(*args, **kwargs):
        starttime = time.time()
        ret = f(*args, **kwargs)
        finishtime = time.time()
        print('function [{:s}] finished in {:.3f} ms'.format(f.__name__, (finishtime-starttime)*1000.0))
        return ret
    return wrap

2. Quick Sort

The quick sort uses divide and conquer to gain the same advantages as the merge sort, while not using additional storage. As a trade-off, however, it is possible that the list may not be divided in half. When this happens, we will see that performance is diminished.

A quick sort first selects a value, which is called the pivot value. Although there are many different ways to choose the pivot value, we will simply use the first item in the list. The role of the pivot value is to assist with splitting the list. The actual position where the pivot value belongs in the final sorted list, commonly called the split point, will be used to divide the list for subsequent calls to the quick sort.

@timing
def quickSort(L):
   quickSortHelper(L, 0, len(L) - 1)


def quickSortHelper(L, first, last):
   if first < last:
       splitpoint = partition(L, first, last)
       quickSortHelper(L, first, splitpoint - 1)
       quickSortHelper(L, splitpoint + 1, last)


def partition(L, first, last):
   pivotvalue = L[first]
   leftmark = first + 1
   rightmark = last
   done = False

   while not done:
       while leftmark <= rightmark and L[leftmark] <= pivotvalue:
           leftmark = leftmark + 1

       while L[rightmark] >= pivotvalue and rightmark >= leftmark:
           rightmark = rightmark -1

       if rightmark < leftmark:
           done = True
       else:
           temp = L[leftmark]
           L[leftmark] = L[rightmark]
           L[rightmark] = temp

   temp = L[first]
   L[first] = L[rightmark]
   L[rightmark] = temp

   return rightmark

3. Merge Sort

Merge Sort is a Divide and Conquer algorithm. It divides input array in two halves, calls itself for the two halves and then merges the two sorted halves. It works recursevly to continuously divide the (sub)list and then sort it back when it is ordered - when it reaches length=1. At this point the elements are combined back sorted, resulting in a final single sorted list.

Since this algorithm is called recursevly, I had to change the way the timing funcion was called. When using it as a decorator, it returned the runtime for every call of the funcion, which was not what I wanted. So instead I simulated the decorator by calling it as timing(mergeSort)(mergelist), and it then worked as I intended.

#@timing
def mergeSort(L):
    if len(L) > 1:
        # splits the list
        mid = len(L) // 2
        left = L[:mid]
        right = L[mid:]
        # recursevly runs on the splitted lists
        mergeSort(left)
        mergeSort(right)
        i = j = k = 0
        while i < len(left) and j < len(right):
            if left[i] < right[j]:
                L[k] = left[i]
                i += 1
            else:
                L[k] = right[j]
                j += 1
            k += 1
        while i < len(left):
            L[k] = left[i]
            i += 1
            k += 1
        while j < len(right):
            L[k] = right[j]
            j += 1
            k += 1

4. Selection Sort

Selection sort repeatedly finds the minimum element in a list and moves that element to a particular end of the list. The sort keeps going until the array is sorted in order. You can also instruct a selection sort to find the maximum element. Both approaches sort a list.

Selection sorts assume that the first element in a list is the smallest value. The sort will then compare that value with the second element. If the second element is smaller than the minimum value, the second element becomes the minimum value.

This process is repeated until the last element of the list is reached. Once this element is reached, the minimum value is placed at the start of the unsorted list.

It is important to notice that the function mergeSort requires extra space to store both halves as they are splitted during the division process. That additional space can be a critical factor if the list is too big and can make sorting problematic for big sets of data.

@timing
def selectionSort(L):
    for i in range(0, len(L)):
        min_i = i
        for right in range(i + 1, len(L)):
            # selects the smallest element
            if L[right] < L[min_i]:
                min_i = right
        # swaps (sorts) the elements 
        L[i], L[min_i] = L[min_i], L[i]

5. Bubble Sort

Bubble Sort is the simplest sorting algorithm of the three. It works by scanning the list multiple times and repeatedly swapping the adjacent elements if they are in the wrong order, until the list is fully sorted. This sorting algorithm is often considered the most inefficient one, since it has to perform item swaps without knowing its final destination. Those “unecessary” swaps are costly resource and time-wise.

@timing
def bubbleSort(L):
    # starts at the end of the list
    elem = len(L) - 1
    issorted = False
    while not issorted:
        issorted = True
        # swipes the list
        for i in range(elem):
            if L[i] > L[i + 1]:
                # swaps the adjacent elements
                L[i], L[i + 1] = L[i + 1], L[i]
                issorted = False

6. Short Bubble Sort

A bubble sort can be modified to stop early if it finds that the array has become sorted. This means that for arrays that require just a few passes, a bubble sort may have an advantage in that it will recognize the sorted array and stop.

@timing
def shortBubbleSort(L):
    # starts at the end of the list
    elem = len(L) - 1
    issorted = False
    while not issorted and elem > 0:
       issorted = True
       # swipes the list
       for i in range(elem):
           if L[i] > L[i + 1]:
               L[i], L[i + 1] = L[i + 1], L[i]
               issorted = False
       elem = elem - 1

7. Main

Here we gererate a random list, and then call them sequentially with copies of the original list:

def main():
    randomList = random.sample(range(5000), 5000)

    quickSort(randomList.copy())
    timing(mergeSort)(randomList.copy())
    selectionSort(randomList.copy())
    bubbleSort(randomList.copy())
    shortBubbleSort(randomList.copy())

if __name__ == '__main__':
    main()

If we run the code we get the results below:

function [mergeSort] finished in 14.999 ms
function [selectionSort] finished in 810.579 ms
function [bubbleSort] finished in 2917.825 ms
function [shortBubbleSort] finished in 2040.594 ms

You can find the full code on my github page (link on the footer).

8. References:

https://stackoverflow.com/questions/5478351/python-time-measure-function https://stackoverflow.com/questions/10757871/decorating-recursive-functions-in-python https://cs.berea.edu/cppds/Sort/TheMergeSort.html https://cs.berea.edu/cppds/Sort/TheSelectionSort.html https://cs.berea.edu//cppds/Sort/TheBubbleSort.html https://realpython.com/sorting-algorithms-python/

You May Also Like

Web scrapper using Python

Web scrapper using Python

So I needed to learn how to scrape a web page using Python. My teammate suggested that I could learn a few tricks in python by scraping FIIs, which is …