Sorting

CSCI 1913 – Introduction to Algorithms, Data Structures, and Program Development
Adriana Picoral

The Sorting Problem

Given a list or array of elements (usually of uniform type)

Return:

  • Nothing (input list is modified) – in-place sorting
  • A new list in sorted order

Post-Condition: after the function, the list is sorted:

  • (ascending) each element is ≥ the one before it
  • (descending) each element is ≤ the one before it
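
As a small illustration of the two return conventions, Python's built-in list.sort() sorts in place and returns None, while sorted() returns a new list. The helper is_ascending and the sample values are hypothetical, used here only to check the post-condition:

```python
def is_ascending(lst):
    """Post-condition check: each element is >= the one before it."""
    return all(lst[i] <= lst[i + 1] for i in range(len(lst) - 1))

data = [3, 1, 2]            # sample values, chosen only for illustration

# New-list sorting: the input list is left untouched.
result = sorted(data)

# In-place sorting: the list itself is modified and None is returned.
in_place = list(data)
returned = in_place.sort()

print(result, data, in_place, returned)
```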

Thinking about this problem

Imagine you have the following unsorted list:

[ 30 | 10 | 3 | 45 | 50 | 100 | 0 ]

How would you go about sorting it?

Sorting

It is one of the fundamental problems in computer science, with many solutions

  • there are many sorting algorithms
  • some are faster/slower than others
  • some use more/less memory than others
  • some work better with specific kinds of data
  • some can utilize multiple computers / processors

Many Solutions

  • bogo sort
  • bubble sort
  • selection sort
  • insertion sort
  • shell sort
  • merge sort
  • heap sort
  • quick sort
  • bucket sort
  • radix sort
  • timsort

Types of Sorting Algorithms

Comparison Sort

  • < and > provide “order knowledge”
  • Most general and studied type of sort
  • \(O(N \log N)\) (provable limit on worst-case speed: no comparison sort can do better)

Non-comparison sort

  • Other forms of “order knowledge”
  • Usually data-type specific
  • Can be as fast as \(O(N)\) in the worst case (under the right conditions)
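
One way to see "other forms of order knowledge" at work is counting sort, sketched below. It is not one of the algorithms named in these slides. It assumes the data are integers in a known range 0..max_value, and it never compares two elements with < or >. The name counting_sort and the max_value parameter are choices made for this sketch:

```python
def counting_sort(lst, max_value):
    """Return a new sorted list of integers in the range 0..max_value."""
    counts = [0] * (max_value + 1)
    for value in lst:                    # tally how often each value occurs
        counts[value] += 1
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)   # emit each value `count` times
    return result

print(counting_sort([3, 0, 2, 3, 1], max_value=3))
```

The work is proportional to N plus the size of the value range, which is why the "right conditions" here include a small, known range.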

Bogo Sort

Bogo sort: Orders a list of values by repeatedly shuffling them and checking whether they are sorted.

Name comes from “bogus”

import random

def is_sorted(lst):
    for i in range(len(lst)-1):
        if lst[i] > lst[i+1]:
            return False
    return True

def bogo_sort(lst):
    while not is_sorted(lst):
        random.shuffle(lst)
    return lst

What is the runtime?

Bogo sort runtime

  • Best: \(O(N)\)
  • Average: \(O(N \cdot N!)\)
  • Worst: unbounded (an unlucky run of shuffles can go on indefinitely)
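
The average case can be checked with a small experiment. With N distinct values, each shuffle has a 1 in N! chance of producing the sorted order, so about N! shuffles are expected, and each shuffle plus check costs \(O(N)\). The sketch below assumes a fixed seed and a trial count chosen arbitrarily, and count_shuffles is a hypothetical helper:

```python
import math
import random

def is_sorted(lst):
    return all(lst[i] <= lst[i + 1] for i in range(len(lst) - 1))

def count_shuffles(lst, rng):
    """Bogo sort a copy of lst and return how many shuffles were needed."""
    work = list(lst)
    shuffles = 0
    while not is_sorted(work):
        rng.shuffle(work)
        shuffles += 1
    return shuffles

rng = random.Random(1913)      # fixed seed so the experiment is repeatable
trials = 2000
start = [4, 3, 2, 1]           # N = 4, so N! = 24

average = sum(count_shuffles(start, rng) for _ in range(trials)) / trials
print("average shuffles:", average, "N! =", math.factorial(len(start)))
```

The printed average should land near N!, though the exact value depends on the seed.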

Sorting high-level overview

\(O(N^2)\) algorithms:

  • bubble sort
  • insertion sort
  • selection sort

\(O(N^{1.5})\) algorithm: Shell Sort (the exact bound depends on the gap sequence used)

\(O(N \log N)\) algorithms:

  • quick sort
  • heap sort
  • merge sort
  • timsort

\(O(N)\) algorithm: radix sort (for keys of bounded length)

  • radix sort is not a comparison sort
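
A minimal sketch of one common variant, least-significant-digit radix sort, assuming non-negative integers. The function name and the base parameter are choices made for this example. Note that it never compares two elements with each other; it only looks at digits:

```python
def radix_sort(lst, base=10):
    """Return a new sorted list of non-negative integers (LSD radix sort)."""
    result = list(lst)
    if not result:
        return result
    largest = max(result)
    place = 1                                # 1s digit, then 10s, 100s, ...
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for value in result:
            digit = (value // place) % base
            buckets[digit].append(value)     # appending keeps earlier order (stable)
        result = [value for bucket in buckets for value in bucket]
        place *= base
    return result

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Each pass over the list is \(O(N)\), and the number of passes equals the number of digits in the largest key, so the total is linear only when key length is bounded.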

Selection sort

Approach: “select” the item for each location and swap it into place:

  • find smallest item, place in 1st position
  • find second smallest, place in 2nd position
  • in general: find smallest item in the unsorted part, place at the start of the unsorted part

Animation of what Selection Sort looks like

Selection sort

Mid-sort snapshot:

  • List has two “zones”
  • Sorted Zone
    • is sorted
    • contains the smallest elements of the original list
  • Unsorted Zone
    • is not yet sorted
    • contains the remaining, larger elements of the original list

Implement Selection Sort – strategies

Nested loops:

  • Outer loop repeats n-1 times, n being the size of the list
  • Between the outer and inner loops, initialize a variable small to hold the index of the smallest value found so far
  • Inner loop finds the index of the smallest remaining value: compare the value at index small with the value at the inner loop's current index, updating small whenever a smaller value is found

Swapping in place:

lst[i], lst[small] = lst[small], lst[i]

Selection Sort – implementation

def selection_sort(lst):
    for i in range(len(lst) - 1):
        # Find index of smallest
        small = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[small]:
                small = j
        # Swap smallest into place
        lst[i], lst[small] = lst[small], lst[i]

if __name__ == "__main__":
    fruit = ["blackberry", "apple", "papaya", "cantaloupe", "apricot", "banana"]
    selection_sort(fruit)
    print(fruit)

Output:

['apple', 'apricot', 'banana', 'blackberry', 'cantaloupe', 'papaya']
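
The two zones can be made visible by recording the list after each pass of the outer loop. This is a sketch only: selection_sort_trace is a hypothetical variant written for illustration, and the input values are arbitrary:

```python
def selection_sort_trace(lst):
    """Selection sort in place; return (sorted zone, unsorted zone) after each pass."""
    snapshots = []
    for i in range(len(lst) - 1):
        small = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[small]:
                small = j
        lst[i], lst[small] = lst[small], lst[i]
        # Everything up to and including index i is now in its final position.
        snapshots.append((lst[:i + 1], lst[i + 1:]))
    return snapshots

for sorted_zone, unsorted_zone in selection_sort_trace([30, 10, 3, 45, 0]):
    print(sorted_zone, "|", unsorted_zone)
```

After every pass, each value in the sorted zone is no larger than any value in the unsorted zone.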

Selection Sort runtime

What’s Selection Sort’s runtime?

\(O(N^2)\)
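
The quadratic bound comes from the inner loop: it makes N-1 comparisons on the first pass, N-2 on the second, and so on, for a total of \((N-1) + (N-2) + \dots + 1 = N(N-1)/2\). The instrumented version below (a hypothetical helper, not part of the original implementation) counts comparisons to confirm this:

```python
def selection_sort_comparisons(lst):
    """Sort lst in place and return the number of element comparisons made."""
    comparisons = 0
    for i in range(len(lst) - 1):
        small = i
        for j in range(i + 1, len(lst)):
            comparisons += 1
            if lst[j] < lst[small]:
                small = j
        lst[i], lst[small] = lst[small], lst[i]
    return comparisons

for n in (10, 100, 1000):
    count = selection_sort_comparisons(list(range(n, 0, -1)))
    print(n, count, n * (n - 1) // 2)   # the last two columns match
```

The count is the same for every input of a given size, so selection sort's best, average, and worst cases are all quadratic.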

Bubble sort

Approach: While the list is not sorted, iterate through the list, swapping adjacent pairs that are out of order.

Animation of what Bubble Sort looks like

Called “bubble” sort because large numbers bubble to the “top” (end of the list)

Bubble sort

Mid-sort snapshot:

  • List has two “zones”
  • Sorted Zone (at the end of the list)
    • is sorted
    • contains the largest elements of the original list
  • Unsorted Zone
    • is not yet sorted
    • contains the smaller elements of the original list

Implement Bubble Sort – strategies

Nested loops:

  • Outer loop can either repeat n times, n being the size of the list, or run until the list is sorted
  • Inner loop repeats n-1 times, comparing the current value with the next value and swapping them if the current value is larger

Swapping in place:

lst[j], lst[j+1] = lst[j+1], lst[j]

How do we know a list is sorted? If a full pass through the list makes no swaps, the list is sorted.

Bubble Sort – implementation 1

def bubble_sort(lst):
    for i in range(len(lst)):
        for j in range(len(lst)-1):
            if lst[j] > lst[j+1]:
                lst[j], lst[j+1] = lst[j+1], lst[j]

Bubble Sort – implementation 2

def bubble_sort(lst):
    list_sorted = False
    while not list_sorted:
        list_sorted = True
        for j in range(len(lst)-1):
            if lst[j] > lst[j+1]:
                lst[j], lst[j+1] = lst[j+1], lst[j]
                list_sorted = False
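
Because the sorted zone at the end of the list is already final, the inner loop does not need to revisit it. The sketch below combines that idea with the early exit. It is an optional refinement rather than one of the two implementations above, and the name bubble_sort_shrinking is made up for this example:

```python
def bubble_sort_shrinking(lst):
    """Bubble sort in place, skipping the already-sorted zone at the end."""
    end = len(lst) - 1          # highest index the inner loop still touches
    swapped = True
    while swapped and end > 0:
        swapped = False
        for j in range(end):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        end -= 1                # one more element has bubbled into place

numbers = [30, 10, 3, 45, 50, 100, 0]
bubble_sort_shrinking(numbers)
print(numbers)
```

This roughly halves the number of comparisons in the worst case but does not change the \(O(N^2)\) growth rate.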

Bubble Sort runtime

What’s Bubble Sort’s runtime?

\(O(N^2)\)
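
The \(O(N^2)\) figure is the worst case. With the "stop when a pass makes no swaps" strategy, an already-sorted list needs only one pass of N-1 comparisons. The instrumented sketch below (bubble_sort_comparisons is a hypothetical helper) counts comparisons for both extremes:

```python
def bubble_sort_comparisons(lst):
    """Early-exit bubble sort in place; return the number of comparisons made."""
    comparisons = 0
    list_sorted = False
    while not list_sorted:
        list_sorted = True
        for j in range(len(lst) - 1):
            comparisons += 1
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                list_sorted = False
    return comparisons

n = 100
print("already sorted:", bubble_sort_comparisons(list(range(n))))         # n - 1
print("reversed:      ", bubble_sort_comparisons(list(range(n, 0, -1))))  # n * (n - 1)
```

A reversed list needs n passes (the last one only confirms that no swaps remain), each making n-1 comparisons.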

Insertion Sort

Approach: Keep two zones, a sorted zone and an unsorted zone. “Insert” the next element into the sorted zone:

  • keep left-side sorted
  • each loop:
    • shift next element left until sorted

Insertion Sort

Mid-sort snapshot:

  • List has two “zones”
  • Sorted Zone
    • contains original elements
    • re-ordered to be sorted
  • Unsorted Zone
    • contains original elements
    • has not been touched yet

Insertion Sort – strategies

Nested loops:

  • Outer loop repeats n-1 times, n being size of list (start at 1 because we are looking one index back)
  • Inner loop starts at the outer loop's current index and moves backward, comparing each element with the one before it, until it reaches the beginning of the list or finds a previous element that is not larger.

Swapping in place:

lst[j], lst[j-1] = lst[j-1], lst[j]

Insertion Sort – implementation 1

def insertion_sort(lst):
    for i in range(1, len(lst)):
        j = i
        while j > 0 and lst[j] < lst[j-1]:
            lst[j], lst[j-1] = lst[j-1], lst[j]
            j = j - 1

Insertion Sort – implementation 2

def insertion_sort(lst):
    for i in range(1, len(lst)):
        for j in range(i, 0, -1):
            if lst[j] < lst[j-1]:
                lst[j], lst[j-1] = lst[j-1], lst[j]
            else:
                # Element is in place; the rest of the sorted zone needs no checks
                break
Insertion Sort runtime

What’s Insertion Sort’s runtime?

\(O(N^2)\)
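
As with bubble sort, \(O(N^2)\) describes the worst case. When the inner loop stops as soon as the element is in place, an already-sorted list costs only one comparison per element. The instrumented sketch below (insertion_sort_comparisons is a hypothetical helper) counts comparisons for both extremes:

```python
def insertion_sort_comparisons(lst):
    """Insertion sort in place; return the number of element comparisons made."""
    comparisons = 0
    for i in range(1, len(lst)):
        j = i
        while j > 0:
            comparisons += 1
            if lst[j] >= lst[j - 1]:
                break                    # element has reached its place
            lst[j], lst[j - 1] = lst[j - 1], lst[j]
            j -= 1
    return comparisons

n = 100
print("already sorted:", insertion_sort_comparisons(list(range(n))))         # n - 1
print("reversed:      ", insertion_sort_comparisons(list(range(n, 0, -1))))  # n * (n - 1) / 2
```

In the reversed case, the element at index i is compared i times on its way to the front, giving \(1 + 2 + \dots + (N-1) = N(N-1)/2\) comparisons.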