Searching

CSCI 1913 – Introduction to Algorithms, Data Structures, and Program Development
Adriana Picoral

Search Problem

Given an unsorted collection and a value, return the index of the value if it is found in the collection; return -1 if it is not.

The goal of this exercise is to think about the specific problem and how to solve it. Remember this:

Problem Description → Specific Problem → Pseudo-code → Specific Code

Don’t use .index(). Actually think about how to solve this problem.
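For contrast, a small illustration of why the built-in does not match the problem statement as written: Python's list.index raises ValueError when the value is missing, instead of returning -1. The list below is only an example.

```python
numbers = [47, 23, 30, 13, 32, 44]

# list.index returns the position of the first match...
found_at = numbers.index(32)
print(found_at)

# ...but raises ValueError, rather than returning -1, when there is no match.
try:
  numbers.index(3)
  raised = False
except ValueError:
  raised = True
print(raised)
```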

Why?

You might be asking yourself “why do I have to implement things that were already implemented by others?”

  • Using a built-in won’t always work
  • Someone builds the built-ins. Might be you.
  • In-depth understanding leads to creative flexibility

Creativity is built on top of understanding the basics

Understanding the problem

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 32

  • index 0
  • compare value at index with search value
  • not the same
  • move to next index

Understanding the problem

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 32

  • index 1
  • compare value at index with search value
  • not the same
  • move to next index

Understanding the problem

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 32

  • index 2
  • compare value at index with search value
  • not the same
  • move to next index

Understanding the problem

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 32

  • index 3
  • compare value at index with search value
  • not the same
  • move to next index

Understanding the problem

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 32

  • index 4
  • compare value at index with search value
  • same value!
  • return index

Another example

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 3

  • index 0
  • compare value at index with search value
  • not the same
  • move to next index

Another example

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 3

  • index 1
  • compare value at index with search value
  • not the same
  • move to next index

Another example

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 3

  • index 2
  • compare value at index with search value
  • not the same
  • move to next index

Another example

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 3

  • index 3
  • compare value at index with search value
  • not the same
  • move to next index

Another example

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 3

  • index 4
  • compare value at index with search value
  • not the same
  • move to next index

Another example

[ 47 | 23 | 30 | 13 | 32 | 44 ]

Value to search: 3

  • index 5
  • compare value at index with search value
  • not the same
  • move to next index
  • no next index, return -1

Solution

Note the return statement inside the loop:

def linear_search(array, value):
  for i in range(len(array)):
    if array[i] == value:
      return i
  return -1
  
if __name__ == "__main__":
  numbers = [10, 2, 4, 11, 5]
  print(linear_search(numbers, 4))
  print(linear_search(numbers, 9))
2
-1
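To connect the solution back to the two walkthroughs, here is a sketch of the same loop with a print at every step. The name linear_search_trace is made up for this illustration; the logic is unchanged from linear_search.

```python
def linear_search_trace(array, value):
  """Linear search that prints each step; returns the index or -1."""
  for i in range(len(array)):
    if array[i] == value:
      print(f"index {i}: {array[i]} == {value}, return {i}")
      return i
    print(f"index {i}: {array[i]} != {value}, move to next index")
  print("no next index, return -1")
  return -1

numbers = [47, 23, 30, 13, 32, 44]
found = linear_search_trace(numbers, 32)   # stops as soon as 32 is seen
missing = linear_search_trace(numbers, 3)  # checks every position first
```

The early return inside the loop is what stops the first search before the end of the list; the second search only reaches return -1 after the loop has run out of indices.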

Algorithm Analysis – Linear Search

  • Let N be the length of the input list
  • Assume the element is not found
  • How many times do we need to access the list?
def linear_search(array, value):
  for i in range(len(array)):
    if array[i] == value:
      return i
  return -1

N times – one to check each position, for all positions

Linear Search is O(N) time complexity.
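One way to check the "N times" claim is to count the element reads directly. This is a minimal sketch; count_accesses is a hypothetical helper, not part of the solution above.

```python
def count_accesses(array, value):
  """Return how many elements linear search reads before it stops."""
  accesses = 0
  for i in range(len(array)):
    accesses += 1  # one read of array[i]
    if array[i] == value:
      break
  return accesses

for n in (10, 100, 1000):
  data = list(range(n))
  # -1 is never in data, so every position is checked.
  print(n, count_accesses(data, -1))
```

When the value is absent, the count equals the length of the list, so the work grows in direct proportion to N.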

Binary Search

  • Binary Search is a specialized algorithm for the search problem
  • Extra pre-condition: the list is sorted from smallest to largest
  • Substantial speedup
  • Core idea: inspecting one location eliminates the need to look at other locations

Binary Search

  • Any time we access an element of the array, check whether the element is:
    • Too big (no need to look right of this element)
    • Too small (no need to look left of this element)
    • What we are looking for
  • As the search proceeds, we can “bound” the search area:
    • low index - the lowest possible index for the element
    • high index - the highest possible index for the element
    • Each access should update either our low index or our high index

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 0
  • high 9 (last index)
  • mid 4 (low+high, then integer-divide by 2, rounding down)

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 0
  • high 9
  • mid 4 (low+high then divide by 2)

60 is greater than 47

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 0
  • high 9
  • mid 4 (low+high then divide by 2)

60 is greater than 47, update low to mid + 1

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 5
  • high 9
  • mid 7 (low+high then divide by 2)

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 5
  • high 9
  • mid 7 (low+high then divide by 2)

60 is smaller than 74

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 5
  • high 9
  • mid 7 (low+high then divide by 2)

60 is smaller than 74, update high to mid - 1

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 5
  • high 6
  • mid 5 (low+high then divide by 2)

60 is greater than 49

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 5
  • high 6
  • mid 5 (low+high then divide by 2)

60 is greater than 49, update low to mid + 1

Binary Search – example

[ 15 | 27 | 32 | 37 | 47 | 49 | 60 | 74 | 76 | 78 ]

Value to search: 60

  • low 6
  • high 6
  • mid 6 (low+high then divide by 2)

60 is equal to 60, return mid (index 6)

Binary Search

What happens when the value is not in the collection (not found)?

Binary Search

Implement binary search in Python. Write a function called binary_search that takes two arguments: a sorted list (increasing values) and a search value.

Your function should return the index of the search value if it is found, and -1 if it is not.

Test case:

numbers = [0, 5, 10, 23, 41, 43, 44, 60, 99, 120, 343]
assert binary_search(numbers, 99) == 8
assert binary_search(numbers, 9) == -1

Submit your binary_search.py solution to Gradescope.

Binary Search – solution

def binary_search(arr, v):
  low = 0
  high = len(arr)-1
  while low <= high:
    mid = (low+high) // 2
    if v > arr[mid]: # update low
      low = mid + 1
    elif v < arr[mid]: # update high
      high = mid - 1
    else:
      return mid
  return -1

if __name__ == "__main__":
  numbers = [0, 5, 10, 23, 41, 43, 44, 60, 99, 120, 343]
  assert binary_search(numbers, 99) == 8
  assert binary_search(numbers, 9) == -1
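The sketch below records (low, high, mid) on every iteration so the solution can be compared against the earlier walkthrough, and so the not-found case becomes visible. binary_search_trace is a hypothetical variant written for this illustration; the value 50 is chosen only because it is absent from the example list.

```python
def binary_search_trace(arr, v):
  """Binary search that records (low, high, mid) for every iteration."""
  steps = []
  low = 0
  high = len(arr) - 1
  while low <= high:
    mid = (low + high) // 2
    steps.append((low, high, mid))
    if v > arr[mid]:
      low = mid + 1
    elif v < arr[mid]:
      high = mid - 1
    else:
      return mid, steps
  # Here low > high: the bounds have crossed, so no candidate index is left.
  return -1, steps

numbers = [15, 27, 32, 37, 47, 49, 60, 74, 76, 78]
index, steps = binary_search_trace(numbers, 60)
print(index, steps)
index_missing, steps_missing = binary_search_trace(numbers, 50)
print(index_missing, steps_missing)
```

When the value is missing, the loop keeps shrinking the range until low passes high, and only then falls through to return -1.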

Binary Search

What’s the best case scenario?

What’s the worst case scenario?

Binary Search

What’s the best case scenario? Search value is at mid index

What’s the worst case scenario? Search value is at the last index checked, or is not in the list at all
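A rough way to see both scenarios is to count loop iterations on the test list from the exercise. binary_search_steps is a hypothetical helper that returns the iteration count alongside the index.

```python
def binary_search_steps(arr, v):
  """Return (index, number of loop iterations) for a binary search."""
  low, high, iterations = 0, len(arr) - 1, 0
  while low <= high:
    iterations += 1
    mid = (low + high) // 2
    if v > arr[mid]:
      low = mid + 1
    elif v < arr[mid]:
      high = mid - 1
    else:
      return mid, iterations
  return -1, iterations

numbers = [0, 5, 10, 23, 41, 43, 44, 60, 99, 120, 343]
best = binary_search_steps(numbers, 43)    # 43 sits at the first mid (index 5)
worst = binary_search_steps(numbers, 343)  # found only when low == high
print(best, worst)
```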

Binary Search

Every comparison eliminates half of the remaining elements:

  • After 1 comparison: N/2 elements remain
  • After 2 comparisons: N/4 elements remain
  • After 3 comparisons: N/8 elements remain
  • After 4 comparisons: N/16 elements remain
  • After 5 comparisons: N/32 elements remain
  • After k comparisons: \(N/2^k\) elements remain

Binary Search

After k comparisons: \(N/2^k\) elements remain.

What’s the value of k? The search ends when only one element remains:

\(N/2^k = 1\)

\(N = 2^k\)

\(k = \log_2(N)\)

O(log N)

Binary Search

What happens when we double the input size (from N to 2N)?

Doubling the input size only adds one more step to the worst case scenario.
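The claim can be checked with a small simulation. The sketch below works on indices only and assumes the search value is larger than every element, which always keeps the larger remaining half; worst_case_steps is a hypothetical helper.

```python
def worst_case_steps(n):
  """Count loop iterations when the value is larger than all n sorted items."""
  low, high, iterations = 0, n - 1, 0
  while low <= high:
    iterations += 1
    mid = (low + high) // 2
    low = mid + 1  # the value is always "too big", so only low moves
  return iterations

for n in (1000, 2000, 4000, 8000):
  print(n, worst_case_steps(n))
```

Each doubling of n raises the count by exactly one in this simulation.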

Binary Search vs. Linear Search

Linear Search time complexity is O(N), Binary Search is O(log N).

O(N) vs. O(log N)
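To put numbers on the comparison, the sketch below tabulates worst-case checks for a few input sizes. It assumes N checks for linear search and floor(log2(N)) + 1 checks for the binary search loop shown earlier; worst_case_checks is a hypothetical helper.

```python
import math

def worst_case_checks(n):
  """Return (linear, binary) worst-case check counts for a list of length n."""
  linear = n
  binary = math.floor(math.log2(n)) + 1
  return linear, binary

for n in (10, 1_000, 1_000_000, 1_000_000_000):
  linear, binary = worst_case_checks(n)
  print(f"N = {n:>13,}  linear: {linear:>13,}  binary: {binary}")
```

The binary search column stays tiny even for very large N, but that speedup only applies when the list is already sorted.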