CSCI 1913 – Introduction to Algorithms, Data Structures, and Program Development Adriana Picoral
What’s an algorithm
The word “algorithm” is in the public vernacular (commonly spoken, informal language), but it has a specific meaning in computer science and mathematics.
What is an algorithm?
Algorithm
A finite sequence of instructions (not code yet), typically used to solve a class of specific problems (such as searching and sorting) or to perform a computation.
An algorithm can be expressed within a finite amount of space and time (even in the worst case).
Social media recommender systems are often referred to as “the algorithm”, but these systems rely on heuristics (there’s no “correct” recommendation).
Why?
```python
import time


def run_big_list():
    # Membership tests against a list of even numbers
    evn = list(range(0, 999999, 2))
    for i in range(10000):
        _ = i in evn


def run_big_set():
    # Same membership tests, but against a set
    evn = set(range(0, 999999, 2))
    for i in range(10000):
        _ = i in evn


if __name__ == "__main__":
    start_time = time.perf_counter()
    run_big_list()
    end_time = time.perf_counter()
    elapsed_time = end_time - start_time
    print(f"Time to run_big_list(): {elapsed_time:.4f} seconds")

    start_time = time.perf_counter()
    run_big_set()
    end_time = time.perf_counter()
    elapsed_time = end_time - start_time
    print(f"Time to run_big_set(): {elapsed_time:.4f} seconds")
```
Time to run_big_list(): 11.4340 seconds
Time to run_big_set(): 0.0102 seconds
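One way to see why the two timings differ so much: a membership test on a Python list checks elements one at a time, while a set uses hashing to go almost directly to the candidate. The sketch below is an illustration only, not how Python implements `in` internally; `count_list_checks` is a hypothetical helper that mimics the list scan and counts the comparisons it makes.

```python
def count_list_checks(items, target):
    """Mimic `target in items` for a list and count the comparisons made."""
    checks = 0
    for item in items:
        checks += 1
        if item == target:
            return True, checks
    return False, checks


evens = list(range(0, 20, 2))  # 10 even numbers: 0, 2, ..., 18

found_first, checks_first = count_list_checks(evens, 0)   # first element
found_last, checks_last = count_list_checks(evens, 18)    # last element
found_odd, checks_odd = count_list_checks(evens, 7)       # not in the list

print(found_first, checks_first)  # True 1
print(found_last, checks_last)    # True 10
print(found_odd, checks_odd)      # False 10
```

A value that is missing forces the scan to look at every element, which is what happens for every odd `i` in `run_big_list()`.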
“correct” vs “good”
What does it mean when we say code is “correct”?
How can code be correct but not good?
How do we compare correct code?
Software Development
Level 1: “direct translation”
Specific Problem → Specific Code
Software Development
Level 2: “problem solving”
Problem Description → Specific Problem → Pseudo-code → Specific Code
Where I want you today
Skills won through hard practice
Starting to see connections and re-use tasks
Language agnostic
Software Development
Level 3: “generalization”
Problem Description → Specific Problem → Generic Problem → General Algorithm → Pseudo-code → Specific Code
Where we’re going
Many problems are old friends
Big problems are made of little familiar problems
Software Development
Level 1: add up all elements in an array
Level 2: find the prime numbers in an array
Level 3: sort an array, use fewer than \(n^2\) comparisons
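The three levels above could look roughly like this in Python. This is a sketch under assumptions: the function names are hypothetical, and merge sort is just one choice of a sorting algorithm that uses fewer than \(n^2\) comparisons; the slides do not name a specific one.

```python
def sum_array(values):
    """Level 1: add up all elements in an array."""
    total = 0
    for v in values:
        total += v
    return total


def is_prime(n):
    """Helper for Level 2: trial division up to the square root of n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def find_primes(values):
    """Level 2: find the prime numbers in an array."""
    return [v for v in values if is_prime(v)]


def merge_sort(values):
    """Level 3: merge sort; returns (sorted_list, number_of_comparisons)."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, left_cmp = merge_sort(values[:mid])
    right, right_cmp = merge_sort(values[mid:])
    merged = []
    comparisons = left_cmp + right_cmp
    i = j = 0
    # Repeatedly take the smaller front element of the two sorted halves
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


data = [7, 4, 10, 3, 8, 2, 9, 5]
print(sum_array(data))    # 48
print(find_primes(data))  # [7, 3, 2, 5]
sorted_data, comparisons = merge_sort(data)
print(sorted_data)                   # [2, 3, 4, 5, 7, 8, 9, 10]
print(comparisons < len(data) ** 2)  # True
```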
Algorithm
Generic Problem: A formally specified problem or task.
Algorithm: An ordered series of computations to accomplish a given task (This is purely a mental/conceptual thing.)
Code: A specific text file containing a specific realization of an algorithm in a specific programming language
CSCI 1913/1933 - Introduction to Algorithms and Data
CSCI 3041 Introduction to Discrete Structures and Algorithms
CSCI 4041 Algorithms and Data Structures
CSCI 5421 Advanced Algorithms and Data Structures
Quiz 03
You have 10 minutes to complete the quiz
No need for comments or doc strings, no need to include the tests in your solution, no need for if __name__. Just write your function and what’s inside the function
HINTS:
remember there are two ways to iterate over a list: through its indices or through its elements; only one of these works for changing the list
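A minimal sketch of the difference between the two ways of iterating, using made-up function names (`double_by_element`, `double_by_index`) that are not part of the quiz:

```python
def double_by_element(numbers):
    """Rebinding the loop variable does not touch the list."""
    for n in numbers:
        n = n * 2  # only the local name n changes


def double_by_index(numbers):
    """Assigning through the index replaces the element in the list."""
    for i in range(len(numbers)):
        numbers[i] = numbers[i] * 2


a = [1, 2, 3]
double_by_element(a)
print(a)  # [1, 2, 3]

b = [1, 2, 3]
double_by_index(b)
print(b)  # [2, 4, 6]
```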
Our Focus
Structures
Searching
Sorting
“organization” – algorithms for data structures
Sensitivity to speed
Exercise
Problem: find largest value in an unsorted collection
```python
def get_max(*values):
    current_max = None
    for v in values:
        # The first value seen, or any larger value, becomes the new max
        if current_max is None or v > current_max:
            current_max = v
    return current_max
```
Analysis
What’s the best case? Empty collection (constant time)
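What’s the worst case? A collection of N elements: the loop has to look at every element, so the time grows in proportion to N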
Runtime
```python
import time


def get_max_tuple(*values):
    current_max = None
    for v in values:
        if current_max is None or v > current_max:
            current_max = v
    return current_max


def get_max(values):
    current_max = None
    for v in values:
        if current_max is None or v > current_max:
            current_max = v
    return current_max


if __name__ == "__main__":
    values = range(99999999)

    # Each timed region also includes building the tuple, list, or set.
    start_time = time.perf_counter()
    # Note: the tuple is passed as ONE argument, so *values holds a single
    # item and the loop runs once; get_max_tuple(*values) would unpack it.
    get_max_tuple(tuple(values))
    end_time = time.perf_counter()
    elapsed_time = end_time - start_time
    print(f"Time to get_max_tuple: {elapsed_time:.4f} seconds")

    start_time = time.perf_counter()
    get_max(list(values))
    end_time = time.perf_counter()
    elapsed_time = end_time - start_time
    print(f"Time to get_max list: {elapsed_time:.4f} seconds")

    start_time = time.perf_counter()
    get_max(set(values))
    end_time = time.perf_counter()
    elapsed_time = end_time - start_time
    print(f"Time to get_max set: {elapsed_time:.4f} seconds")
```
Time to get_max_tuple: 1.1728 seconds
Time to get_max list: 2.8842 seconds
Time to get_max set: 5.4027 seconds
Asymptotic Runtime
Asymptotic means “approaching a limit”
It doesn’t make sense to use measured running time to compare implementations/algorithms, because it depends on too many variables, such as processor speed and the efficiency of the generated assembly code.
Literal running time is NOT helpful.
Goal IS NOT to predict wall-clock time (actual, elapsed real-world time)
Goal IS NOT to evaluate implementations
Asymptotic Runtime
We deal with approximations for comparisons
Usually if we care about runtime, we are interested in scalability: how much more time will it take if I double the input size?
What is Big-O?
Big-O notation is a way of quantifying the rate at which some quantity grows.
A tool for comparing algorithms and predicting performance
O stands for “order of”; N is the input size
We ignore constants and smaller terms because we care about what happens when N gets really large
We calculate Big-O based on the worst-case scenario
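As a worked illustration of ignoring constants and smaller terms, suppose an algorithm takes \(T(N) = 3N^2 + 5N + 2\) steps on an input of size N. This \(T(N)\) is an invented example, not a measurement of any code in these slides. For \(N \ge 1\), each smaller term can be bounded by a multiple of \(N^2\):

```latex
\begin{align*}
T(N) &= 3N^2 + 5N + 2 \\
     &\le 3N^2 + 5N^2 + 2N^2 \qquad \text{for } N \ge 1 \\
     &= 10N^2
\end{align*}
```

So \(T(N)\) is at most a constant times \(N^2\), and we write \(T(N)\) is \(O(N^2)\): the constant 10 and the terms \(5N\) and \(2\) do not change how the quantity grows for large N.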
Big O
An algorithm with O(N) complexity has a runtime that grows linearly with the input size (N).
How much more time will it take if I double the input size for an O(N) algorithm?
About twice as long. Finding the max value of an array of size 12 takes roughly twice as long as finding the max value of an array of size 6 (double the input, double the running time).
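The doubling claim can be checked by counting loop iterations instead of measuring time. The sketch below uses a hypothetical `get_max_counted`, a copy of the `get_max` loop with a step counter added for illustration.

```python
def get_max_counted(values):
    """Same loop as get_max, but also counts loop iterations."""
    current_max = None
    steps = 0
    for v in values:
        steps += 1
        if current_max is None or v > current_max:
            current_max = v
    return current_max, steps


max_6, steps_6 = get_max_counted([3, 9, 1, 7, 5, 2])
max_12, steps_12 = get_max_counted([3, 9, 1, 7, 5, 2, 8, 4, 12, 6, 11, 10])

print(max_6, steps_6)      # 9 6
print(max_12, steps_12)    # 12 12
print(steps_12 / steps_6)  # 2.0
```

Counting steps avoids the noise of wall-clock timing: the ratio is exactly 2 no matter which machine runs the code.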