intro to sets (class slides)

CSc 110 - Intro to Sets

Data Structures in Python

  • lists (mutable)
  • dictionaries (mutable)
  • tuples (immutable)
  • sets (mutable)

Lists

  • store:
    • ordered items (it keeps the original order)
    • elements of different types
    • repeated elements
  • use square brackets ([]) syntax for creating a list
  • allow indexing (integers starting at 0) with square brackets ([]) as well
  • are mutable

Tuples

  • store:
    • ordered items (it keeps the original order)
    • elements of different types
    • repeated elements
  • use parentheses (()) syntax for creating a tuple
  • allow indexing (integers starting at 0) with square brackets ([])
  • are immutable

Dictionaries

  • store pairs of items key: value
  • keys have to be unique (no repeated keys allowed)
  • use curly brackets ({}) with key and value separated by colon (:)
  • allow value retrieval through key with square brackets ([])
  • are mutable

Sets

  • A set is (another) data structure

  • Helpful ways of thinking about it

    • A “bag” of elements

Sets

  • Do not store items in the order you created them

  • Do not allow repeated items to be stored

  • Cannot use square brackets to retrieve an item

  • Use set() to initiate an empty set

  • Use len() to get number of items

  • Use for x in set to iterate over the elements in a set

Set methods

  • .add(value) adds an element to the set
  • .discard(value) discards the specified value

You can also use the operator in and the function len() with sets

Use the function set() to convert a list to a set, and list() to convert a set into a list

Evaluate the code

numbers = {5, 7, 100, 5, 3, 5, 9, 8, 20, 5}
numbers  # evaluate what this set holds at this point

numbers.add(30)
numbers  # evaluate what this set holds at this point

numbers.discard(10)
numbers  # evaluate what this set holds at this point

Evaluate the code

numbers = {5, 7, 100, 5, 3, 5, 9, 8, 20, 5}
numbers
{3, 5, 7, 8, 9, 20, 100}
numbers.add(30)
numbers
{3, 5, 7, 8, 9, 20, 30, 100}
numbers.discard(10)
numbers
{3, 5, 7, 8, 9, 20, 30, 100}

Evaluate the code

numbers_list = [2, -1, -2, 1, 3, 4, 1, 2]
numbers_list # evaluate this line

numbers_set = set(numbers_list)
numbers_set # evaluate this line

numbers_set.add(1)
numbers_set.add(2)
numbers_set # evaluate this line

numbers_set.add(5)
numbers_set # evaluate this line

numbers_set.discard(6)
numbers_set.discard(1)
numbers_set # evaluate this line

Evaluate the code

numbers_list = [2, -1, -2, 1, 3, 4, 1, 2]
numbers_list # evaluate this line
[2, -1, -2, 1, 3, 4, 1, 2]
numbers_set = set(numbers_list)
numbers_set # evaluate this line
{-2, -1, 1, 2, 3, 4}
numbers_set.add(1)
numbers_set.add(2)
numbers_set # evaluate this line
{-2, -1, 1, 2, 3, 4}
numbers_set.add(5)
numbers_set # evaluate this line
{-2, -1, 1, 2, 3, 4, 5}
numbers_set.discard(6)
numbers_set.discard(1)
numbers_set # evaluate this line
{-2, -1, 2, 3, 4, 5}

Quiz 10

current time

You have 10 minutes to complete the quiz

  • No need for comments, no need for a main(), no test cases

Built-in functions you can use: round(), input(), float(), str(), int(), len(), range() — you don’t have to use all of these

Write a function

  1. Its name is count_names
  2. It opens in read mode a file that contains a name per line, with repeated names
  3. It counts how many unique names there are in the file
  4. It returns an integer with the count
  5. Use a set for this
assert count_names("names.txt") == 11

Write a function – solution

def count_names(file_name):
  f = open(file_name, "r")
  name_set = set()
  for line in f:
    name_set.add(line.strip())
  f.close()
  return len(name_set)

def main():
  assert count_names("names.txt") == 11
  
main()

Write a function

  1. Its name is has_duplicates
  2. It takes a variable number of arguments: *values
  3. It returns True if there are repeated elements in values, False otherwise
assert has_duplicates(3, 2, 1, 2, 3) == True
assert has_duplicates(3, 20, 1, 2, 13) == False

Write a function – solution

def has_duplicates(*values):
  unique_values = set()
  for v in values:
    if v not in unique_values:
      unique_values.add(v)
    else:
      return True
  return False

def main():
  assert has_duplicates(3, 2, 1, 2, 3) == True
  assert has_duplicates(3, 20, 1, 2, 13) == False
  
main()

Write a function – solution 2

def has_duplicates(*values):
  values_set = set(values)
  return len(values) > len(values_set)

def main():
  assert has_duplicates(3, 2, 1, 2, 3) == True
  assert has_duplicates(3, 20, 1, 2, 13) == False
  
main()