Sets

CSCI 1913 – Introduction to Algorithms, Data Structures, and Program Development
Adriana Picoral

Sets

  • A set is a mutable object
  • Contains no duplicate elements
  • Unordered data structure
  • Type name: set
  • Literal uses curly braces, e.g. {1, 2, 3} (an empty {} is a dict; use set() for an empty set)
  • Can only contain immutable (hashable) elements
  • Built-in functions such as len(), min(), max(), and sum() work on sets
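
The properties above can be checked with a short sketch. The variable names (numbers, empty, points) are made up for illustration and are not part of the course material.

```python
# Duplicates are dropped when the set is built.
numbers = {3, 1, 2, 3, 1}
print(type(numbers))                             # <class 'set'>
print(len(numbers))                              # 3
print(min(numbers), max(numbers), sum(numbers))  # 1 3 6

# {} creates an empty dict, so use set() for an empty set.
empty = set()
print(type({}), type(empty))

# Elements must be hashable: a tuple works, a list does not.
points = {(0, 0), (1, 2)}
try:
    bad = {[0, 0]}
except TypeError as err:
    print("TypeError:", err)
```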

Methods

  • .add(value) adds a value to the set
  • .remove(value) removes a value from the set (raises a KeyError if no matching value is found)
  • .discard(value) removes a value from the set (no error if no matching value is found)
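
A minimal sketch of the difference between .remove() and .discard(), using a made-up fruits set and a made-up raised flag:

```python
fruits = {"apple", "banana"}
fruits.add("pear")

# "mango" is not in the set: discard() silently does nothing.
fruits.discard("mango")

# remove() raises KeyError for the same missing value.
try:
    fruits.remove("mango")
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# Removing a value that is present works with either method.
fruits.remove("banana")
print(fruits)
```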

Because the items in a set are unordered, they have no positions, so sets cannot be indexed.
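
As an illustration (the letters set is a made-up example), indexing a set raises a TypeError, while membership tests, loops, and sorted() still work:

```python
letters = {"a", "b", "c"}

# Sets do not support indexing.
try:
    letters[0]
    indexable = True
except TypeError:
    indexable = False
print(indexable)  # False

# Membership tests and loops work without an index.
print("b" in letters)  # True
for letter in letters:
    print(letter)  # order is not guaranteed

# If a position is needed, build a sorted list first.
ordered = sorted(letters)
print(ordered[0])  # a
```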

Sets

Items in a set are not ordered, and cannot repeat.

my_set = {"apple", "banana", "pear", "kiwi", "kiwi"}
print(my_set)
{'kiwi', 'banana', 'apple', 'pear'}
my_set.add("grape")
print(my_set)
{'banana', 'kiwi', 'apple', 'pear', 'grape'}
my_set.remove("banana")
print(my_set)
{'kiwi', 'apple', 'pear', 'grape'}

Set methods

  • .add(value) adds an item to the set in place; if the value is already there, nothing happens
  • .union(other_set) returns a new set containing all unique elements from both sets
  • .intersection(other_set) returns a new set containing only the elements common to both sets
  • .difference(other_set) returns a new set containing the elements present in the set but not in other_set
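
A short sketch of these methods on two made-up sets, a and b. Each call returns a new set and leaves the originals unchanged; the operators |, & and - are equivalent shorthand when both operands are sets.

```python
a = {1, 2, 3}
b = {3, 4}

union = a.union(b)          # {1, 2, 3, 4}
common = a.intersection(b)  # {3}
only_a = a.difference(b)    # {1, 2}
only_b = b.difference(a)    # {4}: difference depends on the order

# The same results using operators.
print(union == (a | b), common == (a & b), only_a == (a - b))  # True True True

# a and b themselves are unchanged.
print(a, b)
```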

Exercise (by Mario)

Ecological Survey

You are working with a team of biologists, and you have each split up to survey different areas of a rainforest. Each biologist makes a list of the animals they spotted during the survey. You have collected everyone’s lists and you want to know which animals are common to all of the areas explored.

Task: Write a function called commonality() which takes any number of lists and returns a single set containing the animals that were found in all of the lists provided.

Exercise (by Mario)

Your commonality() function takes any number of lists (use * before the parameter name) and returns a single set containing the animals that were found in all of the lists provided. Submit your commonality.py solution to Gradescope.

Test case:

dense_forest = ["howler monkey", "sloth", "chipmunk", "toucan", "leafcutter ant", "sloth", "tarantula", "pit viper"]
cave = ["pit viper", "bat", "leafcutter ant", "pit viper", "tarantula", "tadpole"]
volcano = ["tarantula", "parrot", "pit viper", "leafcutter ant", "tarantula", "grasshopper"]

common_animals = commonality(dense_forest, cave, volcano)

assert common_animals == {'leafcutter ant', 'tarantula', 'pit viper'}

Reflection: Why might sets be more useful than lists for this task?

Solution

def commonality(*lists) -> set:
    """Returns a set containing items that all list arguments have in common."""
    if len(lists) == 0:
        return set()
      
    # start with the items of the first list
    common = set(lists[0])

    # intersect with each remaining list
    # (intersection() accepts any iterable, so no set() conversion is needed)
    for animals in lists[1:]:
        common = common.intersection(animals)
    return common
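
A possible alternative, shown only as a sketch: set.intersection() accepts several sets at once, so the loop can be collapsed into a single call. The name commonality_compact and the short forest and cave lists are made up for this example. The guard for zero arguments is still needed, because set.intersection() with no arguments raises a TypeError.

```python
def commonality_compact(*lists) -> set:
    """Returns a set containing items that all list arguments have in common."""
    if not lists:
        return set()
    # Convert every list to a set, then intersect them all in one call.
    return set.intersection(*(set(animals) for animals in lists))


forest = ["sloth", "toucan", "tarantula", "sloth"]
cave = ["bat", "tarantula", "tadpole"]
result = commonality_compact(forest, cave)
print(result)  # {'tarantula'}
```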

Why sets?

  • Python sets are implemented using hash tables
  • Membership checks, insertion, and deletion take constant time on average, whereas a membership check on a list has to scan the elements one by one
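
A rough way to see the difference, as a sketch only: the size n and the variable names are arbitrary choices, and the exact timings depend on the machine. The code looks up the same value in a list and in a set and times both with the standard timeit module.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Both containers give the same answer...
in_list = target in as_list
in_set = target in as_set
print(in_list, in_set)  # True True

# ...but the list scans its elements, while the set uses a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)
print(f"list: {list_time:.4f} s, set: {set_time:.6f} s")
```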