Lecture 6: The Set and Map Classes

CS 106B: Programming Abstractions

Fall 2024, Stanford University Computer Science Department

Lecturers: Cynthia Bailey and Chris Gregg, Head CA: Jonathan Coronado

An image of a set, with a box around many circles, each with a letter inside, and an image of a map, with many circles, each with an individual arrow to another circle.


Slide 2

Announcements

  • Lecture quizzes are only necessary if you don't attend lecture! If you check in to lecture via the link, you do not have to do the lecture quiz for that day.
  • Please do not post screenshots to Ed unless absolutely necessary. See this post.
    • When you post a screenshot (a bunch of colored pixels), you lose all information about the underlying text. It is much easier to read text (particularly on a phone), and if someone is answering the post, they can copy/paste from text.
    • Yes, there are some cases where a screenshot is helpful (if you're trying to figure something out in the debugger, or in Qt Creator in general). But, if it's just code, it should not be a screenshot.
  • Assignment 1 is due Monday, end-of-day PDT.
  • Assignment 2 will be out Monday.
  • Assignment 2 YEAH hours will be next Tuesday

Slide 3

Sets and Maps

  • Today we are going to discuss two new collections: sets and maps.
  • A set is a collection of elements with no duplicates: An image of a set, with a circle surrounding many words, none of which are duplicates
  • A map is a collection of key / value pairs, and the key is useed to find the value. In python, we call this collection a dict.

    An image of a map, with two circles, the first holding the keys 'Chris', 'Jenny', 'Cynthia', and 'Jonathan', with arrows going to a second circle with phone numbers for each name.


Slide 4

Sets

  • A set is a collection of elements with no duplicates. Sets have (at least) the following operations (and they are fast):
    • add(value): adds a value to a set, and ignores if the set already contains the value
    • contains(value): returns true if the set contains the value, false otherwise.
    • remove(value): removes the value from the set. Does nothing if the value is not in the set.
    • size(): returns the number of elements in the set
    • isEmpty(): returns true if the set is empty, false otherwise.
    • Sets have other useful functions, and you should check the Set documentation to see them.
  • Sets do not have indexes!

A set with many words, including "the", "if", "to", "by", but not "be". The image has set.contains("to"), which returns "true", and set.contains("be"), which returns false


Slide 5

Sets: simple example

Set<string> friends;
friends.add("cynthia");
friends.add("chris");
friends.add("jonathan");
cout << boolalpha << friends.contains("voldemort") << 
     << noboolalpha << endl;
for(string person : friends) {
    cout << person << endl;
}

Output:

false
chris
cynthia
jonathan
  • Notice – the output from the for loop is alphabetical! Sets keep their values sorted.

Slide 6

Looping over a Set

for(type currElem : set) {
    // process elements one at a time
}
  • You can't use an index-based for loop, becuase Sets do not have indexes!
    for(int i=0; i < set.size(); i++) {
      // does not work, no index!
      cout << set[i];
    }
    

    Homer Simpson setting a bowl of cornflakes on fire


Slide 7

Types of Sets

  • There are actually two types of sets in the Stanford library:
    • Set
      • Iteration over the elements is in sorted order
      • Really fast access times! O(log n) per retrieval – we will learn about this next week!
      • Sets are implemented using a "binary search tree"
    • HashSet
      • Iteration over the elements is not in a useful order (it is jumbled)
      • Really, ridiculously fast!
      • O(1) per retrieval (again, we will learn what this means next time!)

Slide 8

Set Operands

  • Sets can be compared, combined, etc.
    • s1 == s2
      true if the sets contain exactly the same elements
    • s1 != s2
      true if the sets don't contain the exact same elements
    • s1 + s2
      returns the union of s1 and s2 (i.e., all elements in both)
    • s1 * s2
      returns the intersection of s1 and s2 (i.e., only the elements in both sets)
    • s1 - s2
      returns the difference of s1 and s2 (the elements in s1 but not in s2)

Slide 9

Counting Unique Words

  • Let's write a program to count the unique words in a file The Qt Creator logo

Slide 10

Maps

  • A map is a collection of pairs (k, v), sometimes called key/value pairs, where v can be found quickly if you know k.
  • Other terms you may hear for a map are dictionary or associative array.
  • A map is a generalization of an array, where the "indexes" don't need to be integers: Key Value pairs, with keys on the left ("Chris", "Jenny", "Cynthia", and "Jonathan") and their corresponding values on the right ("866-2233", "867-5309", "685-6232", and "488-0312" respectively

Slide 11

Using Maps

  • A map allows you to get from one half of a pair to the other.
    • E.g., to store an association from "Jenny" to "867-5309":
      //  key         value
      m["Jenny"] = "867-5309"; // or:
      m.put("Jenny", "867-5309");
      
    • To get Jenny's number:
      string ph = m["Jenny"] // or
      string ph = m.get("Jenny")
      cout << ph << endl;
      

      Output:

      867-5309
      

Slide 12

Maps are Everywhere

  • Wikipedia: the key is the title, the value is the article:

Two wikipedia articles, "Yosemite National Park" and "Mariana Trench"


Slide 13

Creating Maps

  • A Stanford Map requires two parameters: one for keys, one for values:
    // maps from string keys to integer values
    Map<string, int> votes;
    
    // maps from string keys to Vector<string> values
    Map<string, Vector<string>> friendMap; 
    

Slide 14

Map Functions

  • The following functions are part of the Map class:
    • m.clear() : removes all key/value pairs from the map
    • m.containsKey(key) : returns true if the map contains a value for the given key
    • m[key]
      m.get(key) : returns the value mapped to the given key; m[key] only: if key is not in the map, adds it with the default value (e.g., 0 or ""); m.get(key) only: if key is not in the map, returns the default value for the value type, but does not add it to the map.
    • m.isEmpty() : returns true if the map contains no key/value pairs (size 0)
    • m.keys() : returns a Vector copy of all keys in the map
    • m[key] = value
      m.put(key, value) : adds a mapping from the given key to the given value; if the key already exists, replaces its value with the given one
    • m.remove(key) : removes any existing mapping for the given key (ignored if the key doesn't exist in the map)
    • m.size() : returns the number of key/value pairs in the map
    • m.toString() : returns a string such as "{a:90, d:60, c:70}"
    • m.values() : returns a Vector copy of all the values in the map

Slide 15

Map Example

Map<string, string> wiki;

// adds name / text pair to dataset
wiki.put("Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo", articleHTML);

// returns corresponding articleHTML
cout << wiki.get("Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo");

// removes the article
wiki.remove("Carbon dioxide in Earth's atmosphere");

A Wikipedia article about the phrase "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"

A Wikipedia article about the removal of Great Britain from the U.K.


Slide 16

Types of Maps

  • There are actually two types of maps in the Stanford library:
    • Map
      • Iteration over the elements is in sorted order
      • Really fast access times! O(log n) per retrieval – we will learn about this next week!
      • Maps are implemented using a "binary search tree"
    • HashMap
      • Iteration over the elements is not in a useful order (it is jumbled)
      • Really, ridiculously fast!
      • O(1) per retrieval (again, we will learn what this means next time!)

Slide 17

Map Example: Tallying Votes

// tally votes:
// (M)ilk, (S)tokes, (R)ogers
string allVotes = "MMMRMSSMSSMMMMMRRMMMMRRRMMM";

Map<char, int> voteTally;
for (char v : allVotes) {
    voteTally[v]++;
}

// loop over the map
for (char initial : voteTally) {
    int numVotes = voteTally[initial];
    cout << initial << ": " << numVotes << " votes" << endl;
}
  • Why does voteTally[v]++; work? It turns out that when you access a Map's value, you get back a reference! Therefore, updating it here allows it to update inside the map. Cool!
  • It is equivalent to the following:
    int& currentTotal = voteTally[v];
    currentTotal++; // update inside the map
    
  • Notice how we looped over the map – we only get the keys, and have to manually ask for each value from the key.

Slide 18

Tallying Words

  • Let's write a program to tally how many of each word is in a file

The Qt Creator logo