Today we developed a recursive function for generating subsets that has a similar structure to the recursive functions for generating sequences and generating permutations we saw on Monday. These functions fit within a category of algorithms known as recursive backtracking. I introduced the general pattern of choose-explore-unchoose as a way to understand the structure of these algorithms. Together we drew pictures, traced in the debugger, and experimented with the code to help strengthen our understanding of how such code operates.

Subsets

The power set of set S is the set of all possible subsets of S. If the input contains the single element C, there are two possible subsets, one including C and the other empty:

{C}  {}

If the input contains element B in addition to C, we also consider whether to include B. This power set contains four subsets:

{BC}  {B}  {C}  {}

With a third element A under consideration, the power set has eight subsets:

{ABC}  {AB}  {AC}  {A}  {BC}  {B}  {C}  {}

Note how the size of the power set doubles each time we add an element. To construct the power set of N elements, we build on the power set of N-1 elements: each of its subsets appears twice, once with the new element added and once without.
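
The doubling can also be seen in a short iterative sketch. This is not the recursive function we wrote in lecture; powerSet is a name I made up for illustration, and it assumes the input is a string of distinct characters and uses standard C++ containers rather than the course library.

```cpp
#include <string>
#include <vector>

// Build the power set of the characters in `input` one element at a time.
// Each new element doubles the list: every existing subset is kept as is
// and is also copied with the new element appended.
std::vector<std::string> powerSet(const std::string& input)
{
    std::vector<std::string> subsets = {""};   // power set of the empty input
    for (char element : input) {
        std::size_t count = subsets.size();    // size before doubling
        for (std::size_t i = 0; i < count; i++) {
            subsets.push_back(subsets[i] + element);
        }
    }
    return subsets;
}
```

Calling powerSet("ABC") returns eight strings, one per subset, with the empty string standing for the empty subset.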

We drew up a decision tree for generating subsets. Each level of the tree corresponds to an element from the input that is being considered. The possible options for the element are to either include it in the current subset or not.

In the diagram below, each left arm in the tree indicates the option to include the current element and each right arm the option to leave it out. Each path from the top to the bottom represents a sequence of recursive calls that has reached the base case. That path is one subset. You can determine which elements are contained in that subset by tracing the sequence of yes/no turns it takes.

                                 |
                        +--------+---------+
                        |                  |   
                      A? yes              A? no
                        |                  |
                 +------+------+     +-----+------+
                 |             |     |            | 
               B? yes      B? no    B? yes      B? no

Did you notice that the decision tree for subsets is structurally similar to the decision tree for coin-flipping? Each decision point has two options. The total number of paths to explore is 2^N.

As before, the self-similarity leads to a very compact recursive solution:

void listSubsets(string input, string soFar)
{
    if (input.empty()) {
        cout << soFar << endl;
    } else {
        char consider = input[0];
        string rest = input.substr(1);
        listSubsets(rest, soFar + consider);  // explore with
        listSubsets(rest, soFar);             // explore without
    }
}
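
If you want to experiment outside the course library, here is a rough standalone variant of the same recursion. The name collectSubsets and the result parameter are my own additions for illustration: instead of printing, it gathers each subset into a vector so the order of discovery can be inspected.

```cpp
#include <string>
#include <vector>

// Same include/exclude recursion as listSubsets, but each completed subset
// is appended to `result` instead of being printed.
void collectSubsets(const std::string& input, const std::string& soFar,
                    std::vector<std::string>& result)
{
    if (input.empty()) {
        result.push_back(soFar);                          // base case: one subset done
    } else {
        char consider = input[0];
        std::string rest = input.substr(1);
        collectSubsets(rest, soFar + consider, result);   // explore with
        collectSubsets(rest, soFar, result);              // explore without
    }
}
```

For the input "ABC" with an empty soFar, the subsets arrive in the same order as the eight subsets listed earlier, from ABC down to the empty subset.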

Tracing down and back

I mentioned that it is tempting to think that the recursion traverses the tree level-by-level or left to right, but this is not correct. I sketched how the path goes down to the far left, around the bottom, and back up the right side of each subtree.

The first path explored is the one that goes all the way left. At the base case, it prints that subset and control returns to the previous decision point, where the previous decision is undone and the other option is tried.

Many students can follow how the recursion moves downward in the tree, but are perplexed at how backtracking moves upwards. As we traverse downward, we move toward the base case. When we arrive at the base case, there is no further exploration possible from here. Control should now return to the previous decision point to explore other alternatives.

We set a breakpoint on the base case so we could stop in the debugger at the bottom of the tree. We looked at the call stack to see the sequence of recursive calls so far. Each stack frame represents a decision point on the path from the start to here. Those stack frames waiting on the call stack are the “memory” of how we got here. When we reach the base case, we will pop its stack frame from the call stack and uncover the previous stack frame, which becomes the new topmost frame. Execution resumes in that frame and we pick up where we left off.
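
If you do not have the debugger handy, a logging variant can approximate the same view. The sketch below is my own illustration (traceSubsets, depth, and log are made-up names): it records one line per decision and per base case, indented by recursion depth, so the down-and-back order of the traversal shows up in the log.

```cpp
#include <string>
#include <vector>

// Same recursion as listSubsets, but it logs each decision and each base
// case, indented two spaces per level of recursion depth.
void traceSubsets(const std::string& input, const std::string& soFar,
                  int depth, std::vector<std::string>& log)
{
    std::string indent(2 * depth, ' ');
    if (input.empty()) {
        log.push_back(indent + "base case {" + soFar + "}");
        return;   // this frame is popped; the caller resumes where it left off
    }
    char consider = input[0];
    std::string rest = input.substr(1);
    log.push_back(indent + consider + "? yes");
    traceSubsets(rest, soFar + consider, depth + 1, log);
    log.push_back(indent + consider + "? no");    // back in this frame after the return
    traceSubsets(rest, soFar, depth + 1, log);
}
```

For the input "AB", the log has ten lines. The line "A? no" does not appear until both subsets that include A have reached the base case, which is the backtracking step that is easy to miss.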

Understanding how/when/why the code backtracks to a previous decision point is perhaps the trickiest part of all in recursive backtracking. I highly recommend that you sketch a decision tree and walk through its traversal and/or step in the debugger to confirm your understanding of how it moves up and down the tree. I put an exercise in the HW4A debugger task about using step in/out/over that may be useful to try here.

Choose-explore-unchoose

This choose-explore-unchoose structure is a classic pattern for recursive backtracking. Here it is summarized in pseudocode:

void explore(options, soFar)
{
    if (no more decisions to make) {
        // base case
    } else {
        // recursive case, we have a decision to make
        for (each available option) {
            choose (update options/soFar)
            explore (recur on updated options/soFar)
            unchoose (undo changes to options/soFar)
        }
    }
}

I may have misspoken in lecture, as I did not mean to suggest that every recursive problem fits the above pattern. I should be more clear that the above structure is common specifically to recursive backtracking algorithms.

The details in the pseudocode are intentionally vague, e.g. what it means to “update options/soFar” or what is meant by “each available option”. These details are specific to a particular search space. If you apply the general pattern to generating sequences of coin flips, the concrete details become:

State is length of sequence to generate, and sequence so far assembled
The decision is what next flip to add to sequence
Available options are H and T
Update state by adding flip to current sequence, decrement length
No explicit unchoose needed
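
As a rough sketch of how those details might look in code (generateFlips and result are names I am introducing here, and it collects sequences rather than printing them):

```cpp
#include <string>
#include <vector>

// State: `length` flips still to add, and `soFar`, the sequence assembled so far.
void generateFlips(int length, const std::string& soFar,
                   std::vector<std::string>& result)
{
    if (length == 0) {
        result.push_back(soFar);               // no more decisions to make
    } else {
        for (char flip : std::string("HT")) {  // available options: H and T
            // choose + explore: the recursive call receives a new, longer string
            generateFlips(length - 1, soFar + flip, result);
            // no explicit unchoose: soFar in this frame was never modified
        }
    }
}
```

The unchoose step is implicit because soFar + flip builds a fresh string for the recursive call. If soFar were instead modified in place, the added flip would have to be removed after the call returns.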

Can you apply the general pattern to the letter sequences, permutations, and subsets?

Jumbled states

I put up the code for a mysterious explore function. I asked the students to spend some time looking over the code and talking with their neighbors.

// Try to puzzle out what this function does and what it prints.
// Consider: What decision does it make?
//           What options are available for the decision?
//           What "state" is being maintained during the exploration?
//           How does it update the state before recursing?
//           When does exploration stop?
//           When and what does it print?
void explore(Lexicon& lex, int nchoices, Set<string>& options, string soFar = "")
{
    if (nchoices == 0) {
        if (lex.contains(soFar)) {
            cout << soFar << endl;
        }
    } else {
        for (string choice: options) {
            explore(lex, nchoices - 1, options, soFar + choice);
        }
    }
}

As a class, we worked out that the loop chooses a string from a set of options. The function appends the chosen string to the current sequence and recurses. After making a total of N such choices (where N is the initial value of nchoices), it prints the sequence if it forms a valid word.

We ran the call explore(lex, 3, state_postal_codes) which printed these words:

ALMOND
CANDOR
INLAND
LARINE
MARINE

This call prints words that can be formed by concatenation of state postal codes!
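
To try the same idea without the course library, here is a sketch under some assumptions: std::set<std::string> stands in for both the Lexicon and the Set<string>, the name exploreWords is mine, and found words are collected in a vector instead of being printed.

```cpp
#include <set>
#include <string>
#include <vector>

// Assumption: a std::set of words plays the role of the Lexicon, and a
// std::set of strings plays the role of the options.
void exploreWords(const std::set<std::string>& lexicon, int nchoices,
                  const std::set<std::string>& options,
                  const std::string& soFar, std::vector<std::string>& found)
{
    if (nchoices == 0) {
        if (lexicon.count(soFar) > 0) {     // only full-length sequences are tested
            found.push_back(soFar);
        }
    } else {
        for (const std::string& choice : options) {
            exploreWords(lexicon, nchoices - 1, options, soFar + choice, found);
        }
    }
}
```

With a toy lexicon of CANE, MARINE, and OR and the options CA, IN, MA, NE, OR, and RI, a call with nchoices of 3 finds only MARINE, since the other two words are built from fewer than three codes.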

We spent the rest of the lecture playing with this code. A student asked why the function only found words of exactly 3 postal codes, not 2 or 4. This is because it tests whether the sequence is a word only in the base case, which is reached only after making exactly the maximum number of choices. We moved the word test out of the base case so it would report words of fewer postal codes as well.

void explore(Lexicon& lex, int nchoices, Set<string>& options, string soFar = "")
{
    if (lex.contains(soFar)) {
        cout << soFar << endl;
    }
    if (nchoices > 0) {
        for (string choice: options) {
            explore(lex, nchoices - 1, options, soFar + choice);
        }
    }
}

We ran the code after this change and the list now included shorter words such as OR and CANE.

I added a return after the cout statement, so that when it found a word it would print and return without attempting to further extend the sequence. I asked the class how this would change the output. We agreed that it would not find any longer words that contained a prefix that was also a word and confirmed that by running the program. We deleted that return statement so we found both the prefixes and the longer extensions.

I asked whether the function would permit reuse of a postal code. There was a brief dissension but we looked closely at how options is not changed throughout the function. We concluded all postal codes are considered at each decision point, regardless of any earlier decision. We confirmed we had duplicate use such as NENE in the output.

I mentioned that a tree with 50 options for each decision has a lot of paths to explore. Even just 4 choices lead to a whopping 6 million paths (50^4 = 6,250,000). I showed how we could add a pruning step to stop short on dead ends. For example, let’s say we’ve assembled “CATX” so far, which is not looking promising. If we know that our dictionary contains no words that start with “CATX”, there is no point in trying to extend it with more postal codes, because we are never going to construct a word. The Lexicon ADT has a function containsPrefix that reports whether any words start with a given prefix. I used it here to detect when we have a non-viable prefix and thus stop exploring further on this path:

if (!lex.containsPrefix(soFar)) {
    return;
}

This doesn’t change what words are found, but it does speed up the function, as it is no longer spending time futilely exploring paths that can’t lead to words. I mentioned that this check was introduced as a form of pruning, but it also acts as a kind of base case.
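
For a feel of what a prefix check involves, here is a small stand-in built on a sorted std::set. This is only an illustration and not how the Lexicon implements containsPrefix; hasPrefix is a hypothetical helper name.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a prefix query on a sorted set of words.
// lower_bound finds the first word that is not less than `prefix`; if any
// word starts with `prefix`, that word must be one of them.
bool hasPrefix(const std::set<std::string>& words, const std::string& prefix)
{
    auto it = words.lower_bound(prefix);
    return it != words.end() && it->compare(0, prefix.size(), prefix) == 0;
}
```

With the words CANDOR, CANE, and MARINE, hasPrefix reports true for "CA" and false for "CATX", so a search holding "CATX" could stop right there.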

Lastly, I removed the logic that stopped at a maximum number of postal codes. Below is the version we ended on:

void explore(Lexicon& lex, Set<string>& options, string soFar = "")
{
    if (!lex.containsPrefix(soFar)) {
        return;
    }
    if (lex.contains(soFar)) {
        cout << soFar << endl;
    }
    for (string choice: options) {
        explore(lex, options, soFar + choice);
    }
}

I ran the code and we saw a few words of length 8 were now found in addition to the shorter words we saw previously.

Next time, we can talk through how this code is working despite no longer having any explicit base case!

The inspiration for preparing these notes was the work of my colleague Stuart Reges at University of Washington.