Graph Coding

Monday, August 11

We explore implementations of topological sort and Dijkstra's algorithm in all their glory!

🎬 Lecture video on Canvas [toggle visibility]
📎 graph-algorithms.zip

Contents

1. Overview and the WeightedGraph Class

2. Topological Sort Refresher, Example Graph, and Input File

3. Cycles and Topological Sort

4. Dijkstra's Algorithm Refresher, Example Graph, and Input File

5. What's next?

6. Exam Prep

Overview and the WeightedGraph Class

Attachment: graph-algorithms.zip

We started today with the implementation of a homebrewed WeightedGraph class. We then added implementations of topological sort and Dijkstra's shortest paths algorithm to the code, briefly reviewing the key concepts behind those algorithms along the way. See attached.

Topological Sort Refresher, Example Graph, and Input File

Recall that a topological sort is an ordering of vertices in a directed graph where we can only list some node, j, after we have listed all nodes i where there is an edge from i to j in the graph. The word "ordering" here refers to a permutation (or arrangement) of our vertices. A valid topological sort must contain every node in our graph exactly once. If any node is missing, or if a node is listed more than once, that's not a valid topological sort.

For example, in the graph below, node h has two incoming edges: one from b, and one from e. Accordingly, any valid topological sort for this graph must list nodes b and e before listing node h:

In class today, we fed this graph to our WeightedGraph constructor with the following input file:

topological-graph.txt

9
a
b
c
d
e
f
g
h
k
0 0 1 0 1 0 0 0 0
0 0 0 0 0 0 1 1 0
0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 1 0 1 0
0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0

One valid topological sort for the graph above is as follows:

b g k d a e h f c

If we rearrange these nodes into a straight line in that order (without adding or removing any edges), we see a sort of flow in which all edges point to the right, never the left:

(Important note!) A topological sort does not have to correspond to a valid path through the graph!

Cycles and Topological Sort

Note that if a graph has a cycle, then it has no valid topological sort. Consider, for example, the following graph. A topological sort of a graph must include every node in that graph exactly once, but there is no way for us to list nodes B, C, or D in this example because of the cyclical dependencies in that part of the graph:

After observing this today, we ensured our implementation of topological sort printed an error message if the graph it received had a cycle. That happened if, and only if, the while-loop terminated without adding all the nodes in our graph to our final topological sort sequence.

Dijkstra's Algorithm

We also implemented Dijkstra's algorithm today, focusing on this example graph from last week's slides:

In class today, we fed this graph to our WeightedGraph constructor with the following input file:

dijkstra-graph.txt

9
A
B
C
D
E
F
G
H
I
0 5 0 0 0 0 9 18 1
5 0 7 0 0 0 0 0 0
0 7 0 8 11 0 0 0 6
0 0 8 0 20 0 0 0 0
0 0 11 20 0 4 0 0 2
0 0 0 0 4 0 1 0 0
9 0 0 0 0 1 0 2 3
18 0 0 0 0 0 2 0 0
1 0 6 0 2 0 3 0 0

For ease of reference, the solution (using "A" as our source vertex) was as follows:

In class today, we coded up the more exotic implementation of this algorithm we had discussed last week: the one that allows garbage entries to accumulate in a minheap and leads to an O(n² log n) runtime. The O(n²) approach that relies on repeatedly looping through the dist vector to find the smallest value associated with an unvisited node is left to everyone as an exercise.

What's next?

Tomorrow (Tuesday) (and possibly Wednesday if we need more time), we'll do a course wrap where where I'll put a bow on some of the topics we've talked about this quarter, share some expectations for the final exam (and other end-of-quarter logistics), and talk a bit about what comes next in life after 106B.

On Friday, we have the final exam.

Exam Prep

Please don't stress out about today's practice problems! All the material we covered pertaining to topological sort and Dijkstra's algorithm last week is fair game for the exam, but I won't test you directly on anything related to the coding we did in class today that wasn't already expressed in class last week. I still think it's a good idea to glance through the exercises below, as there's some good food for thought here that might be worth reflecting on before the final exam (I'm thinking mostly of Exercises #6 and #7 as I say this), but the actual coding aspects of most of these exercises are primarily just here for you should you want some additional practice or a chance to strengthen your understanding of coding with graphs. If you don't have time to push through these this week, I encourage you to revisit them after the course has concluded.

1. Review the constructor for the WeightedGraph class in the file attached at the top of today's notes. What additional checks might we add to ensure that it always throws useful error messages if the graph input file it receives is not formatted according to expectations?

2. The WeightedGraph class uses an adjacency matrix where _matrix[i][j] == 0 indicates the absence of an edge from i to j. The problem with this approach is that we can't have an edge with a weight of zero in any of our graphs. What options do we have to get around that and allow edges of weight zero to exist in our graphs? What are some of the advantages and disadvantages of the various options you can think of? What implications might your proposed solution(s) have for our ability to use the WeightedGraph class to represent unweighted graphs, whose input matrices consist entirely of zeros and ones?

3. Add BFS and DFS functions to the WeightedGraph class and test that they work correctly on a variety of sample inputs. Be sure to code up DFS iteratively using a stack, as well as recursively. What are the runtimes for your implementations of BFS and DFS? How much space do they use in the worst case?

4. Try your hand at coding up both topological sort and Dijkstra's algorithm from scratch, without referring to the solutions included in the code attached above! Be sure to implement cycle detection in your implementation of topological sort and path recovery (i.e., printing the actual path taken, not just its cost) in your implementation of Dijkstra's algorithm.

5. Modify today's implementation of topological sort to choose a random node whenever there is more than one that could come next in the ordering it is constructing.

6. With topological sort, if we want to choose a random node from the options available to us at each iteration, we could just keep them in a vector (let's call that vector availableNodes), choose a random index on the range 0 through availableNodes.size() - 1, and remove the value at that index. With this approach, each iteration has a worst-case runtime of O(k) (where k is the size of our availableNodes vector), which happens when we select and remove the element at the beginning of the vector. How could you implement that operation in O(1) time regardless of the length of the vector, while still selecting a totally random element for removal out of all the options available to us in that vector?

7. Write a backtracking function that prints every possible topological sort for a graph exactly once (rather than simply printing the first topological sort it finds, as was the case with our function from class today).

8. Code up an O(n²) implementation of Dijkstra's algorithm that uses just the dist and visited vectors from the approach we saw in class today, and not a priority queue.