Introduction to Linked Lists

Monday, July 28

Today we introduce our first node-based linked data structure: the linked list!

📚 Readings: Text 12.2, 13.5
📝 Lecture quiz on Canvas
🎬 Lecture video on Canvas [toggle visibility]

Contents

1. Warning: Here Be Dragons🔥🐉

2. Preliminary Note: nullptr

3. Incredibly Important! Preliminary Note: Passing Pointers by Reference 🤯

4. Arrays and Linked Lists: Summary of Key Points

5. Linked List Visualization: The Basic Anatomy of a Linked List

6. Draw Diagrams and Embrace the Memory Addresses!

7. Coding Tidbit: The Arrow Operator (->)

8. First Draft: A Clunky Assemblage of a Linked List

9. Second Draft: A More Refined Approach to Node Creation (and a Print Function)

10. Third Draft: A More Elegant Assemblage of a Linked List (a Tail Insertion Function)

11. Deletion Functions (and Memory Leak Checks)

12. Cheeky Aside: Nodes That Hold Characters

13. What's next?

14. Exercises

Warning: Here Be dragons! 🔥🐉

I introduced linked lists today, and I started with a bit of a warning: these can be really tough when you first see them! They're quite sophisticated in that they involve a lot of pointer manipulation and dynamic memory management. I encourage you to dig in on the lecture notes, start your linked lists assignment early, re-watch today's lecture as needed, draw plenty of memory diagrams as you work with this topic, head to LaIR and office hours for help, and ask plenty of questions.

Perhaps most importantly: don't feel discouraged if you find this topic tricky. It's one of the harder things we'll cover in this class, and you are definitely not alone if you find yourself struggling with this material. Remember, the course staff understands that this can be tough, and we are here to support you, but we can only do that if you come ask us for help.

Preliminary Note: nullptr

Before diving into today's main topic (linked lists), I mentioned there is a special value in C++ that we can assign to a pointer to indicate that it isn't pointing anywhere useful: nullptr. We often use this when we create a pointer that we're not ready to have point to another variable yet, or to obliterate our copy of a memory address to some dynamically allocated space that we have since freed with delete. We can then check whether a pointer is null or not before dereferencing it. This is called "coding defensively." For example:

#include <iostream>
#include "console.h"
using namespace std;

int main()
{
   int *ptr = nullptr;

   // Suppose we have many lines of code here, and the logic is complex and twisty,
   // and we aren't quite sure at first glance whether ptr has ever been updated to
   // point to an actual integer. If not, we don't want to dereference it, because
   // that could lead to a segmentation fault and crash our program.
   //
   // No worries! We can just check whether its still nullptr, like so:

   if (ptr != nullptr)
   {
      cout << "Aha! It's safe to dereference ptr." << endl;
      *ptr = 50;
   }
   else
   {
      cout << "OoOOOOoh! We didn't dereference the pointer! Hooray!" << endl;
   }

   return 0;
}

output:

OoOOOOoh! We didn't dereference the pointer! Hooray!

(Important note!) Dereferencing a null pointer will cause a segmentation fault ("segfault" for short) and crash our program:

#include <iostream>
#include "console.h"
using namespace std;

int main()
{
   int *ptr = nullptr;

   // This will cause a segmentation fault!
   *ptr = 50;

   return 0;
}

output:

*** STANFORD C++ LIBRARY
*** The LinkedLists program has terminated unexpectedly (crashed)
*** A segmentation fault (SIGSEGV) occurred during program execution

    This error indicates your program attempted to dereference a pointer
    to an invalid memory address (possibly out of bounds, deallocated, nullptr, ...)

*** To get more information about a program crash,
*** run your program again under the debugger.

Preliminary Note: Passing Pointers by Reference 🤯

This topic is covered at timestamp 9:00 in today's lecture. (We lay the foundation at 9:00, and the reference is introduced at 19:35. (In between, we had some really awesome questions about passing pointers to functions. Thank you to everyone who asked questions in class today! This is complex material, and I super appreciate the questions that have come up about it!)

As we work with linked lists, we will have to write functions that can modify pointers that we pass to them. To do that, we will pass pointers by reference. To lead into that, I want to start with a bit of a simplified example. Suppose we have a pointer to an integer in main():

int main()
{
   int x = 50;
   int *ptr = &x;

   // ... presumably, we would do other things here ...

   return 0;
}

Memory Diagram: Here's what things look like before we return 0; in main(). (The memory addresses are made up, but the struggle is real.)

main():

   x     0xdec08
   +-----------+
   |    50     |
   +-----------+

   ptr   0x55824
   +-----------+
   |  0xdec08  |
   +-----------+

So far, so good?

Now, suppose we pass that pointer to a function that takes an int * parameter. If we do that, we will get an entirely new copy of that pointer in our other function. Changing its value there will not change the value of ptr back in main(). Check this out:

void illuminate(int *ptr)
{
   ptr = nullptr;
}

int main()
{
   int x = 50;
   int *ptr = &x;

   illuminate(ptr);

   // ... presumably, we would do other things here ...

   return 0;
}

Memory Diagram: Here's what things look like when we call illuminate(ptr) -- before executing the ptr = nullptr line.

illuminate():

   ptr
   +------------+
   |  0xdec08   | <-- illuminate() has its own local variable called ptr!
   +------------+     This is *not* the same variable as the one in main()!

main():

   x     0xdec08
   +-----------+
   |    50     |
   +-----------+

   ptr   0x55824
   +-----------+
   |  0xdec08  |
   +-----------+

Memory Diagram: Here's what things look after executing the ptr = nullptr line in illuminate().

illuminate():

   ptr
   +------------+
   |  nullptr   | <-- We only changed the local copy of nullptr.
   +------------+     This does *not* affect the copy back in main()!

main():

   x     0xdec08
   +-----------+
   |    50     |
   +-----------+

   ptr   0x55824
   +-----------+
   |  0xdec08  | <-- This remains unaffected!
   +-----------+

If we want to empower the illuminate() function to go back to main() and change the value inside that ptr variable, we need to pass it by reference. As we saw in our first lecture on pass-by-reference parameters, the way to create a reference is to inject an ampersand (&) in front of the parameter name in question:

// Super important note!
// By injecting an ampersand here, we turn this ptr variable into a reference to the
// pointer variable back in whatever function called this one. ptr is now an alias or
// synonym for the corresponding parameter back in our calling function. If we set
// ptr equal to something, we're actually changing the value contained in the
// corresponding variable back in our calling function.
void illuminate(int *&ptr)
{
   ptr = nullptr;
}

int main()
{
   int x = 50;
   int *ptr = &x;

   illuminate(ptr);

   // ... presumably, we would do other things here ...

   return 0;
}

Memory Diagram: Here's what things look like when we call illuminate(ptr) -- before executing the ptr = nullptr line.

illuminate():

   ptr
   +------------+
   |     🌀     | <-- Because we passed by reference, this is now just a portal to
   +------|-----+     the ptr variable back in main()! It no longer has its own copy
          |                   of the memory address in question (0xdec08)!
main():   +----------+
                     |
   x     0xdec08     |
   +-----------+     |
   |    50     |     |
   +-----------+     |
                     |
   ptr   0x55824     |
   +-----------+     |
   |  0xdec08  <-----+
   +-----------+

Memory Diagram: Here's what things look after executing the ptr = nullptr line in illuminate().

illuminate():

   ptr
   +------------+
   |     🌀     | <-- Nothing has changed here. This is still just a portal to the
   +------|-----+     ptr variable back in main()!
          |
          |
main():   +----------+
                     |
   x     0xdec08     |
   +-----------+     |
   |    50     |     |
   +-----------+     |
                     |
   ptr 0x55824       |
   +-----------+     |
   |  nullptr  <-----+   <-- This value changed! Since the ptr in illuminate() was just
   +-----------+             a reference (i.e., portal) to this pointer, changing ptr
                             in illuminate() actually changed the value of this pointer
                                                  back in main()!

(Key take-away!) If you have a pointer in some function foo(), and you want some function bar() to be able to change the value that the pointer in function foo() contains, you should pass that pointer to bar() by reference using a *& parameter.

(Important note!) Understanding the content in this section will be absolutely critical as we move through the quarter. If anything about this is unclear, please seek out clarification immediately. You can talk to other people in the class, your SL, or head to LaIR or office hours. We want to help you understand what's going on here and set you up for success moving forward!

(Side note...) We could also give illuminate() the power to change ptr back in main() by passing the function a "double pointer" (a pointer to a pointer: int **ptr), but that's a bit more complex. We'll just stick to pointer references for this quarter to keep things a bit more manageable.

Arrays and Linked Lists: Summary of Key Points

After those preliminary notes, and then peppered throughout the remainder of the class, I made some comments about the nature of arrays and linked lists and some of the comparative strengths and weaknesses of these data structures. We saw that the strengths and weaknesses of linked lists are effectively the inverse of the strengths and weaknesses of arrays. Here is a summary of those remarks:

Arrays
- Structure
  - Recall that the cells of an array are contiguous in memory, meaning they're stored right next to one another in one big chunk.
- Strengths
  - We have O(1) access to any cell in an array. When we access, say, array[3], C++ does a quick calculation behind the scenes: it knows the base memory address where the array is stored, and it adds to that however much memory it takes to move forward three cells from that address. The size of a cell is based on how many bytes are in the type of data we're storing in that array, and C++ knows that data type from our array declaration. The calculation and subsequent jump to that memory address is incredible fast and occurs in O(1) time. (When we access array[3], C++ does not start at the beginning of the array and loop forward three cells, one by one.)
  - We can binary search a sorted array, which is just so fantastic. We love ourselves an O(log n) runtime.
- Potential Weaknesses (depending on application)
  - We sometimes underestimate how large an array needs to be when we first create it. This is often the case with arrays underlying ADTs like vectors, stacks, and queues. If we create an array with an initial length of 10 but then add 11 elements to it, the insertion of that 11th element forces us to stop and allocate a new, larger array, then copy the contents of the old array into the new one, and free up the old array -- a slow, expensive, O(n) operation.
  - Conversely, we sometimes overestimate how large an array needs to be when we first create it. This can lead to wasted space in memory. This might not be a huge deal if we only have one such array in memory, but if we have several -- or if we're running several programs at once that waste memory unnecessarily -- the strain on system resources can actually impact the performance of our programs.
  - Maintaining a sorted array can be expensive. If we have a bunch of sorted elements in indices 0 through k in an array of length n (where k < n), adding a new smallest element at index 0 requires us to scooch over all k existing elements, one by one -- a slow, O(k) operation. (Adding a new largest element at index n, however, is a fast, O(1) operation -- provided the array still has an unused cell for us to occupy.)
  - Because the cells of an array must be contiguous in memory, it's possible that we could have trouble finding space for an array in a memory-constrained system where we technically have enough memory to hold all the elements, but have a situation where memory has become highly fragmented (i.e., we have a pocket of memory here and a pocket of memory there, and our total memory free across all those pockets or fragments is certainly enough to hold all the elements we want, but there isn't one contiguous chunk available for that purpose).
Linked Lists
- Structure
  - Whereas arrays are made up of cells, linked lists are made up of nodes.
  - Just as each cell in an array contains a single value, each node in a linked list contains a single value. Today, I focused on linked lists where each node held an integer, but we can code up our nodes to hold any type of data we want.
  - The nodes of a linked list are scattered throughout memory. Each node contains a memory address telling us where to go to find the next node in our list. In this way, the nodes are "linked" together (hence the named "linked list"). Journeying through a linked list is like following a trail of breadcrumbs through memory.
  - We bundle together these pieces of information (a data field and a next pointer) in node struct.
  - We refer to the first node in our linked list as the head of the list. We maintain a pointer to that node (often in a variable called head), and it serves as our primary (and often only) entry point to the list.
- Strengths
  - Linked lists grow dynamically to accommodate the number of elements they contain. We never have wasted space (in the sense of allocating more nodes than we need in our list), and our list never "fills up" in a way that requires an disruptive, expensive reallocation or copying-over of all our elements into new spaces.
  - Adding to the beginning of a linked list is a fast operation! Even if we have a huge linked list with n elements, adding a new element to the beginning of our list is simply a matter of creating a new node, setting its next pointer to the old head of our linked list, and setting the head to point to our newly created node. All of this can be done in O(1) time, regardless of how long our list is.
- Potential Weaknesses (depending on application)
  - Accessing the k^th element is a slow, O(k) operation (compared to the array's O(1) access of an arbitrary cell) because we have to start at the head of the linked list and loop forward through our nodes one by one. This means that adding nodes to the middle or end of our linked list is expensive. (The opposite was true of an array. Inserting toward the end of an array was fairly fast because it required us to scooch fewer elements over.) We will see later, however, that we can achieve fast insertion at the very end of a linked list, and there is a variant of linked lists that allows for quick insertion at other positions prior to the very end of the list, as well.
  - (Expansion on comment from lecture today.) Because of the linked nature of these lists, and because we do not maintain pointers that allow us O(1) access to the k^th element for all k, we cannot perform binary search on a traditional linked list. We're stuck with a search runtime that is linear in the worst case rather than logarithmic.
  - (Expansion on comment from lecture today.) The nodes take up more memory than the cells in an array. On most systems these days, a C++ integer takes up 4 bytes (32 bits) of memory. So, an array of n integers requires 4n bytes of memory. Each node in a linked list holds not only a 4-byte integer, but also a next pointer. On most systems these days, a C++ pointer takes up 8 bytes (64 bits) of memory. That means each node takes a total of 12 bytes of memory, and a linked list with n nodes requires approximately 12n bytes in total -- 3x the amount of space required for an array of the same length!
Overbranching Comments
- Just as we can use arrays to build up abstract data structures like vectors, stacks, and queues, we can also use linked lists to build up ADTs. The trade-offs here are as described above.

Linked List Visualization: The Basic Anatomy of a Linked List

In class, I ran a program that generated a linked list diagram, where the values (data) in the diagram were based on command line input in my terminal. We used this to explore the basic anatomy of a linked list and see how nodes are linked together in memory, despite the fact that they're not necessarily contiguous in memory like arrays are. Here, we see successively larger linked lists and explore how the nodes are linked together:

seansz ~/Desktop $ listmaker

   head = NULL  (empty list)

seansz ~/Desktop $ listmaker 87

 addr:    0xf9800
        +---------+
 data:  |   87    |
        +---------+
 next:  | nullptr |
        +---------+
           ^
         head = 0xf9800

seansz ~/Desktop $ listmaker 87 93

 addr:    0xf9800          0xf4d33
        +---------+      +---------+
 data:  |   87    |      |   93    |
        +---------+ ---> +---------+
 next:  | 0xf4d33 |      | nullptr |
        +---------+      +---------+
           ^
         head = 0xf9800

seansz ~/Desktop $ listmaker 87 93 12

 addr:    0xf9800          0xf4d33          0xc625e
        +---------+      +---------+      +---------+
 data:  |   87    |      |   93    |      |   12    |
        +---------+ ---> +---------+ ---> +---------+
 next:  | 0xf4d33 |      | 0xc625e |      | nullptr |
        +---------+      +---------+      +---------+
           ^
           head = 0xf9800

seansz ~/Desktop $ _

Despite the convention of drawing nodes in neat rows, connected by arrows, the nodes are scattered throughout memory. A more accurate diagram would look like the following, but this is chaotic and hard to read, so we never really depict linked lists this way:

seansz ~/Desktop $ listmaker 87 93 12 --chaos-mode

                                                                0x1197d
                                                              +---------+
                                                        data: |   87    |
                                                              +---------+
                                                        next: | 0xaf30c |
                                                              +---------+

                     0x7c264
                   +---------+
             data: |   12    |
                   +---------+
             next: | nullptr |
                   +---------+





                                                     0xaf30c
                                                   +---------+
                                             data: |   93    |
                                                   +---------+
                                             next: | 0x7c264 |
                                                   +---------+

   head = 0x1197d

seansz ~/Desktop $ _

Draw Diagrams and Embrace the Memory Addresses!

When you're writing code with linked lists, and especially when you're trying to debug linked list code, be sure to...

DRAW LOTS OF DIAGRAMS!

And include memory addresses in your diagrams (whether made up or copied from the output of your programs) to make things more concrete.

It's true that in the real world, we often see linked lists drawn as follows, without any memory addresses attached:

seansz ~/Desktop $ listmaker 87 93 12 --basic-mode

     +------+     +------+     +------+
     |  87  | --> |  93  | --> |  12  |
     +------+     +------+     +------+
        ^
        head

seansz ~/Desktop $ _

This is fine for a quick and dirty visualization, but including memory addresses in the diagrams we draw can be super helpful when debugging code that involves linked lists. (The diagram above also doesn't quite make the distinction that head is a separate variable in memory that just holds the address of the head node; it is a pointer and not an actual node itself.)

My personal opinion is that when we're first learning about linked lists, abstracting away too many details actually makes it harder to understand how they work and what's going on. Even though memory addresses might look intimidating at first, embracing the details and complexity of what's really happening in our nodes -- and not being afraid to print memory addresses to the screen, draw them in our diagrams, or keep track of them in the debugger -- will lead to a better understanding of linked lists and stronger coding competency.

Coding Tidbit: The Arrow Operator (->)

We have seen previously that if we want to access a field within a struct variable, we use the dot operator (.). Today, we saw that if we have a pointer to a struct, we can access its fields using the arrow operator (->). If we instead try to apply a dot operator to a struct pointer, C++ will give us an error; it won't even compile our program. (See below for examples.)

(Important note!) Instead of using the -> operator as in myStruct->myField, we could instead do something like this: (*myStruct).myField. Please never do that, though! That is very unconventional, involves more keystrokes than simply typing ->, and is considered by many to be poor style / poor form.

First Draft: A Clunky Assemblage of a Linked List

Finally, we dove into linked list code. We created a Node struct, and we then collaboratively worked through the code to construct a linked list in main() in a slow, clunky way:

#include <iostream>
#include "console.h"
using namespace std;

struct Node
{
   int data;
   Node *next;  // a pointer to a Node -- the next node in our list
};

int main()
{
   Node *head = nullptr;

   head = new Node;
   head->data = 10;
   head->next = nullptr;

   // Here's what we have so far:
   //
   // addr:    0xf9800
   //        +---------+
   // data:  |   10    |
   //        +---------+
   // next:  | nullptr |
   //        +---------+

   // Let's continue hooking up nodes manually. In class, we carefully explored how
   // the following lines constructed a valid linked list. (See today's lecture video
   // for details.)

   head->next = new Node;
   head->next->data = 25;
   head->next->next = nullptr;

   // We now have the following:
   //
   // addr:    0xf9800          0xf4d33
   //        +---------+      +---------+
   // data:  |   10    |      |   25    |
   //        +---------+ ---> +---------+
   // next:  | 0xf4d33 |      | nullptr |
   //        +---------+      +---------+

   // Let's keep going...

   head->next->next = new Node;
   head->next->next->data = 12;
   head->next->next->next = nullptr;

   // We now have the following:
   //
   // addr:    0xf9800          0xf4d33          0xc625e
   //        +---------+      +---------+      +---------+
   // data:  |   10    |      |   25    |      |   12    |
   //        +---------+ ---> +---------+ ---> +---------+
   // next:  | 0xf4d33 |      | 0xc625e |      | nullptr |
   //        +---------+      +---------+      +---------+

   // WARNING!
   // This has a memory leak! We aren't freeing the list before returning from main().

   return 0;
}

That's pretty painful. We obvious can't keep this up. If we want to add a 100th node to a linked list, it would be absurd to have a condition that checked for that and then executed a line with 99 instances of ->next. Let's refine our approach a bit.

Second Draft: A More Refined Approach to Node Creation (and a Print Function)

When I see chunks of repeated code like what we have above (create node, set data field, set next field to nullptr), start to think about using functions to make my code more readable, compact, and maintainable (if I discover a bug in those lines of code, I only want to have to update one copy of those lines -- not several copies that are scattered all over main()). Here's how we did that:

#include <iostream>
#include "console.h"
using namespace std;

struct Node
{
   int data;
   Node *next;  // a pointer to a Node -- the next node in our list
};

// Dynamically allocates a new node with the given data and returns a pointer
// to that node.
Node *createNode(int data)
{
   Node *n = new Node;
   n->data = data;
   n->next = nullptr;
   return n;
}

// Prints the contents of our linked list.
void printList(Node *head)
{
   // Remember that this local variable called 'head' is separate from the one
   // called 'head' down in main()! We can change the address stored in this local
   // copy without changing what 'head' points to back in main(). So, we could
   // have moved 'head' forward through this list instead of creating the helper
   // variable, 'current'.

   Node *current = head;

   while (current != nullptr)
   {
      cout << current->data;

      // If this isn't the last node, print an arrow, because there's more to come.
      if (current->next != nullptr)
      {
         cout << " -> ";
      }

      // Move our pointer forward in the list!
      current = current->next;
   }

   cout << endl;
}

int main()
{
   Node *head = nullptr;

   // We have dramatically refined our approach here, condensing 9 lines down to 3.
   head = createNode(10);
   head->next = createNode(25);
   head->next->next = createNode(12);

   // We now have the following:
   //
   // addr:    0xf9800          0xf4d33          0xc625e
   //        +---------+      +---------+      +---------+
   // data:  |   10    |      |   25    |      |   12    |
   //        +---------+ ---> +---------+ ---> +---------+
   // next:  | 0xf4d33 |      | 0xc625e |      | nullptr |
   //        +---------+      +---------+      +---------+

   printList(head);

   // WARNING!
   // This has a memory leak! We aren't freeing the list before returning from main().

   return 0;
}

(Important note!) We traced through the construction of the printList() function above in class today, starting at timestamp 55:50 in our lecture.

(Important note!) Rather than creating an auxiliary function that exists outside our Node struct, we could create a constructor function within the struct. (C does not allow that sort of thing, but C++ does.) We will see that alternative approach next time.

Third Draft: A More Elegant Assemblage of a Linked List (a Tail Insertion Function)

At the end of class, we saw the following approach to inserting elements at the tail of our linked list. Note the use of a reference parameter, which is described at length in one of the sections of notes above (and which we discussed in detail at timestamps 9:00 and 19:35 in the lecture). The tailInsert() function below is developed starting at timestamp 1:04:25 in today's lecture.

(Important note!) In the code below, if we hadn't initialized head to nullptr in main(), and if we didn't have a compiler flag set to prevent us from using uninitialized variables, then head would contain a garbage memory address, and our tail insertion function would likely have crashed (unless that garbage address just happened to be nullptr, which is certainly within the realm of possibility).

(Important note!) The tail insertion function below has a linear runtime. We will see a trick next time that actually allows us to achieve O(1) tail insertions.

#include <iostream>
#include "console.h"
using namespace std;

struct Node
{
   int data;
   Node *next;  // a pointer to a Node -- the next node in our list
};

// Dynamically allocates a new node with the given data and returns a pointer
// to that node.
Node *createNode(int data)
{
   Node *n = new Node;
   n->data = data;
   n->next = nullptr;
   return n;
}

// Prints the contents of our linked list.
void printList(Node *head)
{
   // Remember that this local variable called 'head' is separate from the one
   // called 'head' down in main()! We can change the address stored in this local
   // copy without changing what 'head' points to back in main(). So, we could
   // have moved 'head' forward through this list instead of creating the helper
   // variable, 'current'.

   Node *current = head;

   while (current != nullptr)
   {
      cout << current->data;

      // If this isn't the last node, print an arrow, because there's more to come.
      if (current->next != nullptr)
      {
         cout << " -> ";
      }

      // Move our pointer forward in the list!
      current = current->next;
   }

   cout << endl;
}

// Insert a new node at the end of the linked list.
void tailInsert(Node *&head, int data)
{
   // If we have an empty list, just create a new node (which becomes our head)
   // and return back to main().
   if (head == nullptr)
   {
      head = createNode(data);
      return;
   }

   Node *current = head;

   // Loop forward until 'current' points to last node, which happens when its
   // 'next' pointer is nullptr (not when current itself is nullptr; if that
   // happens, we will have fallen off the end of our list).
   while (current->next != nullptr)
   {
      current = current->next;
   }

   // When we get here, 'current' points to the last node in the list. We set
   // its 'next' field equal to a new node to tack onto the end of the list.
   current->next = createNode(data);
}

int main()
{
   Node *head = nullptr;

   tailInsert(head, 10);
   tailInsert(head, 25);
   tailInsert(head, 12);

   // We now have the following:
   //
   // addr:    0xf9800          0xf4d33          0xc625e
   //        +---------+      +---------+      +---------+
   // data:  |   10    |      |   25    |      |   12    |
   //        +---------+ ---> +---------+ ---> +---------+
   // next:  | 0xf4d33 |      | 0xc625e |      | nullptr |
   //        +---------+      +---------+      +---------+

   printList(head);

   // WARNING!
   // This has a memory leak! We aren't freeing the list before returning from main().

   return 0;
}

Deletion Functions (and Memory Leak Checks)

Toward the end of class, we briefly examined a very broken approach to deleting all the dynamically allocated nodes in a linked list (see the brokenDeleteList() function below), as well as solid iterative and recursive implementations of the function.

Note: While it's good to have a solid understanding of the iterative and recursive list deletion functions in this section, you do not have to worry about the memory diagnostic tools built into the Stanford C++ libraries. Those tools can be handy for tracking memory leaks in your programming assignments, but you won't be expected to use them on the final exam.

#include <iostream>
#include "console.h"
#include "MemoryDiagnostics.h"  // for memory leak checks
#include "SimpleTest.h"
using namespace std;

struct Node
{
  int data;
  Node *next;

  // This line ensures that if we fail to delete a dynamically allocated
  // node in a test case, we'll get some indication of that once the test
  // case is otherwise passing. Note that there is no alert given for
  // orphaned nodes that are allocated outside of a test case.
  TRACK_ALLOCATIONS_OF(Node);
};

// Returns the address of a dynamically allocated node whose fields have
// been initialized appropriately.
Node *createNode(int data)
{
  Node *n = new Node;
  n->data = data;
  n->next = nullptr;
  return n;
}

// Print our linked list with nice formatting.
void printList(Node *head)
{
  Node *current = head;

  while (current != nullptr)
  {
    cout << head->data;

    if (current->next != nullptr)
    {
      cout << " -> ";
    }

    current = current->next;
  }

  cout << endl;
}

// WARNING! This is a broken approach!
void brokenDeleteList(Node *head)
{
   while (head != nullptr)
   {
      delete head;
      head = head->next;  // OH NOOOOOOOooooooOO! Do not dereference
                          // deleted addresses! This is dangerous and
                          // can lead to unpredictable results!
   }
}

// An iterative version of a deleteList() function. Uses a temp pointer to
// resolve the issue in our broken version above.
void deleteList(Node *head)
{
  while (head != nullptr)
  {
    Node *temp = head->next;
    delete head;
    head = temp;
  }
}

// Recursive list deleting function. This is so neat! It works its way all the
// way through the linked list, then deletes nodes as it returns -- ensuring
// we only ever delete nodes after we have accessed their next fields.
void deleteListRecursive(Node *head)
{
  if (head == nullptr)
  {
    return;
  }

  deleteListRecursive(head->next);
  delete head;
}

// This just shows the behavior of a SimpleTest when it has a memory leak
// but otherwise passes.
PROVIDED_TEST("memleak check - no deletion")
{
  Node *head = createNode(10);
  head->next = createNode(15);
  head->next->next = createNode(12);

  printList(head);
}

// Memory leak check where we call our iterative deletion function.
PROVIDED_TEST("memleak check - iterative delete")
{
  Node *head = createNode(10);
  head->next = createNode(15);
  head->next->next = createNode(12);

  printList(head);
  deleteList(head);
}

// Memory leak check where we call our recursive deletion function.
PROVIDED_TEST("memleak check - recursive delete")
{
  Node *head = createNode(10);
  head->next = createNode(15);
  head->next->next = createNode(12);

  printList(head);
  deleteListRecursive(head);
}

int main()
{
  runSimpleTests(ALL_TESTS);
  return 0;
}

Cheeky Aside: Nodes That Hold Characters

At the very end of class, I quickly modified the Node struct to hold characters instead of integers and talked a bit about seeding random number generators along the way, which you have seen in your current assignment.

#include <iostream>
#include "console.h"
#include "random.h"
using namespace std;

struct Node
{
   char data;
   Node *next;  // a pointer to a Node -- the next node in our list
};

// Dynamically allocates a new node with the given data and returns a pointer
// to that node.
Node *createNode(char data)
{
   Node *n = new Node;
   n->data = data;
   n->next = nullptr;
   return n;
}

// Prints the contents of our linked list.
void printList(Node *head)
{
   // Remember that this local variable called 'head' is separate from the one
   // called 'head' down in main()! We can change the address stored in this local
   // copy without changing what 'head' points to back in main(). So, we could
   // have moved 'head' forward through this list instead of creating the helper
   // variable, 'current'.

   Node *current = head;

   while (current != nullptr)
   {
      cout << current->data;

      // If this isn't the last node, print an arrow, because there's more to come.
      if (current->next != nullptr)
      {
         cout << " -> ";
      }

      // Move our pointer forward in the list!
      current = current->next;
   }

   cout << endl;
}

// Insert a new node at the end of the linked list.
void tailInsert(Node *&head, char data)
{
   // If we have an empty list, just create a new node (which becomes our head)
   // and return back to main().
   if (head == nullptr)
   {
      head = createNode(data);
      return;
   }

   Node *current = head;

   // Loop forward until 'current' points to last node, which happens when its
   // 'next' pointer is nullptr (not when current itself is nullptr; if that
   // happens, we will have fallen off the end of our list).
   while (current->next != nullptr)
   {
      current = current->next;
   }

   // When we get here, 'current' points to the last node in the list. We set
   // its 'next' field equal to a new node to tack onto the end of the list.
   current->next = createNode(data);
}

int main()
{
   Node *head = nullptr;

   setRandomSeed(14104611);

   for (int i = 0; i < 5; i++)
   {
      // The range 65 through 122 corresponds to ASCII values for 'A' through 'Z'
      // and a few other 
      tailInsert(head, randomInteger(65, 122));
   }

   printList(head);

   // WARNING!
   // This has a memory leak! We aren't freeing the list before returning from main().

   return 0;
}

output:

u -> r -> m -> o -> m

What's next?

Tomorrow (Tuesday), we'll write a LOT more linked list code and explore a few variants of linked lists.

Exercises

1. From scratch, replicate all the code we wrote in class today! Challenging yourself to sit down and rewrite this code is one of the best ways to get really great at all that pointer manipulation and dynamic memory management.

2. Trace through the printList() function that we created in class today. Draw boxes for each of the local variables. See if you can convince yourself that this function isn't destroying our list (since we're doing current = ..., not *current = ...).

3. In printList(), get rid of the current variable altogether and simply use the local head variable to loop forward through the list. Again, draw diagrams of what's happening in memory to help convince yourself that changing the local copy of head in printList() will not destroy the list in any way.

4. Why did we need a pointer reference (Node *&head) for our tailInsert() function today, but for printList(), we only used Node *head? Also, be sure you understand why Node * was the return type for our createNode() function.

5. Write a function called void destroyList(Node *&head) that takes the head of a linked list and frees up all dynamically allocated memory associated with that list. Call it on the list above and ensure that it doesn't crash. Be sure to test your destroyer function on linked lists of sizes 1 and 2 as well, and make sure it doesn't crash if it's passed an empty list (a nullptr). Have your function set head to nullptr before returning so that when we get back to main() (or whatever other function might have called the destroyer function), the variable that once held the linked list pointer has been nullified, and we therefore can't accidentally use it to go back to a chunk of memory over which our program no longer has any claim.

6. Be sure to read the course textbook for more detail or alternative perspectives if you're finding this material really tricky.

7. Dive into the linked list programming assignment we release this Wednesday, where you'll get some really solid practice coding with linked lists. If you find that assignment too difficult at first, come back to this page and be sure to work through all the exercises above before moving on. That will likely be less frustrating in the long run if skipping straight to something that feels a bit too advanced at first.