More Linked Lists

Tuesday, July 29

In today's lecture we will continue to explore linked lists and will examine a number of useful "advanced" linked list operations that will round out our linked list toolkit.

📚 Readings: Text 14.2 - 14.4
📝 Lecture quiz on Canvas
🎬 Lecture video on Canvas [toggle visibility]
📎 FOR_NEXT_TIME.md

Contents

1. Updated Node Struct (with Constructor)

2. Code for Head Insertion and Deletion

3. Tail Pointers

4. Maintaining a Tail Pointer: Runtime Implications

5. Doubly-Linked Lists

6. Implementing Stacks and Queues with Linked Lists

7. Code for Tail Insertion with a Tail Pointer

8. Code for Head Insertion and Deletion While Maintaining a Tail Pointer

9. Implementation of Linked List-Based Queue: LLQueue

10. Reminder: Member Functions with const

11. Trade-offs: Arrays vs. Linked Lists

12. What's next?

13. Exam Prep

Updated Node Struct (with Constructor)

Today, we continued writing linked list functions and discussing their general capabilities, a few variations and enhancements (including doubly linked lists and linked lists where we maintain tail pointers), and some of their advantages and disadvantages relative to arrays.

We started by modifying our Node struct from last time to contain a constructor function, rather than outsourcing node creation to a free-standing createNode() function. We saw the this keyword make a resurgence. Recall that in a class (and also in a struct, as we now know), this is a pointer to the class/struct variable upon which we called the given function, and we can use it to directly access member variables in that class/struct. This is super helpful if we have a local variable in a function that shares a name with one of our struct variables, as in the following example:

struct Node
{
   int data;
   Node *next;

   Node(int data)
   {
      this->data = data;
      next = nullptr;
   }
};

Code for Head Insertion and Deletion

We then wrote the following headInsert() and headDelete() functions. See today's lecture video (starting at timestamp 24:15) for an explanation of how those functions work. They're short, but they're doing some sophisticated things with pointer manipulation, and we traced fairly carefully through some of the details here.

SUPER CRITICAL: In these functions, we are once again passing pointers by reference. This concept is super critical for you to understand as you work with linked lists. For a more detailed explanation, see the section of notes titled "Preliminary Note: Passing Pointers by Reference 🤯" from our first lecture on linked lists this past Monday. Do not let this concept pass by without taking the time to unpack it and asking questions you might have about it in LaIR or OH.

void headInsert(Node *&head, int data)
{
   Node *newHead = new Node(data);
   newHead->next = head;
   head = newHead;
}

int offWithItsHead(Node *&head)
{
   if (head == nullptr)
   {
      error("Atempting to remove from empty list in offWithItsHead()!");
   }

   // Hold onto this value so we can return it.
   int retval = head->data;

   // Hold onto the old head's address so we can delete it after moving the head
   // pointer forward.
   Node *oldHead = head;

   // Move head pointer to next node in list. If there's no other node, this falls
   // off the end of the list and becomes nullptr, which is exactly what we want.
   head = head->next;

   // Now delete the old head node and return the value removed from the list.
   delete oldHead;
   return retval;
}

Tail Pointers

We then talked about runtimes for inserting elements at the head and tail of a linked list.

Head insertion is an O(1) operation. Regardless of how long our linked list is, adding a new element to the beginning of our list is simply a matter of creating a new node, setting its next pointer to the old head of our linked list, and setting the head to point to our newly created node. All of that can be done in O(1) time.

The tail insertion function we saw last time had a linear runtime, but today we explored an idea for speeding that up: maintaining a pointer not just to the head of a linked list, but also to the tail! If we do that, we can achieve O(1) insertion at the tail of our list: we just set the tail's next pointer to a new node, then move the tail pointer forward. Voila! This is a ✨dramatic✨ improvement in runtime, and it only costs us two things: (1) the memory associated with maintaining a tail pointer (typically just 8 bytes in C++) and (2) the complexity and effort associated with writing code that correctly keeps track of the tail pointer.

Maintaining a tail pointer actually even makes our head insertion function more complex. That's because if we pass an empty list to our head insertion function, the new node we create will not only be the first node in the list, but also the last. If we have just one node in our list, it's both the head and the tail -- and so we need our head insertion function to be able to update both our head and tail pointers. (The code for this is included in the notes below, but before we got to coding those functions up in class, we continued with our conceptual exploration of these topics.)

Maintaining a Tail Pointer: Runtime Implications

We then reviewed the runtimes associated with insertion and deletion at both the head and tail of a linked list.

Note that if someone asked what the runtime is for insertion at the tail of a linked list, a precise answer would be to say, "It depends. If we maintain a tail pointer, we can do that in O(1) time. If not, it takes O(n) time."

Note that deletion at the tail of a linked list is O(n), even if we maintain a pointer to the tail of the list. With a tail pointer, we could certainly delete the dynamically allocated tail node in O(1) time, but we would then have to move the tail pointer back by one node in order to keep that pointer current -- and we can't do that in a traditional linked list. We have to start at the beginning of the list and loop forward until we find that next-to-last node, which will take O(n) time.

A common proposal I hear to get around that is to simply maintain a pointer to the next-to-last node, as well. That way, if we delete the tail, we can just set the tail pointer equal to our next-to-last pointer. The problem with this proposal is that the next-to-last pointer needs to move back, as well -- and unless we have a whole series of pointers that give us direct access to each node as we move backward through the list, this just isn't going to work out for us.

To summarize:

	insertion	deletion
head	O(1)	O(1)
tail	O(1) if tail pointer O(n) otherwise	O(1) if tail pointer and doubly-linked O(n) otherwise

Doubly-Linked Lists

One way to implement an efficient tail deletion (which requires not only maintaining a tail pointer, but also being able to efficiently move that tail pointer back by one node in the list) is to maintain what we call a "doubly-linked list:" that is, a list in which every node has not only a next pointer, but a previous pointer, as well. That looks something like this:

seansz ~/Desktop $ listmaker 87 93 12 --doubly-linked

 addr:    0xf9800          0xf4d33          0xc625e
        +---------+      +---------+      +---------+
 data:  |   87    |      |   93    |      |   12    |
        +---------+ ---> +---------+ ---> +---------+
 next:  | 0xf4d33 |      | 0xc625e |      | nullptr |
        +---------+ <--- +---------+ <--- +---------+
 prev:  | nullptr |      | 0xf9800 |      | 0xf4d33 |
        +---------+      +---------+      +---------+
           ^
           head = 0xf9800

seansz ~/Desktop $ _

The upside to doubly linked lists is that we can now achieve O(1) insertion and deletion at the tail of a linked list (provided we have a tail pointer). The downsides are that (1) every node now has an extra pointer (typically 8 bytes in C++), which means each node is up to 20 bytes of memory instead of just 12 (assuming each node also holds a 4-byte integer) -- a 67% increase in space complexity; and (2) all of our functions for inserting, moving, and deleting nodes in our linked lists become appreciably more complex if we have to maintain previous pointers.

(Key take-away!) This is part of a classical trade-off we see frequently in computer science: by using more memory, we can sometimes dramatically improve the runtimes of certain operations.

Implementing Stacks and Queues with Linked Lists

We then talked about how we would use linked lists to implement stacks and queues, with a focus on where we would perform our push and pop operations for the stack and our enqueue and dequeue operations for the queue.

We saw that to implement a stack using a linked list, we would need to push and pop at the same end of our linked list. We can implement those operations efficiently if we always push and pop at the head of our list. As we discussed above, those are O(1) operations. (Popping at the tail would be an O(n) operation even with a tail pointer -- unless we grappled with the added complexity and memory usage of doubly-linked lists.)

To implement a queue using a linked list, we would need to enqueue and dequeue at opposite ends of our linked list. We want to avoid tail deletion for the reasons given above, so we would dequeue at the head (an O(1) operation) and enqueue at the tail (also an O(1) operation, provided that we maintain a tail pointer).

Code for Tail Insertion with a Tail Pointer

In terms of code, we used the idea of a tail pointer to implement a super fancy and fantastic tail insertion function today that had an O(1) runtime, rather than the O(n) approach we saw on Monday. We also switched to using a constructor function without our Node struct today, rather than having a separate createNode() function.

void tailInsert(Node*& head, Node*& tail, int val)
{
   // If tail is nullptr, that means the list is currently empty, and the new node
   // we're creating will be the first node in our list. That means that it is both
   // the head and tail of the list, so we set both of those pointers here.
   if (tail == nullptr)
   {
      head = tail = new Node(val);
      return;
   }

   // Otherwise, if we already have a tail node, simply tack a new node on the end
   // of the list right after it, and then move the tail pointer forward to point
   // to that new node. Tada!
   tail->next = new Node(val);
   tail = tail->next;
}

Code for Head Insertion and Deletion While Maintaining a Tail Pointer

For your reference, head insertion and deletion functions that maintain a tail pointer are given below.

#include <iostream>
#include "console.h"
#include "error.h"
using namespace std;

struct Node
{
   int data;
   Node *next;  // a pointer to a Node -- the next node in our list

   Node(int d)
   {
      data = d;
      next = nullptr;
   }
};

// Insert a new node at the head of the list.
void headInsert(Node *&head, Node *&tail, int data)
{
   Node *newNode = new Node(data);
   newNode->next = head;
   head = newNode;

   // If tail is null and we have been keeping both our head and tail pointers up
   // to date, that means the node we just inserted is the only node currently in
   // the list -- making it both the head and the tail. We update the tail pointer
   // accordingly.
   if (tail == nullptr)
   {
      tail = head;
   }
}

// Removes head of linked list and returns the value it contained.
int offWithItsHead(Node *&head, Node *&tail)
{
   if (head == nullptr)
   {
      error("Empty list in offWithItsHead().");
   }

   // Let's hold onto our return value before deleting the head.
   // "retval" stands for "return value."
   int retval = head->data;

   // Let's hold onto the head node we'll be deleting.
   Node *temp = head;

   // Move our head pointer forward.
   head = head->next;

   // If moving our head pointer forward caused it to fall off the end of the list,
   // that means the list is now empty. We need to update the tail pointer in this
   // case to reflect that.
   if (head == nullptr)
   {
      tail = nullptr;
   }

   // BYE.
   delete temp;

   return retval;
}

int main()
{
   // I cannot stress enough how important it is to initialize these both to nullptr.
   Node *head = nullptr;
   Node *tail = nullptr;

   headInsert(head, tail, 10);
   headInsert(head, tail, 12);
   headInsert(head, tail, 15);

   while (head != nullptr)
   {
      cout << offWithItsHead(head, tail) << endl;
   }

   cout << endl << "Pointers (should be 0 for nullptr):" << endl;
   cout << head << endl;
   cout << tail << endl;

   return 0;
}

output:

15
12
10

Pointers (should be 0 for nullptr):
0
0

Implementation of Linked List-Based Queue: LLQueue

At the end of class, we implemented a queue using linked lists. As discussed above, the efficient implementation is to enqueue at the tail of the list (with a tail pointer) and to dequeue at the head. Our code was as follows (sparsely commented, since we discussed this as we built it up from scratch in lecture today):

node.h:

#ifndef NODE_H
#define NODE_H

struct Node
{
   int data;
   Node *next;

   Node(int d)
   {
      data = d;
      next = nullptr;
   }
};

#endif // NODE_H

llqueue.h:

#ifndef LLQUEUE_H
#define LLQUEUE_H

#include "node.h"  // for Node struct used in private fields

class LLQueue
{
public:
   LLQueue();
   ~LLQueue();
   void enqueue(int data);
   int dequeue();
   int peek() const;
   int size() const;
   bool isEmpty() const;

private:
   Node *_head;
   Node *_tail;
   int _size;
};

#endif // LLQUEUE_H

llqueue.cpp

#include "llqueue.h"
#include "node.h"
#include "error.h"
using namespace std;

LLQueue::LLQueue()
{
   _head = nullptr;
   _tail = nullptr;
   _size = 0;
}

LLQueue::~LLQueue()
{
   // WARNING!
   // This has a memory leak! We aren't freeing the list in our destructor function.
}

void LLQueue::enqueue(int data)
{
   _size++;

   if (_tail == nullptr)
   {
      _tail = _head = new Node(data);
      return;
   }

   _tail->next = new Node(data);
   _tail = _tail->next;
}

int LLQueue::dequeue()
{
   if (_head == nullptr)
   {
      error("Empty list in dequeue()!");
   }

   int retval = _head->data;
   Node *temp = _head;
   _head = _head->next;
   if (_head == nullptr)
   {
      _tail = nullptr;
   }
   delete temp;
   --_size;
   return retval;
}

int LLQueue::peek() const
{
   if (_head == nullptr)
   {
      error("Empty list in peek()!");
   }

   return _head->data;
}

int LLQueue::size() const
{
   return _size;
}

bool LLQueue::isEmpty() const
{
   return _head == nullptr;
}

main.cpp:

#include <iostream>
#include "console.h"
#include "llqueue.h"
using namespace std;

int main()
{
   LLQueue q;

   q.enqueue(10);
   q.enqueue(15);
   q.enqueue(20);
   q.enqueue(12);

   while (!q.isEmpty())
   {
      cout << q.dequeue() << " (new size: " << q.size() << ")" << endl;
   }

   q.enqueue(11);
   q.enqueue(16);
   q.enqueue(21);

   cout << endl;
   while (!q.isEmpty())
   {
      cout << q.dequeue() << " (new size: " << q.size() << ")" << endl;
   }

   return 0;
}

output:

10 (new size: 3)
15 (new size: 2)
20 (new size: 1)
12 (new size: 0)

11 (new size: 2)
16 (new size: 1)
21 (new size: 0)

Reminder: Member Functions with const

This has come up in both lecture and section already, but as a reminder: If we have a member function in some class that we don't want to be able to modify any of our member variables, we can put const at the end of its declaration in our .h file, as with the peek(), size(), and isEmpty() functions above. If we try to modify one of our member variables inside one of those functions, our compiler will actually give us an error now. This is are really neat way to protect ourselves from accidentally modifying a member variable in a function where we're not supposed to do that.

Trade-offs: Arrays vs. Linked Lists

(Not (fully) mentioned in class.) Now that we have fast head and tail insertion, a new tradeoff to consider between arrays and linked lists is this: If we want to implement a stack or queue using an array, most of our insertion operations will be O(1), but every now and then, we might trigger an O(n) expansion (create new, larger array; copy elements from old array into new array; delete old array). If we use a linked list, all our insertion operations, whether they take place at the head or tail of the list, will be O(1). Those O(1) operations are a bit slower than the O(1) array ops because we have to dynamically create a new node, set multiple fields, and update either a head or tail pointer. However, this might be a cost we're willing to pay if we're writing mission critical software where we need to guarantee certain operations will not exceed some small time limit, and so the possibility of an O(n) insertion into some data structure is simply out of the question.

Of course, all the tradeoffs we talked about last time are still in play. Here are some of those tradeoffs again, for your convenience:

Our linked list nodes take up more memory than array cells.
Accessing the k^th element in an array is an O(1) operation. In a linked list, the cost is O(k). Accordingly, linked lists aren't ideal for performing binary search.
Arrays might end up being larger than necessary (wasting space) or smaller (requiring costly expansion operations to accommodate new elements).
Arrays require contiguous blocks of memory. Linked lists can thrive in situations where memory has become fragmented (provided we haven't run out of memory completely).

What's next?

Tomorrow (Wednesday), we'll dive into a new linked data structure: binary trees!

Exam Prep

1. As always, after reviewing today's lecture materials, challenge yourself to reproduce all of today's code from scratch without referring back to the notes for assistance. Be sure to implement the head insertion and deletion functions from today's notes (both with and without maintaining a tail pointer), even though you've seen the code for those already. The key to getting good at linked lists -- and working with pointers in general -- is to engage in lots of coding practice.

Also, if you're uncertain about how any of the functions from these linked list lectures work, draw plenty of diagrams (with memory addresses!) to help trace through what's going on, and come to office hours or LaIR to discuss any lingering questions. Coding with linked lists can be tricky, and we want to help you get a really strong grasp on them!

2. What's wrong with the following approach to head insertion?

void headInsert(Node *head, int data)
{
   Node *oldHead = head;
   head = new Node(data);
   head->next = oldHead;
}

Highlight for solution to Exercise #2: Although the algorithm here is slightly different from the one used in class today (our auxiliary pointer is used to keep track of the old head rather than the new head node), that's not at all problematic. The problem here is unrelated to that: we are not passing the head pointer by reference, so any changes we make to head are local and do not reach main().

3. What all is wrong with the following approach to head deletion?

void headDelete(Node *&head)
{
   int retval = head->data;
   delete head;
   head = head->next;
   return retval;
}

Highlight for solution to Exercise #3: There are two problems here: (1) We do not check whether head is nullptr before dereferencing it. If this function receives an empty list, it will crash with a segmentation fault. (2) We are attempting to access head->next after deleting head. It's dangerous to dereference the head pointer (using the -> operator) after it has been deleted. See solution in today's notes for a workaround.

4. Write a tailDelete() function that maintains both a head and tail pointer. The function signature is:

int tailDelete(Node *&head, Node *&tail)

5. Write head and tail insertion and deletion functions that operate on doubly linked lists. Maintain both head and tail pointers.

6. Write an O(n) destructor function for our LLQueue class that clears up all the dynamically allocated nodes remaining in the queue. This can be done iteratively or recursively. Try to get it working both ways!

7. Continue working on your current assignment, and be on the lookout for a linked list assignment to be released this Wednesday.

8. Friendly reminder: Your final exam will involve writing some fairly complex code on paper. Don't forget to detach from the Qt Creator every now and then and do a bit of coding and debugging practice on paper so that you're ultra-prepared for that when you get into the exam. It's not too early to start preparing.