Raft Project 1 Review/Discussion (Winter 2023)

An accompanying .cc file contains the examples referenced below.

Class Design

Interesting opportunities for API design:

  • Communication
  • Persistence
  • Raft server state machine
  • Client-side communication
  • State machine (shell command execution)

Most common problems:

  • Specialization: API/implementation tailored to Raft in ways that limit its usage for other things
  • Too many classes (shallow)
  • Fuzzy division of responsibility
    • A class handles part of a problem, but not all of it.
    • Or, multiple implementations of the same thing (e.g., for clients and servers)

Classes for communication:

  • Most common approach:
    • Low-level networking primitives (shallow)
    • Higher-level class (deeper, but often specialized)
  • Connection topology alternatives:
    • Requester opens connections, responses sent back on the same connection as request.
    • Sender opens connections: all outgoing traffic (requests and responses) uses the sender's connection.
    • One connection between each pair of servers.
  • Threading architecture:
    • Must allow independent operation of each socket
      • If one socket blocks, this must not prevent communication to/from other sockets
    • Choice #1: single thread:
      • Simple and clean from synchronization standpoint
      • Use epoll or select to wait for any socket to become ready
      • But, must use nonblocking I/O:
        • Reads may return only part of a message (must save it until the rest arrives).
        • Writes may send only part of the message (must save the remainder to try again later).
    • Choice #2: separate thread(s) for each socket
      • Threads use blocking read/write operations
      • Introduces synchronization issues
      • If there are many connections, this becomes inefficient
      • Does the multi-threaded approach increase server throughput?
    • Observation: the sockets streaming API is awkward for RPCs
  • Unifying server-server and client-server communication:
    • What problems motivated the differences?
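
The partial-read problem under nonblocking I/O can be isolated in one small class. The sketch below is only an illustration: the class name MessageAssembler and the framing (a 4-byte big-endian length before each message) are assumptions, not something the projects were required to use. The same idea applies in reverse to partial writes, which are not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: accumulates bytes from nonblocking reads and
// hands back complete messages. Assumes each message is preceded by a
// 4-byte big-endian length (an assumed framing, not from the notes).
class MessageAssembler {
public:
    // Append whatever bytes a single read() call happened to return.
    void append(const char* data, std::size_t length) {
        buffer_.append(data, length);
    }

    // If a complete message is buffered, move it into `message` and
    // return true. Otherwise leave the partial bytes in place and
    // return false; the caller tries again after the next read.
    bool next(std::string& message) {
        const std::size_t kHeaderSize = 4;
        if (buffer_.size() < kHeaderSize) {
            return false;
        }
        std::uint32_t bodyLength = 0;
        for (std::size_t i = 0; i < kHeaderSize; i++) {
            bodyLength = (bodyLength << 8)
                    | static_cast<unsigned char>(buffer_[i]);
        }
        if (buffer_.size() - kHeaderSize < bodyLength) {
            return false;
        }
        message = buffer_.substr(kHeaderSize, bodyLength);
        buffer_.erase(0, kHeaderSize + bodyLength);
        return true;
    }

private:
    std::string buffer_;  // Bytes received but not yet returned.
};
```

Because the length comes from outside the process, a real server would also want to reject absurdly large values before buffering that much data.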

Persistence:

  • In most projects this was specialized for Raft:
    • No class: persistence implemented by Raft state machine
    • Separate class, but APIs reflect Raft details such as term and vote
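
One way to keep persistence general-purpose is an API that stores named strings and knows nothing about terms, votes, or logs. The sketch below is a possible shape, not a recommended implementation: PersistentStore, PersistenceError, and the one-file-per-key layout are all assumptions made for illustration, and it omits the fsync calls that real durability requires.

```cpp
#include <cstdio>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical exception type for failures in the store.
class PersistenceError : public std::runtime_error {
public:
    explicit PersistenceError(const std::string& message)
        : std::runtime_error(message) {}
};

// Hypothetical general-purpose store: maps string keys to string values.
// Each key is kept in its own file under an existing directory.
class PersistentStore {
public:
    explicit PersistentStore(const std::string& directory)
        : directory_(directory) {}

    // Replace the value for key. Writes a temporary file and renames it
    // so a crash never leaves a half-written value. A real version would
    // also fsync the file and the directory (omitted in this sketch).
    void put(const std::string& key, const std::string& value) {
        std::string path = directory_ + "/" + key;
        std::string tempPath = path + ".tmp";
        {
            std::ofstream out(tempPath, std::ios::binary | std::ios::trunc);
            out << value;
            out.flush();
            if (!out) {
                throw PersistenceError("couldn't write " + tempPath);
            }
        }
        if (std::rename(tempPath.c_str(), path.c_str()) != 0) {
            throw PersistenceError("couldn't rename " + tempPath);
        }
    }

    // Return true and fill in value if key exists, false otherwise.
    bool get(const std::string& key, std::string& value) const {
        std::ifstream in(directory_ + "/" + key, std::ios::binary);
        if (!in) {
            return false;
        }
        std::ostringstream contents;
        contents << in.rdbuf();
        value = contents.str();
        return true;
    }

private:
    std::string directory_;  // Directory holding one file per key.
};
```

With an API like this, the Raft server chooses its own keys (for example, one for the current term and one for the vote), so the Raft-specific knowledge stays in the caller.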

Raft state machine:

  • Collect all of this code into a single class
  • Very simple API:
    • Constructor
    • run method
  • Decomposition choice #1: separate code for each state:
    • One method or class per state
  • Decomposition choice #2: separate code for each message type
  • Threading alternatives:
    • Match network module (e.g. execute commands on per-connection threads)
    • Hybrid (many threads in networking, only one thread in Raft server)
  • Keep synchronization simple!

Raft client:

  • Must do two things:
    • Read commands from stdin (and write results to stdout)
    • Communicate with the Raft cluster
  • Most projects combined this in a single class or file
  • Better to separate them:
    • Reading commands from stdin is just one possible approach
    • For Project 2, design general-purpose class for clients to communicate with a Raft cluster

Application state machine:

  • Most projects hard-wired shell state machine into Raft server
  • This specializes the Raft server
  • For Project 2, create APIs that allow Raft server to support any state machine with string commands and string results.
  • Implement shell state machine as one specific instance (separate from Raft server classes)
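
A possible shape for the string-in, string-out API is sketched below. The names StateMachine, apply, and ShellStateMachine are hypothetical, and the shell version assumes a POSIX system (it uses popen, captures only standard output, and ignores the command's exit status).

```cpp
#include <stdio.h>
#include <stdexcept>
#include <string>

// Hypothetical interface the Raft server could depend on: a committed
// command goes in as a string and its result comes back as a string.
class StateMachine {
public:
    virtual ~StateMachine() = default;
    virtual std::string apply(const std::string& command) = 0;
};

// One specific instance: runs each command in a shell and returns
// whatever the command wrote to standard output.
class ShellStateMachine : public StateMachine {
public:
    std::string apply(const std::string& command) override {
        FILE* pipe = popen(command.c_str(), "r");
        if (pipe == nullptr) {
            throw std::runtime_error("popen failed for: " + command);
        }
        std::string output;
        char chunk[4096];
        size_t count;
        while ((count = fread(chunk, 1, sizeof(chunk), pipe)) > 0) {
            output.append(chunk, count);
        }
        if (pclose(pipe) == -1) {
            throw std::runtime_error("pclose failed for: " + command);
        }
        return output;
    }
};
```

The Raft server would hold only a StateMachine reference, so swapping in a different application means writing another subclass rather than editing the server.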

Exception Handling

Common problems:

  • Not enough error checks
  • Not enough info in log messages
  • Exceptions not handled in the best way

Must check results of every kernel call

In general, unsafe to assume anything about information coming from outside the process

  • Contents of files holding persistent data (e.g. std::stoi).
  • Messages
  • Command-line arguments
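
As an illustration of the std::stoi case: it throws on non-numeric or out-of-range text, and it silently accepts trailing characters ("12abc" parses as 12). A defensive wrapper might look like the sketch below; parseInt and BadInputError are hypothetical names, and the strictness chosen here (text with a trailing newline is rejected, leading whitespace is still accepted by std::stoi) is an assumption that callers may want to change.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical exception type for malformed external input.
class BadInputError : public std::runtime_error {
public:
    explicit BadInputError(const std::string& message)
        : std::runtime_error(message) {}
};

// Parse text as a decimal int, rejecting anything std::stoi would let
// slide. `source` names where the text came from (a file, a message
// field, an argument) and is used only in the error message.
int parseInt(const std::string& text, const std::string& source) {
    std::size_t consumed = 0;
    int value = 0;
    try {
        value = std::stoi(text, &consumed);
    } catch (const std::invalid_argument&) {
        throw BadInputError(source + ": not a number: \"" + text + "\"");
    } catch (const std::out_of_range&) {
        throw BadInputError(source + ": out of range: \"" + text + "\"");
    }
    if (consumed != text.size()) {
        throw BadInputError(source + ": trailing characters in \""
                + text + "\"");
    }
    return value;
}
```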

Logging is essential:

  • Log as often as you can possibly afford
  • Include as much information in the log message as possible
  • Log at the scene of the crime, where the most information is available (or, incorporate the info into an exception).
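
Checking a kernel call and capturing the details where they are known might look like the following sketch. SystemCallError and openForReading are hypothetical names, and the code assumes a POSIX system. The exception carries the path and the errno text, so whoever eventually logs it has the full picture.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <unistd.h>

// Hypothetical exception type: records which operation failed and why.
class SystemCallError : public std::runtime_error {
public:
    SystemCallError(const std::string& what, int err)
        : std::runtime_error(what + ": " + std::strerror(err)),
          errorNumber(err) {}

    int errorNumber;  // The errno value at the time of the failure.
};

// Open a file for reading; throws instead of returning -1 so the
// failure cannot be ignored by accident.
int openForReading(const std::string& path) {
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        // Save errno immediately: later library calls may overwrite it.
        int savedErrno = errno;
        throw SystemCallError("couldn't open " + path + " for reading",
                savedErrno);
    }
    return fd;
}
```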

What to do when an error occurs? First, think about how it is likely to be handled.

Don't exit in low-level methods

  • Limits generality
  • Bad for unit testing
  • Instead, throw exception

Define specific exception types: don't just use std::exception or std::runtime_error (consider likely usage)

All threads should have top-level exception handlers: catch, log, exit
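
A top-level handler can be written once and reused for every thread. The sketch below shows one way to do it; runLogged and startThread are hypothetical names, logging goes to stderr for simplicity, and calling std::exit from a non-main thread is a judgment call (some projects may prefer a different shutdown path).

```cpp
#include <cstdlib>
#include <exception>
#include <functional>
#include <iostream>
#include <string>
#include <thread>

// Run body under a catch-all handler. Returns true if body finished
// normally; on any exception, logs it to stderr and returns false.
bool runLogged(const std::string& threadName,
        const std::function<void()>& body) {
    try {
        body();
        return true;
    } catch (const std::exception& e) {
        std::cerr << threadName << ": uncaught exception: "
                << e.what() << std::endl;
    } catch (...) {
        std::cerr << threadName << ": uncaught non-standard exception"
                << std::endl;
    }
    return false;
}

// Start a thread whose top-level handler catches, logs, and exits.
std::thread startThread(const std::string& name,
        std::function<void()> body) {
    return std::thread([name, body] {
        if (!runLogged(name, body)) {
            std::exit(EXIT_FAILURE);
        }
    });
}
```

Keeping the catch-and-log step in its own function means it can be unit tested without killing the test process.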

Writing Obvious Code

Use closures judiciously (examples Closures1-2)

Spacing and indentation affect readability (examples Spacing1 and Spacing2)

Don't hide things that are important (example NonobviousPersistence)

Do the whole job in one place; don't split it (example Split)

Documentation

Most projects did a pretty good job, but several omitted top-level class comments.

  • Things to think about:
    • What is the abstraction?
    • What does each instance correspond to?

Thoughts for Project 2

Make classes general-purpose

Avoid specializations and restrictions.

  • Just because you know something doesn't mean you should use that information
  • Delay specialization: push it up to the highest layers of the application

Don't distribute the solution to a problem (information leakage); solve the whole problem in one place

Think about solving big problems, not little ones

  • Design top-down?