Raft Project 2 Review/Discussion (Winter 2019)

Lecture Notes for CS 190
Winter 2019
John Ousterhout

  • The examples referenced below (MUTEX, D*, V*, H*, M*) are in an accompanying .cc file.
  • Lines of code: 1935, 1955, 2062, 2211, 2840, 3480, 3529, 3870, 4001
  • Everyone made improvements on Project 1.
    • Error detection/logging
    • Communication and persistence improved in many projects
  • But new functionality introduced new complexity

Communication

  • The server-to-server communication model from Project 1 didn't support client communication cleanly.
  • Everyone tried to shoehorn client communication into their existing model.
  • Why didn't client communication just work out of the box?
  • How would you design a single (wart-free) mechanism for both client-server and server-server communication?

Together vs. Apart

  • Given various pieces of functionality, which belong together in the same class and which should be separated in different classes?
  • Example: Raft server and state machine for shell?
  • Example: Raft server also manages communication with clients?
  • Example: log class also manages other persistent state such as term and vote?
  • Example: persistent storage class checks for log consistency before accepting new log entries?
    bool replicate_log_entries_if_applicable(uint32_t prev_index,
        uint32_t prev_term, std::vector<LogEntry> log_entries);

Other Design Issues

  • Memory management, especially for network buffers
    • Copy objects around (safe, but expensive)
    • Pass pointer to dynamically allocated object: hard to prove safety.
    • Use std::shared_ptr or std::unique_ptr
  • Too many mutexes (see Example MUTEX)
  • Error recovery
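
To make the std::unique_ptr option above concrete, here is a minimal sketch of passing network buffers by ownership transfer rather than by copy or raw pointer. The Buffer and SendQueue names are hypothetical and do not come from any student project; the point is only that the type system records who owns a buffer at each step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical network buffer (illustration only).
struct Buffer {
    std::vector<uint8_t> bytes;
};

// Hypothetical outgoing-message queue that owns every buffer it holds.
class SendQueue {
public:
    // Takes ownership of buf; the caller's pointer is null afterwards,
    // so the caller cannot accidentally reuse or free the buffer.
    void enqueue(std::unique_ptr<Buffer> buf) {
        assert(buf != nullptr);
        pending_.push(std::move(buf));
    }

    // Hands ownership of the oldest buffer to the caller, or returns
    // nullptr if the queue is empty.
    std::unique_ptr<Buffer> dequeue() {
        if (pending_.empty()) return nullptr;
        std::unique_ptr<Buffer> buf = std::move(pending_.front());
        pending_.pop();
        return buf;
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::queue<std::unique_ptr<Buffer>> pending_;
};
```

A buffer is freed automatically when the last unique_ptr holding it goes out of scope, so no code path needs an explicit delete. std::shared_ptr would be the alternative if several threads genuinely need to hold the same buffer at once.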

Raft Protocol Issues

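  • Not safe to delete log entries, then overwrite with the same entries.
    • A follower should truncate its log only when an existing entry conflicts with a new one (same index, different term).
    • Unconditional truncation lets a delayed or reordered AppendEntries request discard entries the follower has already acknowledged.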
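
One way to avoid the delete-then-overwrite problem is to truncate only at the first entry whose term actually differs, and to leave matching entries untouched. The sketch below assumes an in-memory log with a sentinel entry at index 0; the appendEntries function and LogEntry layout are illustrative assumptions, not code from any project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical log entry layout (illustration only).
struct LogEntry {
    uint64_t term;
    std::string command;
};

// Follower-side handling of new entries. log[0] is assumed to be a
// sentinel with term 0, so Raft's 1-based indices map directly onto
// vector positions. Returns false if the consistency check fails.
bool appendEntries(std::vector<LogEntry>& log, std::size_t prevIndex,
                   uint64_t prevTerm, const std::vector<LogEntry>& entries) {
    assert(!log.empty());  // the sentinel must be present

    // Consistency check: the entry preceding the new ones must match.
    if (prevIndex >= log.size() || log[prevIndex].term != prevTerm) {
        return false;
    }

    std::size_t index = prevIndex + 1;
    for (const LogEntry& entry : entries) {
        if (index < log.size() && log[index].term != entry.term) {
            // Real conflict: drop this entry and everything after it.
            log.erase(log.begin() + static_cast<std::ptrdiff_t>(index),
                      log.end());
        }
        if (index >= log.size()) {
            log.push_back(entry);
        }
        // Otherwise the entry is already present; leave it alone.
        ++index;
    }
    return true;
}
```

With this structure, a stale request that carries only entries the follower already has changes nothing, so later entries that the follower has acknowledged survive.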

My Design Proposal for Raft Server

  • Raft server: one thread/peer
    • Responsible for heartbeats, log replication to that peer
    • Also responsible for requesting votes
    • Can operate synchronously (send, wait for response, etc.) without interfering with other peers
    • Sleep when no work to do
  • Additional threads for incoming connections
  • RpcService class
  • RpcClient class
  • Persistence: several of the student solutions would work fine.
  • Managing clients:
    • Separate instance of RpcService
    • One thread per incoming connection
    • Call RaftServer::newCommand(command)
      • Doesn't return until command has been applied or server is no longer leader
      • Inside Raft server, add command to log, then wait on CV until command committed, then apply it, return result.
  • State machine:
    • Provides applyCommand(string *command) API
    • When leader, client threads call applyCommand
    • Raft server contains an additional thread that applies entries when the server isn't leader.
  • Disadvantage of this approach: lots of threads
    • Simple, but probably impractical for a production server
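
The client path in the proposal above (add command to log, wait on a CV until it is committed, apply it, return the result) might look roughly like the following sketch. Only newCommand and applyCommand are named in the notes; advanceCommit, stepDown, logSize, the member names, and the echoing state machine are assumptions made for illustration. Replication, persistence, and the extra thread that applies entries on non-leaders are omitted.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical state machine: echoes the command instead of running it.
class StateMachine {
public:
    std::string applyCommand(std::string* command) {
        return "applied: " + *command;
    }
};

class RaftServer {
public:
    explicit RaftServer(bool isLeader) : leader_(isLeader) {}

    // Called by a client thread. Returns true and fills *result once the
    // command has been committed and applied; returns false if this
    // server is not leader or loses leadership while waiting.
    bool newCommand(std::string command, std::string* result) {
        assert(result != nullptr);
        std::unique_lock<std::mutex> lock(mutex_);
        if (!leader_) return false;
        log_.push_back(command);
        const std::size_t index = log_.size();  // 1-based log index
        // Sleep until it is this entry's turn to be applied (committed,
        // and all earlier entries applied) or leadership is lost.
        changed_.wait(lock, [&] {
            return !leader_ ||
                   (commitIndex_ >= index && lastApplied_ + 1 == index);
        });
        if (!leader_) return false;
        *result = stateMachine_.applyCommand(&command);
        lastApplied_ = index;
        changed_.notify_all();  // the next waiting client may now apply
        return true;
    }

    // Hypothetical hook: a per-peer thread would call this once a
    // majority of servers store all entries up to newCommitIndex.
    void advanceCommit(std::size_t newCommitIndex) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (newCommitIndex > commitIndex_) commitIndex_ = newCommitIndex;
        changed_.notify_all();
    }

    // Hypothetical hook: called when a higher term is observed.
    void stepDown() {
        std::lock_guard<std::mutex> lock(mutex_);
        leader_ = false;
        changed_.notify_all();
    }

    std::size_t logSize() {
        std::lock_guard<std::mutex> lock(mutex_);
        return log_.size();
    }

private:
    std::mutex mutex_;                 // one mutex guards all state below
    std::condition_variable changed_;  // signaled on any state change
    bool leader_;
    std::vector<std::string> log_;
    std::size_t commitIndex_ = 0;
    std::size_t lastApplied_ = 0;
    StateMachine stateMachine_;
};

// Usage: one client thread blocks in newCommand until the entry commits.
std::string runSingleCommandDemo() {
    RaftServer server(true);
    std::string result;
    std::thread client([&] { server.newCommand("echo hi", &result); });
    while (server.logSize() < 1) std::this_thread::yield();
    server.advanceCommit(1);  // stands in for the per-peer threads
    client.join();
    return result;
}
```

The sketch uses a single mutex and a single condition variable for the whole server, which keeps the waiting logic easy to reason about at the cost of some unnecessary wakeups.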

Miscellaneous (if time)

  • Interface comments:
    • In header file vs. code file
    • Private vs. public methods
    • Multiple headers (interface vs. implementation?)
  • Documentation: see examples D*
  • Bad variable names: see examples V*
  • Hard-to-read code: see examples H*
  • Odds and ends: see examples M*