Raft Project 1 Review/Discussion (Winter 2019)
Lecture Notes for CS 190
Winter 2019
John Ousterhout
Deep Classes
- Small project, so not many opportunities for deep classes.
- Overall goal: make as much of the code as possible general-purpose
- Communication subsystem
- Most common problem: not encapsulating as much functionality as
possible, resulting in shallower classes and information leakage.
- Asynchronous I/O vs. multiple threads?
- Message-based, not streams; streams are awkward, messages make async
I/O easier
- No connections to worry about opening/closing
- TCP's reliability is neither necessary nor sufficient: Raft already
tolerates lost messages and must retry requests anyway
- Only problem with UDP: limited message length
- Messages: one struct for all, or separate structs for each type?
- Persistent state (small class)
- Most projects tied this class to Raft-specific state
- Exceptions: PersistentInt (but problematic for other reasons)
- I used PersistentString in my implementation
- State machine
- Separate code for each state
- Results in a lot of code duplication
- Awkward: state changes in the middle of processing a message.
- Separate code for each message type
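As an illustration of the persistent-state notes above, here is a minimal sketch of a general-purpose persistent string class. It is not the implementation referred to in these notes; the class name is reused only for illustration, and the file layout, the ".tmp" suffix, and the write-then-rename approach (which assumes POSIX rename semantics) are all assumptions.

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical general-purpose class: stores one string durably in a file.
// It knows nothing about Raft; the caller decides how to encode its state.
class PersistentString {
public:
    // Loads the current value from `path`, or starts empty if the file
    // does not exist yet.
    explicit PersistentString(const std::string& path) : path_(path) {
        std::ifstream in(path_, std::ios::binary);
        if (in) {
            value_.assign(std::istreambuf_iterator<char>(in),
                          std::istreambuf_iterator<char>());
        }
    }

    const std::string& get() const { return value_; }

    // Writes the new value to a temporary file, then renames it over the
    // old one, so a crash never leaves a half-written file (assumes POSIX
    // rename, which replaces an existing file). A production version would
    // also sync the file to disk before renaming; that is omitted here.
    void set(const std::string& value) {
        const std::string tmp = path_ + ".tmp";
        {
            std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
            out << value;
            out.flush();
            if (!out) {
                throw std::runtime_error("cannot write " + tmp);
            }
        }
        if (std::rename(tmp.c_str(), path_.c_str()) != 0) {
            throw std::runtime_error("cannot rename " + tmp + " to " + path_);
        }
        value_ = value;
    }

private:
    std::string path_;
    std::string value_;
};
```

Because the class stores an opaque string, the Raft code alone decides what goes in it (term, vote, and so on), and the class can be unit tested without any Raft state.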
Synchronization and Threads
- Several projects had a scary number of mutexes
- First, ask: what concurrency is needed? What are we trying to accomplish?
- Will the choice of synchronization mechanism have a big
impact on performance?
- What is the simplest synchronization mechanism that will meet needs?
- In general, use the coarsest-grain locking that will meet your
performance and other needs.
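A minimal sketch of coarse-grain locking: one mutex guards all of the server's state, and every public method takes it on entry. The class and method names are hypothetical, and the voting rule is deliberately simplified (it omits Raft's log up-to-date check) to keep the focus on the locking.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical server state guarded by a single coarse-grained mutex.
class ServerState {
public:
    // Returns true if the vote is granted. Simplified: a real Raft server
    // would also compare the candidate's log with its own.
    bool handleVoteRequest(std::uint64_t term, int candidateId) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (term > currentTerm_) {
            // Newer term: adopt it and forget any vote from the old term.
            currentTerm_ = term;
            votedFor_ = kNobody;
        }
        if (term < currentTerm_) {
            return false;
        }
        if (votedFor_ == kNobody || votedFor_ == candidateId) {
            votedFor_ = candidateId;
            return true;
        }
        return false;
    }

    std::uint64_t currentTerm() {
        std::lock_guard<std::mutex> guard(mutex_);
        return currentTerm_;
    }

private:
    static constexpr int kNobody = -1;

    std::mutex mutex_;               // Protects every field below.
    std::uint64_t currentTerm_ = 0;
    int votedFor_ = kNobody;
};
```

With one lock there is no lock ordering to reason about, and each handler sees a consistent view of the state. Finer-grained locks are worth considering only if measurements show this one is a bottleneck.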
Timers
- Two ways to implement timers
- Separate timer mechanism in its own thread(s)
- Timeouts on I/O operations
- Starting and stopping is awkward
- Can use condition variables with timeouts
- Separate timers for elections and heartbeats can also be awkward
- Must keep track of which is running
- Awkward interfaces: relative vs. absolute time
- My opinion: cleanest to separate the timers from the messaging interface.
- Can't use second-granularity timers: Raft timeouts are measured in
milliseconds
Error Handling
- Common problems:
- Not enough error checks
- Not handled in the best way
- Not enough info in log messages
- In general, unsafe to assume anything about information coming
from outside the process
- Contents of files holding persistent data (e.g. std::stoi throws if
the text is not a valid integer).
- Message formats
- Must check results of every kernel call
- What to do when an error occurs? First, think about how it is likely to
be handled. See Example E1.
- Don't exit in low-level methods
- Limits generality
- Bad for unit testing
- Instead, throw exception
- Define specific exception types: don't just use std::exception
(consider likely usage)
- All threads should have top-level exception handlers: catch, log, exit
- Logging is essential:
- Log as often as you can possibly afford
- Include as much information in the log message as possible
(Example E2)
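The points above about untrusted input, specific exception types, and top-level handlers can be sketched as follows. This is an illustration under assumptions, not one of the numbered examples: CorruptStateError, parseStoredInt, and runThreadBody are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical exception type for unusable persistent data. A specific type
// lets callers decide how to react (e.g. refuse to start the server).
class CorruptStateError : public std::runtime_error {
public:
    explicit CorruptStateError(const std::string& message)
        : std::runtime_error(message) {}
};

// Parses the contents of a persistent file as a non-negative integer.
// Throws instead of exiting, so the caller (or a unit test) stays in control.
int parseStoredInt(const std::string& fileName, const std::string& contents) {
    // Put as much information as possible into the message.
    const std::string context =
        "file '" + fileName + "' contains '" + contents + "'";
    std::size_t used = 0;
    int value = 0;
    try {
        value = std::stoi(contents, &used);
    } catch (const std::logic_error&) {
        // std::stoi throws std::invalid_argument or std::out_of_range;
        // both derive from std::logic_error.
        throw CorruptStateError(context + ": not a valid integer");
    }
    // Tolerate trailing whitespace such as a final newline, nothing else.
    while (used < contents.size() &&
           std::isspace(static_cast<unsigned char>(contents[used]))) {
        ++used;
    }
    if (used != contents.size()) {
        throw CorruptStateError(context + ": unexpected trailing characters");
    }
    if (value < 0) {
        throw CorruptStateError(context + ": value must not be negative");
    }
    return value;
}

// Top-level wrapper for a thread's main function: catch, log, exit.
template <typename Body>
void runThreadBody(const char* threadName, Body body) {
    try {
        body();
    } catch (const std::exception& e) {
        std::cerr << threadName << ": fatal error: " << e.what() << std::endl;
        std::exit(1);
    } catch (...) {
        std::cerr << threadName << ": fatal error: unknown exception"
                  << std::endl;
        std::exit(1);
    }
}
```

Only the top-level wrapper exits the process. The low-level parser just reports what went wrong, which keeps it reusable and easy to unit test.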
Efficient Code
- It's easy to end up with code that's 5x slower than need be
- Suggestion:
- Learn what things are fast and slow.
- Look for code that's simple but uses approaches that are inherently fast
- Examples of things that are slow:
- Object copying (especially messages, which can be large)
- Rewriting persistent storage on every message
- Opening a new socket for each message
- Storage allocation (e.g. for buffer copies)
- Unordered map where a vector indexed by position would do
- String parsing and formatting
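As one illustration of avoiding object copies, the sketch below moves a message into an outgoing queue instead of copying it. The type names (LogEntry, AppendRequest, OutgoingQueue) are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

struct LogEntry {
    std::uint64_t term;
    std::string command;
};

// Hypothetical message type; the entries vector can be large.
struct AppendRequest {
    std::uint64_t term = 0;
    std::vector<LogEntry> entries;
};

// Hypothetical queue of outgoing messages.
class OutgoingQueue {
public:
    // Takes ownership of the request. Moving transfers the vector's buffer
    // instead of copying every entry.
    void push(AppendRequest&& request) {
        queue_.push_back(std::move(request));
    }

    // Returns a reference, so inspecting the next message copies nothing.
    const AppendRequest& front() const { return queue_.front(); }

    std::size_t size() const { return queue_.size(); }

private:
    std::deque<AppendRequest> queue_;
};
```

A caller writes queue.push(std::move(request)) and must not use request afterwards. Requiring an rvalue makes an accidental copy a compile error rather than a silent cost.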
Writing Obvious Code
- C++ constructs to avoid:
- std::pair, std::tuple
- auto
- Use closures judiciously (examples O1-3)
- Iterators are awkward (example O4)
- C++ I/O APIs are clunky (example O5)
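To illustrate the advice about std::pair and auto, here is a small sketch that returns a named struct with explicit types. It is not one of the numbered examples; LogPosition and lastLogPosition are hypothetical names, and the log is represented only by the terms of its entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A named struct documents itself at every use: position.term is obvious,
// whereas position.second (from a std::pair) is not.
struct LogPosition {
    std::size_t index;    // 1-based index of the entry; 0 means empty log.
    std::uint64_t term;   // Term of that entry; 0 if the log is empty.
};

// Returns the position of the last entry, given the term of each entry.
LogPosition lastLogPosition(const std::vector<std::uint64_t>& entryTerms) {
    if (entryTerms.empty()) {
        return LogPosition{0, 0};
    }
    return LogPosition{entryTerms.size(), entryTerms.back()};
}
```

At the call site, writing LogPosition last = lastLogPosition(terms); rather than auto tells the reader the type without a trip to the declaration.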
Miscellaneous
- Interface comments:
- In header file vs. code file
- Private vs. public methods
- Multiple headers (interface vs. implementation?)
- Documentation: see examples D*
- Bad variable names: see examples V*