Data Management
Lecture Notes for CS349W
Fall Quarter 2008
John Ousterhout
- Traditional GUI applications:
- Application-specific file format on disk
- Often binary (not a good idea)
- Recently, movement towards XML
- Application reads file into memory during startup
- Data kept entirely in memory while application runs
- Application-specific data structures provide fast access
- Data in memory is volatile: can be lost after a crash
- Making data durable:
- Explicit user "save" operation: regenerate entire disk file
- Auto-save at regular intervals
- Undo-redo log:
- Flush to disk after every operation
- Delete log during normal application exit
- Check for log during startup: presence of log means
application didn't exit normally
- Replay log to recover work lost during crash
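- A minimal sketch of this undo/redo-log scheme (Python; the log file name,
  record format, and apply_op hook are illustrative assumptions, not from
  the lecture):

      import json, os

      LOG_PATH = "app.oplog"            # hypothetical log file name

      def log_operation(op):
          """Append one operation to the log and flush it to disk."""
          with open(LOG_PATH, "a") as f:
              f.write(json.dumps(op) + "\n")
              f.flush()
              os.fsync(f.fileno())      # force the record onto disk

      def recover(apply_op):
          """On startup: a leftover log means the last run crashed; replay it."""
          if not os.path.exists(LOG_PATH):
              return                    # clean exit last time, nothing to do
          with open(LOG_PATH) as f:
              for line in f:
                  apply_op(json.loads(line))   # redo each logged operation

      def clean_exit():
          """On normal exit: regenerate the disk file, then delete the log."""
          # ... write the application's normal save file here ...
          if os.path.exists(LOG_PATH):
              os.remove(LOG_PATH)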
- Problems with this approach for Web applications:
- Designed for single-user applications
- Stateful (vs. stateless trend for Web applications)
- How does this scale over multiple servers?
- How does it handle concurrency from multiple users?
- Web applications today use relational databases for back-end storage:
- Automatically get ACID properties (atomicity, consistency, isolation,
  durability): data survives crashes, concurrent access by multiple
  users is handled safely, etc.
- Relational model is general enough to support virtually anything.
- Web servers remain stateless:
- Each request executes database queries to retrieve the data it needs.
- If a request makes updates, they are committed to the database
  by the end of the request.
- Easily scales to multiple servers, each communicating with the
database via a network.
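- A rough sketch of one such stateless request handler (Python, with the
  standard sqlite3 module standing in for the RDBMS; the table, columns,
  and file name are made up):

      import sqlite3

      def handle_request(user_id, new_email=None):
          """One Web request: open a connection, query, maybe update, commit.
          No state survives between requests; everything lives in the database."""
          conn = sqlite3.connect("app.db")        # hypothetical database
          try:
              row = conn.execute(
                  "SELECT name, email FROM users WHERE id = ?", (user_id,)
              ).fetchone()                        # fetch the data this request needs
              if new_email is not None:
                  conn.execute(
                      "UPDATE users SET email = ? WHERE id = ?",
                      (new_email, user_id))
              conn.commit()                       # updates are durable before we respond
              return row
          finally:
              conn.close()                        # server keeps no per-user state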
- Problems with RDBMSs: performance and scalability
- Performance:
- RDBMS may have to fetch data from disk for every request.
- Possible solution: caching in main memory
- By database itself
- By special caching service such as memcached
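- A sketch of the usual cache-aside pattern with memcached (assuming the
  python-memcached client; the key scheme and 30-second expiry are
  arbitrary choices):

      import memcache

      mc = memcache.Client(["127.0.0.1:11211"])   # memcached server address

      def get_user(user_id, db_lookup):
          """Check the cache first; fall back to the database on a miss."""
          key = "user:%d" % user_id               # hypothetical key scheme
          user = mc.get(key)
          if user is None:                        # cache miss
              user = db_lookup(user_id)           # expensive RDBMS query
              mc.set(key, user, time=30)          # cache the result for 30 seconds
          return user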
- Database storage structures and queries are very flexible and
general, but not optimized for any particular application or
algorithm:
- Pointer-based structures (lists, trees, etc.) seem
particularly difficult to implement efficiently
- RDBMSs grew up in a world of forms, where a response time of
  0.5-5 seconds was fine and complex operations could take much
  longer
- Highly interactive applications need response times in the 10's of
  milliseconds; is this beyond the capability of RDBMSs?
- ElectricCommander experience:
- Data retrieval time for a single request is almost always >= 250 ms.
- Complex requests take 1-30 seconds.
- Significant effort spent tuning queries whose response times were
  >> 10 seconds.
- Example: scrolling the jobs page.
- Scalability: as the number of front-end servers scales, the
throughput of the RDBMS must scale to match
- If the RDBMS runs on a single machine, it will eventually
become the bottleneck
- RDBMSs don't appear to scale across networked machines very
well: perhaps 5-10 machines, not 100-1000?
- The relational model tends to fragment data across multiple
rows and tables, making it hard to achieve locality
- Every server involved in every request?
- In practice, every large-scale Web application has scalability
problems with its database backend (eBay, Facebook, Amazon, etc.)
- Partition database manually into multiple instances (a sketch of this
  "sharding" follows the Facebook example below)
- Hard to manage
- Example: in early 2008 Facebook had 100,000 logical databases on
1800 MySQL instances to support 35000 page views/sec (doubling
every few months)
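- A bare-bones sketch of such manual partitioning ("sharding") in Python;
  the instance list and hash-on-user-id scheme are illustrative, with
  sqlite3 files standing in for MySQL servers:

      import sqlite3

      # One connection target per database instance.
      SHARDS = ["shard0.db", "shard1.db", "shard2.db", "shard3.db"]

      def shard_for(user_id):
          """Map an integer user id to one instance; all of that user's rows
          live there (real systems use a stable hash or a directory service)."""
          return SHARDS[user_id % len(SHARDS)]

      def load_profile(user_id):
          conn = sqlite3.connect(shard_for(user_id))
          try:
              return conn.execute(
                  "SELECT * FROM profiles WHERE user_id = ?", (user_id,)
              ).fetchone()
          finally:
              conn.close()

  The hard parts are what the sketch omits: queries that span shards,
  rebalancing when one shard fills up, and operating 1800 instances.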
- There seems to be general agreement that RDBMSs aren't working well
for large-scale Web applications, as well as talk that RDBMSs are
running out of steam overall.
- A technology that has passed its peak?
- If so, what will replace it?
- Alternatives beginning to appear:
- Amazon Dynamo: key-value store
- Google BigTable: distributed storage for structured data
- Some thoughts on the ideal storage system:
- Scalability:
- Based on commodity parts (e.g., 1RU rackmounts)
- Scale to 1000's of networked servers
- Should be main-memory based:
- Cost today: $100/GByte marginal cost (cheaper than disk 10 years ago)
- 1000 servers @ 32 GBytes each = 32 TBytes total:
- Example estimate: Amazon
- 400M orders/year? ($4B revenue per quarter / $40 average order size)
- 1000 bytes/order? (wild guess)
- 400 GBytes of order data per year
- Example estimate: United Airlines
- 200M passengers/year? (3,500 flights/day?, 150 passengers/flight?)
- 1000 bytes/passenger/flight? (wild guess)
- 200 GB passenger data/year
- Counter-example: Facebook photos
- Photo uploads/day: 24M
- Size of photo: 100K-1Mbyte?
- Total photo bytes/year: 1000-10000 Terabytes
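- The back-of-envelope arithmetic behind these estimates, made explicit
  (Python; all inputs are just the guesses above):

      # Amazon: order records are tiny.
      print(400e6 * 1000 / 1e9)            # 400M orders * 1 KB -> ~400 GB/year

      # United Airlines: passenger records are tiny.
      passengers = 3500 * 150 * 365        # flights/day * seats * days -> ~190M
      print(passengers * 1000 / 1e9)       # ~190 GB/year

      # Facebook photos: the counter-example.
      photos = 24e6 * 365                  # ~8.8B photos/year
      print(photos * 100e3 / 1e12)         # at 100 KB each -> ~900 TB/year
      print(photos * 1e6 / 1e12)           # at 1 MB each   -> ~9000 TB/year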
- Change in philosophy:
- Data "lives" in main memory
- Disk is for backup only, not for normal accesses.
Use undo/redo log for durability?
- Build in replication across servers (a sketch combining these ideas
  appears below, just before "Latency")
- Design system for delivering data at memory speeds
- Performance will be limited by server's basic network packet
processing costs
- Need some way to induce locality:
- Can't afford for every Web page to reference all 1000 data servers
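- A toy sketch combining the ideas above: data served from memory, every
  write appended to a local log for durability, and copied to backup
  servers (Python; replication is simulated with in-process objects
  instead of a real network, and all names are made up):

      import json, os

      class MemoryStore:
          """Serves reads from RAM; writes go to a log and to backups."""
          def __init__(self, log_path, backups=()):
              self.data = {}                   # the primary copy lives in memory
              self.log_path = log_path
              self.backups = backups           # other MemoryStore instances
              if os.path.exists(log_path):     # recover after a crash
                  with open(log_path) as f:
                      for line in f:
                          key, value = json.loads(line)
                          self.data[key] = value

          def get(self, key):
              return self.data.get(key)        # memory speed: no disk on reads

          def put(self, key, value):
              self.data[key] = value
              with open(self.log_path, "a") as f:   # disk is for durability only
                  f.write(json.dumps([key, value]) + "\n")
                  f.flush()
                  os.fsync(f.fileno())
              for b in self.backups:           # replicate so one server's loss is safe
                  b.put(key, value)

  In a real system the backups would live on other machines and a write
  would not complete until the replicas acknowledged it.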
- Latency:
- For widely available Web applications, latency is an issue regardless
  of back-end storage:
- Still have to communicate between browser and Web server.
- Cross-country round trip time is 60-70ms best case (speed of light).
- Can reduce latency with distributed data centers
- Techniques for hiding latency:
- Move functionality to browser
- Only works if all relevant data is on the browser
- Best for micro-interactions such as dragging and dropping
(lots of interaction, small amount of data)
- Prefetch data (e.g., Google Maps)
- Increases server load to reduce latency
- Works best if space of likely next data accesses is small
- In general, across all of Computer Systems, latency is a bigger problem
than bandwidth.