Large-Scale Web Applications
Lecture Notes for CS 142
Winter 2014
John Ousterhout
- Additional reading for this topic: none.
- Scale of Web applications: 1000x anything previously built.
- Use load balancing to distribute incoming HTTP requests
across many front-end servers:
- DNS (Domain Name System) load balancing:
- Specify multiple targets for a given name
- DNS servers rotate among those targets
- HTTP redirection (HotMail, now LiveMail):
- Front-end machine accepts initial connections
- Redirects them among an array of back-end machines.
- Load-balancing switch ("Layer 4-7 Switch"):
- All incoming packets pass through one switch, which
dispatches them to one of many servers; once a TCP
connection is established, the load balancer sends all
packets for that connection to the same server.
- In some cases the switches are smart enough to inspect
session cookies, so that the same session always goes
to the same server.
- Stateless servers make load balancing easier (different
requests from the same user can be handled by different
servers).
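The two dispatch policies described above (rotation among targets, and keeping a session on one server) can be sketched in a few lines. This is only an illustration: the server names are hypothetical, and hashing the session cookie is one assumed way to get affinity; a real switch may instead keep a table of active sessions.

```python
import hashlib
from itertools import cycle

SERVERS = ["fe1", "fe2", "fe3"]  # hypothetical front-end server names


class RoundRobinBalancer:
    """Hands out servers in rotation, much as DNS rotates among targets."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class StickyBalancer:
    """Sends every request carrying the same session cookie to the same server."""

    def __init__(self, servers):
        self._servers = list(servers)

    def pick(self, session_id):
        # Hash the cookie value so the choice is stable across requests.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]


if __name__ == "__main__":
    rr = RoundRobinBalancer(SERVERS)
    print([rr.pick() for _ in range(4)])  # ['fe1', 'fe2', 'fe3', 'fe1']

    sticky = StickyBalancer(SERVERS)
    print(sticky.pick("session-abc") == sticky.pick("session-abc"))  # True
```

With stateless servers the round-robin policy is sufficient; the sticky policy only matters when a server holds per-session state.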
- How to handle session data?
- Different requests may go to different servers
- Individual servers may crash
- Need for session data to move from server to server
as necessary.
- Solution #1: keep all session data in shared storage:
- File system
- Database
- May be expensive to retrieve for each request
- Solution #2: keep session data in cookies
- No server storage required
- Cookies limit the amount of data that can be stored
- Potential security issues
- Solution #3: cache session data in last server that
used it
- Store server map in shared storage
- If future request goes to different server, use
map to find server holding session data, retrieve
data from previous server.
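A minimal sketch of Solution #3, under simplifying assumptions: the shared server map is a plain dictionary, and "retrieving data from the previous server" is a direct method call rather than a network request. The class and method names are hypothetical.

```python
class SharedMap:
    """Stand-in for shared storage: maps session id -> name of the server holding it."""

    def __init__(self):
        self.owner = {}


class Server:
    def __init__(self, name, shared_map, cluster):
        self.name = name
        self.shared_map = shared_map
        self.cluster = cluster  # name -> Server; stands in for the network
        self.sessions = {}      # session data cached on this server
        cluster[name] = self

    def get_session(self, session_id):
        # Fast path: this server handled the session last time.
        if session_id in self.sessions:
            return self.sessions[session_id]

        # Otherwise consult the shared map to find the previous holder.
        data = {}
        owner_name = self.shared_map.owner.get(session_id)
        if owner_name is not None and owner_name != self.name:
            previous = self.cluster.get(owner_name)
            if previous is not None:
                # Move the session data from the previous server to this one.
                data = previous.sessions.pop(session_id, {})

        self.sessions[session_id] = data
        self.shared_map.owner[session_id] = self.name
        return data


if __name__ == "__main__":
    shared, cluster = SharedMap(), {}
    a = Server("a", shared, cluster)
    b = Server("b", shared, cluster)
    a.get_session("s1")["user"] = "alice"
    print(b.get_session("s1"))  # {'user': 'alice'}
    print(shared.owner["s1"])   # b
```

Note that in this sketch a session is simply lost if the server holding it crashes; the sketch starts an empty session in that case.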
- Scaling the storage system:
- Almost all Web applications start off using relational
databases.
- A single database instance doesn't scale very far.
- Applications must partition data among multiple
independent databases, which adds complexity.
- Example: Facebook had 4000 MySQL servers by 2009
- Memcache: main-memory caching system
- Key-value store (both keys and values are
arbitrary blobs)
- Used to cache results of recent database queries
- Much faster than databases: 500-microsecond access time, vs.
tens of milliseconds
- Example: Facebook had 2000 memcache servers by 2009
- Problems:
- Writes must still go to the DBMS, so no performance improvement
for them
- Cache misses still hurt performance
- Must manage consistency in software (e.g., flush relevant
memcache data when database gets modified)
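The memcache usage pattern and its problems can be illustrated with a small sketch. Two dictionaries stand in for the DBMS and for memcache (no real memcache client is used), and the class name is hypothetical. Reads check the cache first; writes always go to the database and flush the stale cache entry.

```python
class CachedStore:
    def __init__(self):
        self.db = {}       # stand-in for the DBMS
        self.cache = {}    # stand-in for memcache
        self.db_reads = 0  # counts how often a read had to go to the DBMS

    def read(self, key):
        # Cache hit: no database access needed.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: pay the full database cost, then remember the result.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes still go to the DBMS, so the cache does not speed them up.
        self.db[key] = value
        # Consistency is managed in software: flush the stale cache entry.
        self.cache.pop(key, None)


if __name__ == "__main__":
    store = CachedStore()
    store.write("user:1", "alice")
    store.read("user:1")   # miss, goes to the DBMS
    store.read("user:1")   # hit, served from the cache
    print(store.db_reads)  # 1
    store.write("user:1", "alicia")
    print(store.read("user:1"))  # alicia
```

Forgetting the flush in `write` would leave readers seeing the old value until the entry is evicted, which is the consistency problem noted above.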
- Because of scalability problems, we are seeing many new
approaches to storage:
- RAMCloud: new storage system under development in a
research project here at Stanford:
- Store all data in DRAM permanently (no cache misses)
- Aggregate thousands of servers in a datacenter
- Use disk to back up data for high durability and
availability
- 32-256 GB per server
- 100-500 TB per system
- Very high performance:
- 5-10 microsecond access time
- 1 million operations/second/server
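A quick back-of-the-envelope check of how the RAMCloud figures above fit together. It assumes decimal units (1 TB = 1000 GB) and ignores any space used for replication or metadata, so the results are rough estimates only.

```python
GB_PER_TB = 1000               # decimal units assumed
OPS_PER_SERVER = 1_000_000     # 1 million operations/second/server


def servers_needed(total_tb, gb_per_server):
    """Smallest number of servers whose DRAM holds total_tb of data."""
    total_gb = total_tb * GB_PER_TB
    return (total_gb + gb_per_server - 1) // gb_per_server  # ceiling division


def aggregate_ops(num_servers):
    """Total operations/second if every server runs at its peak rate."""
    return num_servers * OPS_PER_SERVER


if __name__ == "__main__":
    for total_tb, gb_per_server in [(100, 32), (500, 256)]:
        n = servers_needed(total_tb, gb_per_server)
        print(total_tb, "TB at", gb_per_server, "GB/server:",
              n, "servers,", aggregate_ops(n), "ops/sec")
```

For example, 100 TB at 32 GB per server needs 3125 servers, and 500 TB at 256 GB per server needs 1954, which matches the "thousands of servers" scale.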
- Scaling issues make it difficult to create new Web
applications:
- Initially, can't afford expensive systems for managing
large scale.
- But, application can suddenly become very popular
("flash crowd"); can be disastrous if application
can't scale quickly.
- Can take weeks or months to buy and install new
servers.
- Must become expert in datacenter management.
- Each 10x growth in application scale typically requires
new application-specific techniques.
- Cloud computing:
- Also called Infrastructure as a Service (IaaS).
- Separate scalability issues from application development.
- Specialized providers offer scalable infrastructure.
- Just pay for what you need.
- Example #1: Amazon Web Services
- Elastic Compute Cloud (EC2): rent servers in an Amazon
datacenter for < $0.10/hour
- Scale up and down by hundreds of servers almost instantly
- Simple Storage Service (S3): stores blobs of data
inexpensively (about $0.10/GB/month).
- AWS provides low-level facilities; users still have to
worry about various management issues ("how do I know
it's time to allocate more servers?")
- Gradually adding more facilities:
- Databases
- Management tools
- Example #2: Google AppEngine
- Much higher level interface:
- You provide pieces of code (in Python or Java, for example) and the URLs associated
with each piece of code.
- Google does the rest:
- Allocate machines to run your code
- Arrange for name mappings so that HTTP requests
find their way to your code
- Scale machine allocations up and down automatically
as load changes
- AppEngine also includes a scalable storage system
- More constrained environment
- Must use Python, Java, PHP, or Go
- Must use specialized Google storage system
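The question "how do I know it's time to allocate more servers?" can be answered in many ways. One simple rule, sketched below, sizes the fleet so that each server runs at a target fraction of its capacity. This is an assumed policy for illustration only, not how AWS or AppEngine actually decide; the function name, the parameters, and the 70% target are all hypothetical.

```python
def desired_servers(requests_per_sec, capacity_per_server,
                    target_utilization_pct=70, min_servers=1):
    """Number of servers needed to keep each one at or below the target load.

    capacity_per_server is the request rate one server can sustain at 100%.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    usable_per_server = capacity_per_server * target_utilization_pct
    needed = (requests_per_sec * 100 + usable_per_server - 1) // usable_per_server
    return max(min_servers, needed)


if __name__ == "__main__":
    # Load grows during a flash crowd, then falls again.
    for load in [500, 7000, 7001, 90000, 0]:
        print(load, "req/s ->", desired_servers(load, 1000), "servers")
```

With AWS the application owner would have to run a rule like this and request or release EC2 servers accordingly; with AppEngine the equivalent decision is made by Google.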
- In the future we are going to see more systems like AWS and
AppEngine, with more and more convenient high-level interfaces.