Sessions and Cookies
Lecture Notes for CS349W
Fall Quarter 2008
John Ousterhout
- Web application servers are generally "stateless":
- Once the server finishes processing a request, its memory image
is flushed before starting the next request.
- Only information on disk survives from one request to another.
- A series of requests from the same browser appears to the server
as totally independent requests; it's not obvious that they are
all coming from the same browser or user.
- Statelessness is good in some ways:
- Scalability: in a large application, consecutive requests may
be directed to different servers, so information kept in main
memory won't necessarily be accessible anyway.
- Fault tolerance: if a server crashes between requests, nothing
is lost.
- But statelessness causes problems for application developers, who
need to keep track of what a particular user has been doing across
a series of requests:
- What do they currently have in their shopping cart?
- Is this request coming from a logged-in user?
- Solution: sessions
- Session = a pool of data maintained by the server for each
active browser instance.
- The contents of the session are determined by the application.
- Session information is typically stored in files on disk or in
a database on the server: this maintains the stateless property
for the server.
- Whenever the server starts processing a request, it retrieves
the session information for that request.
- Updates to the session during a request are saved on disk
so that they are available to later requests from the same browser.
- How to know which session to use for an incoming request?
Browser cookies.
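The load/save cycle described above can be sketched as follows. This is only an illustration, assuming one JSON file per session in a directory on the server; the names `SessionStore`, `new_id`, `load`, and `save` are hypothetical and do not come from any particular framework.

```python
import json
import os
import re
import secrets
import tempfile


class SessionStore:
    """Hypothetical file-backed session store: one JSON file per session."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, session_id):
        # The id arrives from the browser, so reject anything that could
        # escape the session directory (for example "../").
        if not re.fullmatch(r"[A-Za-z0-9_-]+", session_id):
            raise ValueError("malformed session id")
        return os.path.join(self.directory, session_id + ".json")

    def new_id(self):
        # An unguessable identifier for a new session.
        return secrets.token_urlsafe(16)

    def load(self, session_id):
        # Called when the server starts processing a request.
        try:
            with open(self._path(session_id)) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}

    def save(self, session_id, data):
        # Called at the end of a request so later requests see the updates.
        with open(self._path(session_id), "w") as f:
            json.dump(data, f)


# Usage: two "requests" that share nothing but the files on disk.
store = SessionStore(tempfile.mkdtemp())
sid = store.new_id()
session = store.load(sid)            # first request: empty session
session["cart"] = ["book-123"]
store.save(sid, session)
later = store.load(sid)              # later request: the cart is back
```

Because nothing is kept in memory between the two calls, any server with access to the same directory (or database) could handle the second request.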
- Cookie: a small amount of data stored by the browser at the request
of a server.
- Whenever the browser sends a request to a server, it also sends
all cookies that are relevant to that server.
- Cookies are transmitted using header fields in the HTTP protocol.
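As an illustration of the header fields involved, the sketch below uses Python's standard `http.cookies` module to build a `Set-Cookie` response header and to parse a `Cookie` request header. The cookie names and values (`session_id`, `abc123`, `theme`) are made up for the example.

```python
from http.cookies import SimpleCookie

# Server -> browser: the response carries a Set-Cookie header.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
set_cookie_line = response_cookie.output()

# Browser -> server: later requests carry a Cookie header with
# every cookie relevant to that server, separated by "; ".
request_header_value = "session_id=abc123; theme=dark"
request_cookie = SimpleCookie()
request_cookie.load(request_header_value)
session_id = request_cookie["session_id"].value
theme = request_cookie["theme"].value
```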
- Cookie lifecycle:
- The first time a browser connects with a particular server, there
are no cookies.
- The server creates a new session, with a unique identifier, and
returns a cookie containing that identifier in its response.
- The next time that the browser connects with the same server, it
returns the cookie, which the server can use to find the
existing session object.
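The lifecycle above can be sketched as a single request handler. This is a simplified illustration, not how any specific server implements it: `handle_request` and `COOKIE_NAME` are hypothetical names, request and response headers are modeled as plain dicts, and an in-memory dict stands in for the on-disk session store.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "session_id"  # hypothetical cookie name


def handle_request(request_headers, sessions):
    """Return (session, response_headers) for one incoming request.

    `sessions` is a dict standing in for the server's session store.
    """
    cookies = SimpleCookie()
    cookies.load(request_headers.get("Cookie", ""))
    response_headers = {}

    morsel = cookies.get(COOKIE_NAME)
    if morsel is not None and morsel.value in sessions:
        # Returning browser: the cookie identifies an existing session.
        session = sessions[morsel.value]
    else:
        # First visit (or unknown id): create a session and return its id.
        session_id = secrets.token_urlsafe(16)
        session = sessions[session_id] = {}
        out = SimpleCookie()
        out[COOKIE_NAME] = session_id
        response_headers["Set-Cookie"] = out[COOKIE_NAME].OutputString()
    return session, response_headers


# Usage: the first request has no cookie; the second returns it.
sessions = {}
first_session, first_headers = handle_request({}, sessions)
first_session["user"] = "alice"
second_session, second_headers = handle_request(
    {"Cookie": first_headers["Set-Cookie"]}, sessions
)
```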
- Cookie specifications:
- A cookie can contain arbitrary content, but the size is limited
by browsers (typically < 4 KB).
- A server can define multiple cookies with different names, but
browsers limit the number of cookies per server (around 50).
- Each cookie is associated with a particular host name or domain for
the server and, optionally, a particular URL path prefix: the cookie
is only included in requests that match these values. (Browsers do
not keep cookies separate by port.)
- A cookie can have an expiration date, after which the browser
deletes it; a cookie without one is discarded when the browser exits.
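The attributes listed above map onto fields of the `Set-Cookie` header. A small sketch using Python's standard `http.cookies` module follows; the host `shop.example.com`, the path `/store`, the one-hour lifetime, and the helper `fits_size_limit` are all assumptions chosen for illustration, and the 4096-byte figure is just the typical limit mentioned above.

```python
from http.cookies import SimpleCookie

MAX_COOKIE_BYTES = 4096  # typical browser limit; exact values vary


def fits_size_limit(header_value):
    # Rough check that a cookie stays under the typical size limit.
    return len(header_value.encode("utf-8")) < MAX_COOKIE_BYTES


cookie = SimpleCookie()
cookie["session_id"] = "abc123"
cookie["session_id"]["domain"] = "shop.example.com"  # which host gets it
cookie["session_id"]["path"] = "/store"              # URL prefix to match
cookie["session_id"]["max-age"] = 3600               # expire after one hour

# The text that would follow "Set-Cookie: " in the response.
header_value = cookie["session_id"].OutputString()
```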
- What information is kept in a session?
- Authentication information (which user is logged in for this session).
- State about what the user is doing
- Shopping cart? May be kept in the session, or in the main
database. If kept in the database, the session needs to include
enough information to find the full cart in the database.
- Long-term state (catalog entries, orders submitted, etc.)? No:
this goes in the main database.
- Typically the amount of information kept in the session is relatively
small.
- One possible approach: keep all of the session state in the browser
as a cookie, pass it back to the server on each request:
- Only works if the session state is small.
- Can't trust the browser for sensitive information (such as the
logged-in user): a user could edit the cookie to impersonate
someone else.
- Solution: sign the session state cryptographically so it can't
be altered in the browser.
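One common way to sign session state is an HMAC computed with a key that only the server knows. The sketch below is a minimal illustration under that assumption; `SECRET_KEY`, `sign_session`, and `verify_session` are hypothetical names, and real frameworks differ in encoding and key handling.

```python
import base64
import hashlib
import hmac
import json

# Assumption: a random secret that never leaves the server.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def sign_session(data):
    """Encode session data and append an HMAC-SHA256 tag."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode())
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + tag


def verify_session(cookie_value):
    """Return the session data, or None if the cookie was altered."""
    payload, sep, tag = cookie_value.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison, on bytes so odd input cannot raise.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


# Usage: a genuine cookie verifies; a forged payload does not.
value = sign_session({"user": "alice"})
genuine = verify_session(value)
forged_payload = base64.urlsafe_b64encode(
    json.dumps({"user": "admin"}).encode()
).decode()
forged = verify_session(forged_payload + "." + value.rpartition(".")[2])
```

Signing only detects tampering: the browser's user can still read the contents, so secrets should not be placed in such a cookie.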
- Most frameworks handle most of the details of session management
automatically, with one caveat:
- Must eventually clean up old session information from disk/database.
This may or may not be handled automatically by the framework.
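Where the framework does not do the cleanup, a periodic sweep is one option. The sketch below assumes sessions are stored as individual files in one directory and treats a file's modification time as the time of last use; `remove_stale_sessions` is a hypothetical helper, and the right maximum age depends on the application.

```python
import os
import time


def remove_stale_sessions(directory, max_age_seconds, now=None):
    """Delete session files unused for max_age_seconds; return the count."""
    now = time.time() if now is None else now
    # Collect entries first so files are not deleted mid-iteration.
    with os.scandir(directory) as it:
        entries = list(it)
    removed = 0
    for entry in entries:
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed += 1
    return removed
```

A function like this could be run from a scheduled job, for example once an hour with a maximum age of one day.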