Thrashing and Working Sets

Lecture Notes for CS 140
Winter 2016
John Ousterhout

Readings for this topic from Operating Systems: Principles and Practice: Section 9.7.
Normally, if a thread takes a page fault and must wait for the page to be read from disk, the operating system runs a different thread while the I/O is occurring. Does this mean page faults are "free"?
What happens if memory gets overcommitted?
- Suppose the pages being actively used by the current threads don't all fit in physical memory.
- Each page fault causes one of the active pages to be moved to disk, so another page fault will occur soon.
- The system will spend all its time reading and writing pages, and won't get much work done.
- This situation is called thrashing; it was a serious problem in early demand paging systems.
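
The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the notes do not name a replacement policy, so LRU is assumed here, and the function name `count_faults` and the reference pattern (a loop touching 5 pages in turn) are hypothetical choices made for illustration.

```python
from collections import OrderedDict

def count_faults(references, num_frames):
    """Count page faults for a reference string, assuming LRU replacement."""
    frames = OrderedDict()  # resident pages, ordered least to most recently used
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1                     # miss: page must be read from disk
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = None
    return faults

# Hypothetical workload: a loop that touches 5 pages in turn, 100 times over.
references = [p for _ in range(100) for p in range(5)]

fits = count_faults(references, num_frames=5)           # active pages fit in memory
overcommitted = count_faults(references, num_frames=4)  # one frame too few
print(fits, overcommitted)
```

With 5 frames the only faults are the 5 initial loads. With 4 frames every one of the 500 references faults, because each fault evicts exactly the page that will be needed next.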

How to deal with thrashing?
- If a single process is too large for memory, there is nothing the OS can do. That process will simply thrash.
- If the problem arises because of the sum of several processes:
  - Figure out how much memory each process needs to prevent thrashing. This is called its working set.
  - Only allow a few processes to execute at a time, such that their working sets fit in memory.
Page fault frequency: one technique for computing working sets
- At any given time, each process is allocated a fixed number of physical page frames (assumes per-process replacement).
- Monitor the rate at which page faults are occurring for each process.
- If the rate gets too high for a process, assume that its memory is overcommitted; increase the size of its memory pool.
- If the rate gets too low for a process, assume that its memory pool can be reduced in size.
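
The adjustment rule above might be sketched as follows. The threshold values, the one-frame step size, and the name `adjust_allocation` are assumptions made for illustration; the notes do not specify them, and a real system would tune them.

```python
def adjust_allocation(frames, fault_rate, low=2.0, high=10.0, min_frames=1):
    """Return a new frame allocation for one process.

    frames:     frames currently allocated to the process
    fault_rate: recently observed page faults per second for that process
    low, high:  hypothetical thresholds, not taken from the notes
    """
    if fault_rate > high:
        return frames + 1   # faulting too often: memory pool is too small
    if fault_rate < low and frames > min_frames:
        return frames - 1   # rarely faulting: pool can probably shrink
    return frames           # rate is within the acceptable band

# Example: a process holding 10 frames under three different fault rates.
print(adjust_allocation(10, 20.0))  # grows the pool
print(adjust_allocation(10, 1.0))   # shrinks the pool
print(adjust_allocation(10, 5.0))   # leaves the pool alone
```

Leaving a gap between the two thresholds keeps the allocation from oscillating when the fault rate hovers near a single cutoff.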
Scheduling with working sets
- If the sum of all working sets of all processes exceeds the size of memory, then stop running some of the processes for a while.
- Divide processes into two groups: active and inactive:
  - When a process is active its entire working set must always be in memory: never execute a thread whose working set is not resident.
  - When a process becomes inactive, its working set can migrate to disk.
  - Threads from inactive processes are never scheduled for execution.
  - The collection of active processes is called the balance set.
  - The system must have a mechanism for gradually moving processes into and out of the balance set.
  - As working sets change, the balance set must be adjusted.
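
One simple way to pick a balance set is sketched below. The notes do not prescribe a selection policy, so this assumes a greedy pass over processes in priority order; the function name `choose_balance_set` and the process names and sizes are hypothetical.

```python
def choose_balance_set(working_sets, memory_frames):
    """Split processes into active (the balance set) and inactive groups.

    working_sets:  dict mapping process name -> working set size in frames,
                   listed in priority order (an assumption of this sketch)
    memory_frames: number of physical page frames available
    """
    active, inactive = [], []
    free = memory_frames
    for name, size in working_sets.items():
        if size <= free:
            active.append(name)    # entire working set fits: may be scheduled
            free -= size
        else:
            inactive.append(name)  # does not fit: working set may go to disk
    return active, inactive

# Example: 100 frames of memory, working sets that sum to 130 frames.
active, inactive = choose_balance_set({"A": 40, "B": 50, "C": 30, "D": 10}, 100)
print(active, inactive)
```

Rerunning the selection as working set estimates change is one way to adjust the balance set over time, though it does nothing to address the long inactive periods noted in the next section.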
None of these solutions is very good:
- Once a process becomes inactive, it has to stay inactive for a long time (many seconds), which results in poor response for the user.
- Scheduling the balance set is tricky.
In practice, today's operating systems don't worry much about thrashing:
- With personal computers, users can notice thrashing and handle it themselves:
  - Typically, just buy more memory.
  - Or, manage the balance set by hand, for example by closing some applications.
- Memory is cheap enough that there's no point in operating a machine in a range where memory is even slightly overcommitted; better to just buy more memory.
- Thrashing was a bigger issue for timesharing machines with dozens or hundreds of users:
  - Why should I stop my processes just so you can make progress?
  - The system had to handle thrashing automatically.

