Managing Flash Memory
Lecture Notes for CS 140
Winter 2013
John Ousterhout
- Readings for this topic from Operating Systems: Principles and Practice:
Section 12.2.
- Solid state (semiconductor) storage, replacing disks in many
applications (e.g. phones and other devices). Primary advantages:
- Nonvolatile (unlike DRAM): values persist even if
device is powered off
- Better than disk:
- No moving parts, so more reliable
- Faster access
- More shock-resistant
- But, 5-10x more expensive than disk
- Two styles, NAND and NOR; NAND is most popular today:
- Total chip capacity up to 8 Gbytes today
- Storage divided into erase units (typically 256 Kbytes),
which are subdivided into pages (typically 512 bytes or 4 Kbytes)
- Storage is read in units of pages
- Two kinds of writes:
- Erase: sets all of the bits in an erase unit to 1's.
- Write: modifies an individual page, can only clear bits
to 0 (writing 1's has no effect).
- Can write repeatedly to clear more bits.
- Wear-out: once a page has been erased many times (typically
around 100,000, as low as 10,000 in some new devices) it no longer
stores information reliably.
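The erase and write rules above can be modeled in a few lines. This is a toy sketch, not a real device interface: the class name EraseUnit, the tiny page size, and the erase_count field are all assumptions made for illustration.

```python
PAGE_SIZE = 4        # bytes per page (tiny, for illustration only)
PAGES_PER_UNIT = 4   # pages per erase unit (illustrative)


class EraseUnit:
    """Toy model of one NAND erase unit (hypothetical, not a real API)."""

    def __init__(self):
        # A freshly erased unit has every bit set to 1.
        self.pages = [bytearray([0xFF] * PAGE_SIZE) for _ in range(PAGES_PER_UNIT)]
        self.erase_count = 0   # wear: number of times this unit has been erased

    def erase(self):
        """Set every bit in the whole unit back to 1."""
        for page in self.pages:
            page[:] = bytes([0xFF] * PAGE_SIZE)
        self.erase_count += 1

    def write(self, page_no, data):
        """Program one page: bits can only go from 1 to 0 (old AND new)."""
        page = self.pages[page_no]
        for i, byte in enumerate(data):
            page[i] &= byte

    def read(self, page_no):
        return bytes(self.pages[page_no])


unit = EraseUnit()
unit.write(0, bytes([0xF0, 0xFF, 0xFF, 0xFF]))   # clears the low 4 bits of byte 0
unit.write(0, bytes([0x3F, 0xFF, 0xFF, 0xFF]))   # a second write clears 2 more bits
first = unit.read(0)[0]                           # 0xF0 & 0x3F
unit.write(0, bytes([0xFF, 0xFF, 0xFF, 0xFF]))   # writing 1's has no effect
unchanged = unit.read(0)[0]
unit.erase()                                      # only an erase restores the 1's
```

A real wear-out check would compare erase_count against the device's rated limit; that policy is left out here.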
- Typical flash memory performance:
- Read performance: 20-100 microseconds latency,
100-500 MBytes/sec.
- Erasure time: 2 ms
- Write performance: 200 microseconds latency,
100-200 MBytes/sec.
- In practice, most flash memory devices are packaged with
a flash translation layer (FTL):
- Software that manages the flash device
- Typically provides an interface like that for a disk
(read and write blocks)
- Can be used with existing file system software
- FTLs are interesting pieces of software, but most FTLs today aren't
very good:
- Sacrifice performance
- Waste capacity
- One possible approach for FTLs: direct mapped (e.g., some cheap
flash sticks)
- virtual block i is stored on page i of the flash device
- Reads are simple
- To write virtual block i:
- Read erase unit containing page i
- Erase the entire unit
- Rewrite erase unit with modified page
- What's wrong with this approach?
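The direct-mapped write path can be sketched as follows. The class DirectMappedFTL, its counters, and the 64-page erase unit are assumptions for illustration (256 Kbytes / 4 Kbytes), and counting every page in the unit as rewritten is a simplification.

```python
PAGES_PER_UNIT = 64   # assumed: 256 KB erase unit / 4 KB pages


class DirectMappedFTL:
    """Hypothetical direct-mapped FTL: virtual block i lives on page i."""

    def __init__(self, num_units):
        # flash[u][p] holds the contents of page p in erase unit u (None = erased)
        self.flash = [[None] * PAGES_PER_UNIT for _ in range(num_units)]
        self.erase_counts = [0] * num_units
        self.page_writes = 0

    def read(self, block):
        unit, page = divmod(block, PAGES_PER_UNIT)
        return self.flash[unit][page]

    def write(self, block, data):
        unit, page = divmod(block, PAGES_PER_UNIT)
        saved = list(self.flash[unit])                # 1. read the erase unit
        self.flash[unit] = [None] * PAGES_PER_UNIT    # 2. erase the entire unit
        self.erase_counts[unit] += 1
        saved[page] = data                            # 3. rewrite with modified page
        self.flash[unit] = saved
        self.page_writes += PAGES_PER_UNIT            # simplification: all pages rewritten


ftl = DirectMappedFTL(num_units=4)
for i in range(100):
    ftl.write(5, i)      # overwrite the same virtual block 100 times
```

After this loop the erase_counts and page_writes counters make the cost of each write visible.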
- To avoid these problems, must separate virtual block number from physical
location in flash memory, so a given virtual block can occupy different
pages in flash memory over time.
- Keep a block map that maps from virtual blocks to physical pages
- Reads must first look up the physical location in the block map
- For writes:
- Find a free and erased page
- Write virtual block to that page
- Update block map with new location
- Mark previous page for virtual block as obsolete (it can be reused
once its erase unit has been erased)
- This introduces additional issues
- How to manage map (is it stored on the flash device?)
- How to manage free space (e.g. wear leveling)
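A minimal sketch of the block-map read and write paths, under stated assumptions: the class MappedFTL and its fields are hypothetical, the map lives in an ordinary dictionary, and "pick the lowest-numbered erased page" stands in for a real allocation policy. Garbage collection is not included.

```python
class MappedFTL:
    """Hypothetical FTL with a virtual-block -> physical-page map."""

    def __init__(self, num_pages):
        self.flash = [None] * num_pages        # page contents
        self.erased = set(range(num_pages))    # pages that are free and erased
        self.obsolete = set()                  # pages holding dead data, awaiting erase
        self.block_map = {}                    # virtual block -> physical page

    def read(self, block):
        # Reads go through the block map first.
        return self.flash[self.block_map[block]]

    def write(self, block, data):
        if not self.erased:
            raise RuntimeError("no erased pages left; garbage collection needed")
        page = min(self.erased)                # placeholder allocation policy
        self.erased.remove(page)
        self.flash[page] = data                # write virtual block to that page
        old = self.block_map.get(block)
        self.block_map[block] = page           # update block map with new location
        if old is not None:
            self.obsolete.add(old)             # previous copy is now dead


ftl = MappedFTL(num_pages=8)
ftl.write(3, "a")
ftl.write(3, "b")     # same virtual block, different physical page
```

Note that overwriting a block never erases anything; the old page simply becomes reclaimable later.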
- One approach: keep block map in memory, rebuild on startup:
- Don't store block map on flash device
- Each page on flash contains an additional header:
- Virtual block number
- Free/used bit (1 => free)
- Prevalid/valid bit (1 => prevalid)
- Valid/obsolete bit (1 => valid)
- F-P-O bits track lifecycle of page:
- Just erased: 1-1-1
- About to write data: 0-1-1
- Block successfully written: 0-0-1
- Block deleted (new copy written elsewhere): 0-0-0
- Why is 0-1-1 state needed?
- On startup, read entire contents of flash memory to rebuild
block map (32 seconds for 8GB, 512 seconds for 128GB).
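The startup scan can be sketched as below. Every lifecycle transition only clears bits (1 to 0), so each one is an ordinary page write with no erase. The names PageHeader, rebuild, and scan_seconds are hypothetical. Two things are assumptions of this sketch rather than statements from the notes: pages found in any state other than 1-1-1 or 0-0-1 are treated as needing an erase, and the 250 MBytes/sec scan rate is inferred from the 32-second figure for 8GB (taking 1 GB = 1000 MB). The sketch does not handle a crash that leaves two valid copies of the same virtual block.

```python
from dataclasses import dataclass

# F-P-O header states, in lifecycle order
ERASED = (1, 1, 1)
WRITING = (0, 1, 1)
VALID = (0, 0, 1)
OBSOLETE = (0, 0, 0)


@dataclass
class PageHeader:
    vblock: int     # virtual block number stored in the page header
    fpo: tuple      # (free, prevalid, valid) bits


def rebuild(headers):
    """Scan every page header and reconstruct the in-memory state."""
    block_map = {}      # virtual block -> physical page
    erased = []         # pages ready to be written
    needs_erase = []    # obsolete or half-written pages (assumed unusable until erased)
    for page_no, hdr in enumerate(headers):
        if hdr.fpo == VALID:
            block_map[hdr.vblock] = page_no
        elif hdr.fpo == ERASED:
            erased.append(page_no)
        else:
            needs_erase.append(page_no)
    return block_map, erased, needs_erase


def scan_seconds(capacity_gb, mb_per_sec=250):
    """Time to read the whole device at an assumed sequential rate."""
    return capacity_gb * 1000 / mb_per_sec


headers = [
    PageHeader(7, VALID),
    PageHeader(7, OBSOLETE),    # older copy of block 7
    PageHeader(2, WRITING),     # write was interrupted before completion
    PageHeader(-1, ERASED),     # -1 stands in for an unwritten block number
]
block_map, erased, needs_erase = rebuild(headers)
```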
- To reduce memory utilization for block map, store block map in
flash, cache parts of it in memory
- Header for each flash page indicates whether that page is a
data page or a map page
- Keep locations of map pages in memory (map-map)
- Scan flash on startup to re-create map-map
- During writes, must write new map page plus new data page
- Some reads may require 2 flash operations
- Obsolete blocks accumulate in erase units, which reduces
effective capacity.
- Solution: garbage collection
- Find erase units with many obsolete pages
- Copy live pages to a clean erase unit (update block map)
- Erase and reuse old erase unit
- Note: must always keep at least one clean erase unit to use for
garbage collection!
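One garbage collection pass might look like the sketch below. The data layout (lists of dictionaries), the function names, and the "most obsolete pages first" victim choice are assumptions for illustration; real FTLs also weigh wear and data age when choosing a victim.

```python
PAGES_PER_UNIT = 4   # tiny, for illustration only


def obsolete_count(unit):
    """Number of pages in the unit that hold dead data."""
    return sum(1 for p in unit if p is not None and not p["live"])


def garbage_collect(units, clean, block_map):
    """Reclaim the unit with the most obsolete pages.

    units:     list of erase units; each entry is None (erased) or a dict
               {"vblock": int, "data": ..., "live": bool}
    clean:     index of the fully erased unit reserved for collection
    block_map: virtual block -> (unit index, page index)
    Returns the index of the newly erased unit (the next reserve).
    """
    candidates = [i for i in range(len(units)) if i != clean]
    victim = max(candidates, key=lambda i: obsolete_count(units[i]))
    dest = 0
    for page in units[victim]:
        if page is not None and page["live"]:
            units[clean][dest] = page                      # copy live page
            block_map[page["vblock"]] = (clean, dest)      # update block map
            dest += 1
    units[victim] = [None] * PAGES_PER_UNIT                # erase old unit
    return victim


def pg(vblock, live=True):
    return {"vblock": vblock, "data": f"d{vblock}", "live": live}


units = [
    [pg(0), pg(1, False), pg(2, False), pg(3, False)],   # 1 live, 3 obsolete
    [pg(1), pg(2), pg(3), None],                          # all live
    [None] * PAGES_PER_UNIT,                              # reserved clean unit
]
block_map = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (1, 2)}
new_clean = garbage_collect(units, clean=2, block_map=block_map)
```

The live pages of the victim always fit in the clean unit, which is why one clean unit must be held in reserve.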
- Wear-leveling:
- Want all erase units to be erased at about the same rate
- Use garbage collection to move data between "hot" and "cold"
pages.
- Hard to achieve good performance, good utilization, and longevity
all at the same time:
- If the flash device is 90% utilized, write cost increases by
10x:
- To get space for one new page, must garbage collect 10 old
pages
- 9 will still be valid and must be copied
- 1 new page gets written
- Total: 9 reads, 10 writes to write 1 new page!
- This is called write amplification
- Lower utilization makes writes cheaper, but wastes space.
- Frequent garbage collection (e.g. because of high utilization)
also wears out the device faster
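The write-cost arithmetic generalizes to other utilizations. The helper gc_cost below is a hypothetical name, and it assumes the erase units being collected are as full as the device overall.

```python
def gc_cost(pages_collected, live_pages):
    """Flash reads and writes needed per new page stored.

    pages_collected: pages in the erase units being garbage collected
    live_pages:      how many of those pages are still valid
    """
    freed = pages_collected - live_pages
    if freed <= 0:
        raise ValueError("garbage collection frees no space")
    reads = live_pages               # each live page must be read...
    writes = live_pages + freed      # ...and copied, plus the new pages
    return reads / freed, writes / freed


cost_90 = gc_cost(10, 9)   # 90% utilized: the example in the notes
cost_50 = gc_cost(10, 5)   # 50% utilized
cost_0 = gc_cost(10, 0)    # every collected page is obsolete
```

In general the write cost is 1/(1 - u) for utilization u, so it grows very quickly as the device approaches full.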
- Ideal situation: hot and cold data
- Some erase units contain only data that is never modified ("cold"),
so they are always full and never need to be garbage collected.
- Other erase units contain data that is quickly
overwritten; we can just wait until all of the pages have been
overwritten, then garbage collect the erase unit for free.
- There are ways to encourage such a bimodal distribution.
- Incorporating flash memory as a disk-like device with an FTL is inefficient:
- Duplication:
- OS already keeps various index structures for files:
- These are equivalent to the block map
- If OS could manage the flash directly, it could combine
the block map with file indexes
- Lack of information:
- FTL doesn't know when OS has freed a block; only finds out when
block is overwritten
- Thus FTL may rewrite dead blocks during garbage collection!
- Newer flash devices offer a trim command that allows the OS to
indicate deletion (but OS file systems must be modified to use it).
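The effect of trim on garbage collection can be illustrated with a small sketch. TrimAwareFTL and its methods are hypothetical names, not the actual device command interface; page allocation is a simple counter and there is no erase logic.

```python
class TrimAwareFTL:
    """Hypothetical FTL that tracks which physical pages hold live data."""

    def __init__(self):
        self.block_map = {}     # virtual block -> physical page
        self.obsolete = set()   # pages whose contents are known to be dead
        self.next_page = 0      # next erased page (no wrap-around or GC here)

    def write(self, block):
        old = self.block_map.get(block)
        if old is not None:
            self.obsolete.add(old)      # overwrite: old copy is dead
        self.block_map[block] = self.next_page
        self.next_page += 1

    def trim(self, block):
        """OS reports that this virtual block no longer holds useful data."""
        page = self.block_map.pop(block, None)
        if page is not None:
            self.obsolete.add(page)

    def pages_gc_must_copy(self):
        """Pages the FTL believes are live and would copy during GC."""
        return sorted(self.block_map.values())


ftl = TrimAwareFTL()
for block in range(4):
    ftl.write(block)
before = ftl.pages_gc_must_copy()   # OS deleted a file, but FTL has not been told
ftl.trim(1)
ftl.trim(2)
after = ftl.pages_gc_must_copy()    # dead blocks are no longer copied
```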
- Better long-term solution: new file systems designed just for flash memory
- Lots of interesting issues and design alternatives
- Has been explored by research teams, but no widely-used
implementations
- Need ability to bypass the FTL
- Interesting opportunity