Directories and Links
Lecture Notes for CS 140
Winter 2012
John Ousterhout
- Readings for this topic from Operating System Concepts:
Section 11.3.
- Naming: how do users refer to their files? How does the OS
find a file, given its name?
- First step: file descriptor has to be stored on
disk, so it will persist across system reboots.
- Early UNIX versions: all descriptors stored in a
fixed-size array on disk.
- Originally entire descriptor array was at the outer
edge of the disk. Result: long seeks between descriptors
and file data.
- Later improvements:
- Place descriptor array mid-way across disk.
- Many small descriptor arrays spread across disk, so
descriptors can be near to file data.
- Space for descriptors is fixed when the disk is
initialized, and can't be changed.
- Unix/Linux terms:
- File descriptor is called an i-node.
- Index of i-node in the descriptor array: i-number.
Internally the OS uses the i-number as an identifier
for the file.
- When a file is open, its descriptor is kept in main
memory. When the file is closed, the descriptor is
stored back to disk.
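As a small illustration of the i-number, the sketch below reads it through Python's os.stat, which reports it as st_ino on Unix-like systems. The temporary file and the helper name i_number are assumptions made for this example only.

```python
import os
import tempfile

def i_number(path):
    """Return the i-number the OS reports for path (the st_ino field)."""
    return os.stat(path).st_ino

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "notes.txt")
    with open(path, "w") as f:
        f.write("hello\n")

    # Looked up by name: the OS maps the name to an i-number.
    ino_by_name = i_number(path)

    # Looked up through an open file: fstat reports the same i-number,
    # because the OS identifies the file by its descriptor, not its name.
    with open(path) as f:
        ino_by_fd = os.fstat(f.fileno()).st_ino

print(ino_by_name == ino_by_fd)
```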
- File naming: users want to use text names to refer to files.
Special disk structures called directories are used
to map names to descriptor indexes.
- Early approaches to directory management:
- A single directory for the entire disk:
- If one user uses a particular name, no one else can.
- Many early personal computers worked this way.
- A single directory for each user (e.g. TOPS-10):
- Avoids problems between users, but still makes it
hard to organize information.
- Modern systems support hierarchical directory structures.
Unix/Linux approach:
- Directories are stored on disk just like regular files (i.e.
file descriptor with 14 pointers, etc.) except that the file descriptor
has a special flag bit set to indicate that it's a directory.
- On some systems user programs can read directories just like
regular files.
- Only the operating system can write directories.
- Each directory contains <name, i-number> pairs in no
particular order.
- The file pointed to by the i-number may be another directory.
The result is a hierarchical tree structure. Names have
slashes separating the levels of the tree.
- There is one special directory, called the root. This
directory has no name; it has i-number 2 (i-numbers 0 and 1
have other special purposes).
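The lookup described above can be sketched with a toy in-memory model. This is a minimal sketch, not how any real kernel stores directories: the inodes table, its contents, and the function names lookup_in_dir and lookup are all hypothetical. Only the root i-number of 2 and the unordered <name, i-number> pairs come from the notes.

```python
ROOT_INUM = 2  # the root directory's i-number, as stated in the notes

# Toy "descriptor array": i-number -> descriptor (hypothetical contents).
# Each descriptor carries a directory flag; a directory holds
# (name, i-number) pairs in no particular order.
inodes = {
    2: {"is_dir": True, "entries": [("usr", 5), ("etc", 7)]},
    5: {"is_dir": True, "entries": [("notes.txt", 9)]},
    7: {"is_dir": True, "entries": []},
    9: {"is_dir": False, "data": b"hello"},
}

def lookup_in_dir(dir_inum, name):
    """Scan one directory linearly, since its entries are unordered."""
    node = inodes[dir_inum]
    if not node["is_dir"]:
        raise NotADirectoryError(name)
    for entry_name, inum in node["entries"]:
        if entry_name == name:
            return inum
    raise FileNotFoundError(name)

def lookup(path):
    """Resolve an absolute path to an i-number, one component at a time."""
    if not path.startswith("/"):
        raise ValueError("this sketch handles absolute paths only")
    inum = ROOT_INUM
    for component in path.split("/"):
        if component:  # skip empty strings from leading or doubled slashes
            inum = lookup_in_dir(inum, component)
    return inum

print(lookup("/usr/notes.txt"))  # 9
print(lookup("/"))               # 2
```

Each slash-separated component costs one directory scan, which is why the directory flag matters: the lookup can only continue through a descriptor that is marked as a directory.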
Working directories
- Cumbersome to have to specify the full path name constantly
for all files.
- Have OS remember one distinguished directory per process,
called the working directory.
- If a file name doesn't start with "/" then it is looked up
starting in the working directory.
- Names starting with "/" are looked up starting in the root
directory.
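The choice of starting directory can be sketched as follows. This is a toy model under stated assumptions: the directories table, the Process class, and the resolve function are hypothetical names for illustration, and error handling for missing names is omitted.

```python
ROOT_INUM = 2  # i-number of the root directory

# Toy directories: i-number -> {name: i-number}. Hypothetical contents
# describing the tree /home/alice/notes.txt.
directories = {
    2: {"home": 3},
    3: {"alice": 4},
    4: {"notes.txt": 9},
}

class Process:
    """Holds the one distinguished directory the OS remembers per process."""
    def __init__(self, cwd_inum=ROOT_INUM):
        self.cwd_inum = cwd_inum

def resolve(process, path):
    """Return the i-number that path names for the given process."""
    # Names starting with "/" begin at the root; all others begin at
    # the process's working directory.
    inum = ROOT_INUM if path.startswith("/") else process.cwd_inum
    for component in path.split("/"):
        if component:
            inum = directories[inum][component]
    return inum

p = Process(cwd_inum=4)  # working directory is /home/alice
print(resolve(p, "notes.txt") == resolve(p, "/home/alice/notes.txt"))
```

Storing the working directory as an i-number rather than as a path string means a relative lookup never has to re-walk the path from the root.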
Links
- Hard links:
- It is possible for more than one directory entry to
refer to a single file.
- UNIX uses reference counts in file descriptors to
keep track of hard links referring to a file.
- A file is deleted when the last directory entry referring
to it goes away (its reference count drops to zero).
- Symbolic links:
- A file whose contents are another file name.
- Stored on disk like regular files, but with special
flag set in descriptor.
- If a symbolic link is encountered during file lookup,
switch to target named in symbolic link, continue
lookup from there.
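Both kinds of link can be observed from user code. The sketch below assumes a Unix-like system whose temporary directory supports hard and symbolic links; the file names are made up for the example. It uses os.link, os.symlink, os.lstat, and os.readlink from the Python standard library.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original.txt")
    hard = os.path.join(d, "hard.txt")
    soft = os.path.join(d, "soft.txt")

    with open(original, "w") as f:
        f.write("data\n")

    os.link(original, hard)     # second directory entry for the same i-node
    os.symlink(original, soft)  # separate file whose contents are a file name

    # Both names map to one i-number, and the descriptor counts two links.
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    links_before = os.stat(original).st_nlink

    # lstat examines the symbolic link itself instead of following it.
    soft_is_symlink = stat.S_ISLNK(os.lstat(soft).st_mode)
    soft_target = os.readlink(soft)

    os.remove(original)         # removes one directory entry only
    links_after = os.stat(hard).st_nlink
    with open(hard) as f:
        survived = f.read()     # data is still reachable via the hard link

    # The symbolic link now names a file that no longer exists, so a
    # lookup that follows it fails.
    dangling = not os.path.exists(soft)

print(same_inode, links_before, links_after, soft_is_symlink, dangling)
```

The contrast is the point of the example: removing the original name leaves the hard link fully usable because the reference count is still nonzero, while the symbolic link dangles because it stores only a name.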