Linkers and Dynamic Linking
Lecture Notes for CS 140
Winter 2012
John Ousterhout
- Readings for this topic from Operating System Concepts:
none.
- When a process is running, what does its memory look like?
A collection of regions called sections.
Basic memory layout for Linux and other Unix systems:
- Code (or "text" in Unix terminology): starts at location 0 in this simplified picture (real systems typically load it at a higher address)
- Data: starts immediately above code, grows upward
- Stack: starts at highest address, grows downward
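One way to see these regions is to look at the address of one object from each of them. The sketch below is only illustrative: the names `counter`, `struct layout`, `sample_layout` and `print_layout` are made up for this example, and on modern systems the exact addresses (and even their relative order) can vary from run to run because of address randomization.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int counter = 1;   /* initialized global variable: lives in the data region */

/* One sample address from each region (hypothetical helper type). */
struct layout {
    uintptr_t code;
    uintptr_t data;
    uintptr_t stack;
};

struct layout sample_layout(void) {
    int local = 0;                         /* local variable: on the stack */
    struct layout l;
    l.code  = (uintptr_t)&sample_layout;   /* a function: in the code region */
    l.data  = (uintptr_t)&counter;
    l.stack = (uintptr_t)&local;
    assert(l.code != 0 && l.data != 0 && l.stack != 0);
    return l;
}

/* Print the three sample addresses in hexadecimal. */
void print_layout(void) {
    struct layout l = sample_layout();
    printf("code:  0x%" PRIxPTR "\n", l.code);
    printf("data:  0x%" PRIxPTR "\n", l.data);
    printf("stack: 0x%" PRIxPTR "\n", l.stack);
}
```

Calling `print_layout()` from a program's `main` prints one address per region; on a typical Linux system the stack address is usually much larger than the other two.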
- System components that take part in managing a process's
memory:
- Compiler and assembler:
- Generate one object file for each source code file
containing information for that source file.
- Information is incomplete, since each source file generally
references some things defined in other source files.
- Linker:
- Combines all of the object files for one program into
a single object file.
- Linker output is complete and self-sufficient.
- Operating system:
- Loads object files into memory.
- Allows several different processes to share memory at
once.
- Provides facilities for processes to get more memory after
they've started running.
- Run-time library:
- Works together with OS to provide dynamic allocation routines,
such as malloc and free in C.
- Linkers (or Linkage Editors, ld in Unix,
LINK on Windows): combine
many separate pieces of a program, re-organize storage
allocation. Typically invoked invisibly by compilers.
- Three functions of a linker:
- Collect all the pieces of a program.
- Figure out a new memory organization so that all the
pieces fit together (combine like sections).
- Touch up addresses so that the program can run
under the new memory organization.
- Result: a runnable program stored in a new object file
called an executable.
- Problems linker must solve:
- Assembler doesn't know addresses of external objects when assembling
files separately. E.g. where is printf routine?
- Assembler just puts zero in the object file for each unknown address
- Assembler doesn't know where the things it's assembling will
go in memory
- Assembler assumes that each file's sections start at address zero and lets the linker rearrange them.
- Each object file consists of:
- Sections, each holding a distinct kind of information.
- Typical sections: code ("text") and data.
- For each section, object file contains size and current location
of the section, plus initial contents, if any
- Symbol table: name and current location of each variable or procedure
that can potentially be referenced in other object files.
- Relocation records: information about addresses referenced
in this object file that the linker must adjust once it knows the
final memory allocation.
- Additional information for debugging (e.g. map from line numbers
in the source file to location in the code section).
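The pieces of an object file listed above could be modeled in C roughly as follows. This is a hypothetical layout for illustration only: all type and field names here are assumptions, debugging information is omitted, and real formats such as ELF differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum section_kind { SEC_TEXT, SEC_DATA, NUM_SECTIONS };

struct section {
    uint32_t addr;            /* current location; the assembler assumes 0 */
    uint32_t size;            /* size in bytes */
    const uint8_t *contents;  /* initial contents, or NULL if none */
};

struct symbol {
    const char *name;           /* e.g. "printf" */
    enum section_kind section;  /* section that holds the definition */
    uint32_t offset;            /* current location within that section */
};

struct relocation {
    enum section_kind section;  /* section containing the address to fix */
    uint32_t offset;            /* where in that section the address sits */
    uint32_t size;              /* size of the address, in bytes */
    const char *symbol;         /* symbol that determines the adjustment */
};

struct object_file {
    struct section sections[NUM_SECTIONS];
    const struct symbol *symbols;
    size_t nsymbols;
    const struct relocation *relocs;
    size_t nrelocs;
};

/* Return the entry for `name` in this file's symbol table,
 * or NULL if the file does not define that symbol. */
const struct symbol *find_symbol(const struct object_file *obj,
                                 const char *name) {
    assert(obj != NULL && name != NULL);
    for (size_t i = 0; i < obj->nsymbols; i++) {
        if (strcmp(obj->symbols[i].name, name) == 0)
            return &obj->symbols[i];
    }
    return NULL;
}
```

With this model, the object file for math.c would carry a symbol entry for sin, while the object file for main.c would carry relocation records naming sin, printf and scanf.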
- Example files:
main.c:
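extern double sin(double x);          /* defined in math.c */
extern int printf(char *fmt, ...);    /* defined in stdio.c */
extern int scanf(char *fmt, ...);     /* defined in stdio.c */
int main() {
    double x, result;
    printf("Type number: ");
    scanf("%lf", &x);                 /* %lf, since x is a double */
    result = sin(x);
    printf("Sine is %f\n", result);
    return 0;
}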
stdio.c:
int printf(char *fmt, ...) {
...
}
int scanf(char *fmt, ...) {
...
}
math.c:
double sin(double x) {
static double res, lastx;
if (x != lastx) {
lastx = x;
... compute sin(x) ...
}
return res;
}
- Linker executes in two passes:
- Pass 1: read in section sizes, compute final memory layout.
Also, read in all symbols, create complete symbol table in memory.
- Pass 2: read in section and relocation information, update
addresses, write out new file.
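The two passes can be sketched in miniature. This is a toy model under several assumptions, not how ld is actually implemented: each object file has a single code section, addresses are counted in 32-bit words, sections are simply laid out one after another, and every relocation overwrites a word with the symbol's final address. All names (`struct obj`, `pass1`, `pass2`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SYMS 16

struct sym   { const char *name; uint32_t offset; };  /* offset within file */
struct reloc { uint32_t word; const char *name; };    /* word to patch */

struct obj {
    const uint32_t *code; uint32_t nwords;    /* section contents and size */
    const struct sym *syms; int nsyms;        /* symbols defined here */
    const struct reloc *relocs; int nrelocs;  /* addresses to fix up */
    uint32_t base;                            /* final location, set by pass 1 */
};

struct global_sym { const char *name; uint32_t addr; };
struct symtab { struct global_sym entry[MAX_SYMS]; int n; };

/* Pass 1: assign each file a final base address and build the complete
 * symbol table. Returns the total size of the output in words. */
uint32_t pass1(struct symtab *tab, struct obj *objs, int nobjs) {
    uint32_t next = 0;
    tab->n = 0;
    for (int i = 0; i < nobjs; i++) {
        objs[i].base = next;
        for (int s = 0; s < objs[i].nsyms; s++) {
            assert(tab->n < MAX_SYMS);
            tab->entry[tab->n].name = objs[i].syms[s].name;
            tab->entry[tab->n].addr = next + objs[i].syms[s].offset;
            tab->n++;
        }
        next += objs[i].nwords;
    }
    return next;
}

/* Find a symbol's final address; returns 1 if found, 0 if undefined. */
static int lookup(const struct symtab *tab, const char *name, uint32_t *addr) {
    for (int i = 0; i < tab->n; i++) {
        if (strcmp(tab->entry[i].name, name) == 0) {
            *addr = tab->entry[i].addr;
            return 1;
        }
    }
    return 0;
}

/* Pass 2: copy each section to its final place in `out` and patch the
 * words named by relocation records. Returns -1 on an undefined symbol. */
int pass2(const struct symtab *tab, const struct obj *objs, int nobjs,
          uint32_t *out) {
    for (int i = 0; i < nobjs; i++) {
        memcpy(out + objs[i].base, objs[i].code,
               objs[i].nwords * sizeof(uint32_t));
        for (int r = 0; r < objs[i].nrelocs; r++) {
            uint32_t addr;
            if (!lookup(tab, objs[i].relocs[r].name, &addr))
                return -1;
            out[objs[i].base + objs[i].relocs[r].word] = addr;
        }
    }
    return 0;
}
```

The caller runs `pass1` over all the files first, allocates an output buffer of the returned size, and then runs `pass2`. Two passes are needed because a file early in the list may refer to a symbol whose final address is only known after the later files have been laid out.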
- Relocation records:
- Address and size of the value to be relocated
- Symbol that determines amount of relocation
- How to relocate:
- Overwrite with final address of symbol
- Add final address of symbol to current contents; used for
accessing an element of a record, e.g. y.q, where
y is an external symbol but the offset of field q is
known from a header file, so the assembler stores that offset
as the current contents
- Add difference between final and original addresses of
symbol to current contents
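The three ways of relocating a value reduce to simple arithmetic. The sketch below assumes 32-bit values for simplicity (a real relocation record also carries the size of the value), and the names `reloc_mode` and `relocate` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum reloc_mode {
    RELOC_OVERWRITE,   /* overwrite with final address of symbol */
    RELOC_ADD_FINAL,   /* add final address to current contents */
    RELOC_ADD_DELTA    /* add (final - original) to current contents */
};

/* Compute the relocated value.
 *   current       - value the assembler left in the object file
 *   final_addr    - address of the symbol in the final layout
 *   original_addr - address the assembler assumed for the symbol */
uint32_t relocate(enum reloc_mode mode, uint32_t current,
                  uint32_t final_addr, uint32_t original_addr) {
    switch (mode) {
    case RELOC_OVERWRITE:
        return final_addr;
    case RELOC_ADD_FINAL:
        return current + final_addr;
    case RELOC_ADD_DELTA:
        return current + (final_addr - original_addr);
    }
    assert(0 && "unknown relocation mode");
    return current;
}
```

For example, if field q sits 8 bytes into record y, the assembler stores 8 and `RELOC_ADD_FINAL` turns it into the address of y plus 8. `RELOC_ADD_DELTA` fits references to symbols defined in the same file, where the assembler already stored an address based on its assumed layout.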
Dynamic Linking
- Originally all programs were linked statically, as described
above:
- All external references fully resolved
- Each program complete
- Since the late 1980s, most systems have supported shared libraries
and dynamic linking:
- For common library packages, only keep a single copy in memory,
shared by all processes.
- Don't know where library is loaded until runtime; must resolve
references dynamically, when program runs.
- One way of implementing dynamic linking: jump table.
- Jump table: an array in which each entry is a single machine
instruction containing an unconditional branch (jump).
- For each function in a shared library used by the program, there
is one entry in the jump table that will jump to the beginning of
that function.
- If one of the files being linked is a shared library, the linker
doesn't actually include the shared library code in the final program.
Instead it creates a jump table with slots for all of the functions
that are used from that library.
- For relocation records referring to functions in the shared library,
the linker substitutes the address of the jump table entry: when
the function is invoked, the caller will "call" the jump table entry,
which redirects the call to the real function.
- When the program starts up, the shared library is loaded into memory
and the jump table addresses are adjusted to reflect the load location.
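The jump table idea can be imitated in C with an array of function pointers. This is only an analogy: in the scheme described above each table entry is a jump instruction, whereas here each entry is a pointer that the caller calls through. The two "library" functions, the slot names, and `load_library` are hypothetical stand-ins; a real loader would compute the entries from the address at which the shared library was mapped.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*lib_fn)(double);

/* Stand-ins for functions that live in a shared library. */
static double lib_half(double x)  { return x / 2.0; }
static double lib_twice(double x) { return x * 2.0; }

/* One slot per library function the program uses; the linker would
 * fix these slot numbers when it builds the program. */
enum { SLOT_HALF, SLOT_TWICE, NSLOTS };

/* The jump table: empty until the library has been loaded. */
static lib_fn jump_table[NSLOTS];

/* Done once at program startup: fill in each slot with the function's
 * actual location now that the library is in memory. */
void load_library(void) {
    jump_table[SLOT_HALF]  = lib_half;
    jump_table[SLOT_TWICE] = lib_twice;
}

/* What a call site does: go through the table entry instead of
 * calling the library function directly. */
double call_via_table(int slot, double x) {
    assert(slot >= 0 && slot < NSLOTS && jump_table[slot] != NULL);
    return jump_table[slot](x);
}
```

After `load_library()` has run, `call_via_table(SLOT_TWICE, 3.0)` reaches `lib_twice` through the table. The call sites never change; only the table entries depend on where the library ended up.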