Virtual Machine Monitors

Lecture Notes for CS 140
Spring 2018
John Ousterhout

  • Readings for this topic from Operating Systems: Principles and Practice: Section 10.2.
  • What is the abstraction provided by an OS to a process?
    • (Virtual) memory
    • A subset of the instruction set of the underlying machine
    • Most (but not all) of the hardware registers
    • A set of kernel calls with particular arguments for file I/O, etc.
    • Overall: a subset of the facilities of the underlying machine, augmented with extra mechanisms implemented by the operating system.
  • What if we implemented a different abstraction for a process, which looks exactly like the underlying hardware:
    • The complete instruction set of the underlying machine
    • Physical memory
    • Memory management unit (page maps, etc.)
    • I/O devices
    • Traps and interrupts
    • No predefined system calls
  • This abstraction is called a virtual machine:
    • To a "process", it appears that it has its own private machine.
    • Multiple "processes" can share a single machine, each thinking it's running on its own private machine.
    • The operating system for this is called a virtual machine monitor.
    • Can run a complete operating system inside a virtual machine: called a guest operating system.
    • Each virtual machine can run a different guest operating system.

Implementing virtual machine monitors

  • One approach: simulation
    • Write program that simulates instruction execution, like Bochs.
    • Simulate memory, I/O devices also.
    • Examples:
      • Use one large file to hold contents of a "disk"
      • Simulate kernel/user bit, interrupt vectors, etc.
    • Problem: too slow
      • 100x slowdown for CPU/memory
      • 2x slowdown for I/O
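
A minimal sketch of the simulation approach, assuming a made-up machine with three instructions. The opcodes (LOAD, ADD, HALT) and the class name are hypothetical and are not taken from Bochs. Each simulated instruction costs many host instructions for fetch, decode, and dispatch, which is where the large CPU slowdown comes from.

```python
class SimulatedCPU:
    """Interprets a toy instruction set one instruction at a time."""

    def __init__(self, program):
        self.program = program      # list of (opcode, operand) tuples
        self.pc = 0                 # simulated program counter
        self.acc = 0                # simulated accumulator register
        self.kernel_mode = True     # simulated kernel/user bit
        self.halted = False
        self.steps = 0              # number of instructions simulated so far

    def step(self):
        opcode, operand = self.program[self.pc]   # fetch
        self.pc += 1
        self.steps += 1
        if opcode == "LOAD":                      # decode and execute
            self.acc = operand
        elif opcode == "ADD":
            self.acc += operand
        elif opcode == "HALT":
            self.halted = True
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

    def run(self):
        while not self.halted:
            self.step()
        return self.acc


cpu = SimulatedCPU([("LOAD", 2), ("ADD", 40), ("HALT", None)])
result = cpu.run()
```
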
  • Better approach: use CPU to simulate itself.
    • Most instructions executed at the full speed of the CPU.
    • Anything "unusual" causes a trap into the virtual machine monitor, which simulates the appropriate behavior.
    • Run virtual machine guest OS like a user process (in unprivileged mode).
  • Special cases:
    • Privileged instructions (e.g. HALT):
      • Since virtual machine runs in user mode, these cause "illegal instruction" traps into VMM.
      • VMM catches these traps, simulates appropriate behavior.
    • Kernel calls in guest OS:
      • User program running under guest OS issues kernel call instruction.
      • Traps always go to VMM (not guest OS).
      • VMM analyzes trapping instruction, simulates system call to guest OS:
        • Move trap info from VMM stack to stack of guest OS
        • Find interrupt vector in memory of guest OS
        • Switch simulated mode to "kernel"
        • Return out of VMM to interrupt handler in guest OS.
      • When guest OS returns from system call, this traps to VMM also (illegal instruction in user mode); VMM simulates return to guest user level.
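
The kernel-call steps above can be sketched as follows. This is an illustration under simplifying assumptions, not a real VMM: the names GuestState and handle_trap, the trap names, and the one-unit instruction length are all hypothetical.

```python
class GuestState:
    """Per-VM state kept by the VMM (all field names are hypothetical)."""

    def __init__(self, trap_vector, pc):
        self.virtual_mode = "user"      # simulated kernel/user bit
        self.trap_vector = trap_vector  # guest OS table: trap name -> handler address
        self.kernel_stack = []          # stands in for the guest OS stack
        self.pc = pc                    # guest program counter


def handle_trap(guest, trap):
    """Entered in the VMM on every hardware trap from the virtual machine."""
    if trap == "syscall" and guest.virtual_mode == "user":
        # Move trap info onto the guest OS stack (assumes 1-unit instructions).
        guest.kernel_stack.append({"trap": trap, "return_pc": guest.pc + 1})
        # Switch simulated mode to "kernel"; the real CPU stays unprivileged.
        guest.virtual_mode = "kernel"
        # Resume the guest at the handler found in its own interrupt vector.
        guest.pc = guest.trap_vector[trap]
    elif trap == "return_from_trap" and guest.virtual_mode == "kernel":
        # The guest's privileged return instruction trapped; simulate it.
        frame = guest.kernel_stack.pop()
        guest.virtual_mode = "user"
        guest.pc = frame["return_pc"]
    else:
        raise NotImplementedError(f"trap {trap!r} is not handled in this sketch")


guest = GuestState(trap_vector={"syscall": 0x8000}, pc=0x1000)
handle_trap(guest, "syscall")
state_in_handler = (guest.virtual_mode, guest.pc)

guest.pc = 0x8010                     # guest handler ran, then tries to return
handle_trap(guest, "return_from_trap")
state_after_return = (guest.virtual_mode, guest.pc)
```

Note that the guest OS never runs with real privileges: "kernel" here is only a field in the VMM's bookkeeping.
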
    • I/O devices:
      • Guest OS writes to I/O device register
      • VMM has arranged for the containing page to fault
      • VMM takes page fault, recognizes address as I/O device register
      • VMM simulates instruction and its impact on the simulated I/O device
      • When actual I/O operation completes, VMM simulates interrupt into the guest OS
      • For better performance, write new device drivers that call directly into the VMM (using hypercalls, the VMM's analogue of system calls): paravirtualization.
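
A rough sketch of the page-fault path for device registers. The device address, register layout, and class names are invented for illustration. The simulated device also "completes" its operation immediately, whereas a real VMM would start the actual I/O and deliver the interrupt later.

```python
PAGE_SIZE = 4096
DEVICE_BASE = 0xF0000000    # hypothetical guest address of the device registers
COMMAND_REGISTER = 0x0      # hypothetical offset of a "start operation" register


class SimulatedDevice:
    def __init__(self):
        self.registers = {}
        self.interrupt_pending = False

    def write_register(self, offset, value):
        self.registers[offset] = value
        if offset == COMMAND_REGISTER:
            # Simplification: pretend the real I/O finished at once, so the
            # VMM should now simulate an interrupt into the guest OS.
            self.interrupt_pending = True


class VMM:
    def __init__(self):
        self.device = SimulatedDevice()
        # The VMM leaves this page unmapped so every access to it faults.
        self.device_page = DEVICE_BASE // PAGE_SIZE

    def handle_write_fault(self, address, value):
        if address // PAGE_SIZE == self.device_page:
            # Simulate the faulting store against the simulated device.
            self.device.write_register(address - DEVICE_BASE, value)
            return "emulated device access"
        return "ordinary page fault"


vmm = VMM()
outcome_device = vmm.handle_write_fault(DEVICE_BASE + COMMAND_REGISTER, 1)
outcome_other = vmm.handle_write_fault(0x1000, 7)
```
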
    • Virtual memory: VMM uses page maps to simulate virtual memory mapping in guest OS.
      • Three levels of memory:
        • Guest virtual address space
        • Guest physical address space
        • VMM physical memory: VMM must have total control over this
      • Original solution: shadow page maps
        • Guest OS creates page maps, but these aren't used by actual hardware.
        • VMM manages the real page maps; these are called shadow page maps.
        • VMM traps instruction to set the page map base, records info about the guest OS page maps.
        • On page faults, VMM updates shadow page maps using info from guest OS page maps and its knowledge of physical memory.
        • When guest OS modifies its page maps, VMM must trap the updates and reflect the changes in the shadow page maps.
        • Two kinds of page faults:
          • Page not in memory: VMM must pass through to guest OS
          • Page in memory, but shadow page map out of date: VMM just updates shadow page map (fault invisible to guest OS)
        • Quite tricky, and potentially slow.
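
A simplified sketch of shadow page map maintenance, using dictionaries in place of real page tables. The class and method names are hypothetical. It assumes the VMM keeps every guest physical page resident, and it refills shadow entries lazily on faults rather than eagerly on each guest update.

```python
class ShadowPager:
    def __init__(self, guest_to_machine):
        # Guest physical page -> machine page; owned by the VMM.
        self.guest_to_machine = guest_to_machine
        # Guest virtual page -> guest physical page; owned by the guest OS.
        self.guest_page_map = {}
        # Guest virtual page -> machine page; the map the hardware really uses.
        self.shadow = {}

    def guest_updates_page_map(self, vpage, gpage):
        """Called when the VMM traps a guest write to its own page map."""
        if gpage is None:
            self.guest_page_map.pop(vpage, None)
        else:
            self.guest_page_map[vpage] = gpage
        # Drop the stale shadow entry; the next fault will rebuild it.
        self.shadow.pop(vpage, None)

    def handle_page_fault(self, vpage):
        if vpage not in self.guest_page_map:
            # The guest has no mapping either: a genuine guest page fault.
            return "pass through to guest OS"
        # Guest mapping exists, so only the shadow map was out of date.
        gpage = self.guest_page_map[vpage]
        self.shadow[vpage] = self.guest_to_machine[gpage]
        return "shadow updated"


pager = ShadowPager(guest_to_machine={0: 17, 1: 42})
first = pager.handle_page_fault(5)        # guest has not mapped page 5 yet
pager.guest_updates_page_map(5, 1)        # guest maps virtual 5 -> physical 1
second = pager.handle_page_fault(5)       # invisible to the guest
```
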
      • More recent solution: extended page maps:
        • Another layer of address translation.
        • Translates from guest physical addresses (guest-specific) to machine addresses (real memory)
        • VMM controls all of the extended page maps, while guest OS controls normal page maps.
        • Much simpler and more efficient than shadow page maps.
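
The extra layer of translation can be illustrated with two lookups. This is a sketch only: real hardware walks multi-level tables, and the dictionaries, function name, and page size here are assumptions.

```python
PAGE_SIZE = 4096


def translate(vaddr, guest_page_map, extended_page_map):
    """Translate a guest virtual address to a machine address."""
    vpage, offset = divmod(vaddr, PAGE_SIZE)
    gpage = guest_page_map[vpage]        # stage 1: controlled by the guest OS
    mpage = extended_page_map[gpage]     # stage 2: controlled by the VMM
    return mpage * PAGE_SIZE + offset


guest_page_map = {5: 2}       # guest virtual page 5 -> guest physical page 2
extended_page_map = {2: 9}    # guest physical page 2 -> machine page 9
machine_addr = translate(5 * PAGE_SIZE + 0x10, guest_page_map, extended_page_map)
```

Because the guest OS can change its own map freely without affecting the second stage, the VMM no longer needs to trap page map updates.
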
  • Potential problem:
    • VMM must trap any behavior that requires simulation.
      • Special memory locations (e.g. page maps)? Use page faults.
      • Special instructions? Must trap
    • Pathological case:
      • Instruction that is valid in both user mode and kernel mode
      • But, behaves differently in user mode
      • Example: "read processor status" (where kernel/user mode bit is in the status word)
    • Virtualizable: a machine with no such special cases
    • Until the mid-2000s, very few machines were completely virtualizable (e.g. x86 was not, until Intel VT-x and AMD-V added hardware support)
  • Dynamic binary translation: solution for older machines that are not virtualizable:
    • VMM analyzes all code executed in virtual machine
    • Replaces non-virtualizable instructions with traps
    • Very tricky: how to find all code?
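
A toy sketch of the rewriting step in binary translation, assuming instructions are already decoded into (opcode, operand) pairs. The opcode names are hypothetical. The sketch leaves out the hard part noted above, which is finding all the code in the first place.

```python
# Hypothetical opcodes that behave differently in user mode without trapping.
SENSITIVE = {"READ_STATUS"}


def translate_block(block):
    """Return a copy of the block with sensitive instructions replaced by traps."""
    translated = []
    for opcode, operand in block:
        if opcode in SENSITIVE:
            # Keep the original instruction as the trap's operand so the VMM
            # knows what behavior to simulate.
            translated.append(("TRAP_TO_VMM", (opcode, operand)))
        else:
            translated.append((opcode, operand))   # safe: runs unchanged
    return translated


block = [("LOAD", 1), ("READ_STATUS", None), ("ADD", 2)]
rewritten = translate_block(block)
```
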
  • In practice, how much overhead do VMMs add?
    • CPU-bound applications: < 5%
    • I/O-bound applications: ~30%

History/usage of virtual machines

  • Invented by IBM in late 1960's
  • Original usage:
    • One VM per user
    • Each user ran a different guest OS
    • Single shared hardware platform
  • Interest died out in the 1980's and 1990's:
    • Each user had a private machine
  • Reinvented and made practical by Mendel Rosenblum and graduate students at Stanford, who went on to form VMware.
  • Software development:
    • Need to test software on different OS versions:
      • Keep one VM for each OS version.
      • Use a single machine to test all versions.
  • Datacenters:
    • Problem: many machines, each running a single application
      • Need separate machines for isolation: application crash could bring down the entire machine
      • Most applications only need a fraction of machine's resources.
    • Solution: datacenter consolidation
      • One VM per application
      • Run several VM's on a single machine
      • Reduce # of machines
  • Encapsulation, restart:
    • VMM can encapsulate entire state of a VM in a file.
    • Can save, continue, restore old state.
    • Datacenter example:
      • Can migrate VM's between machines to balance load
    • Software development:
      • Tests may corrupt the state of the machine
      • Solution:
        • Run tests in a VM
        • Always start tests from a saved VM configuration
        • Discard VM state after tests
        • Results: reproducible tests
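
The save/restore workflow for tests can be sketched with an in-memory stand-in for a VM. The VM class and its fields are hypothetical; a real VMM would write the state to a file instead of keeping a Python copy.

```python
import copy


class VM:
    def __init__(self):
        self.disk = {"config": "clean"}   # stands in for the virtual disk
        self.memory = {}                  # stands in for guest memory

    def snapshot(self):
        """Encapsulate the entire VM state in one object."""
        return copy.deepcopy(self.__dict__)

    def restore(self, saved):
        """Discard the current state and return to the saved one."""
        self.__dict__ = copy.deepcopy(saved)


def run_test(vm):
    vm.disk["config"] = "corrupted by test"   # tests may damage the machine
    vm.memory["scratch"] = 123


vm = VM()
baseline = vm.snapshot()
run_test(vm)
state_after_test = copy.deepcopy(vm.disk)
vm.restore(baseline)                          # next test starts from the same state
```
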
  • Security: can monitor all communication into and out of VM.
  • Heavily used in cloud computing (e.g. Amazon Web Services, Google Cloud).