Virtualization: One Step Back and Two Steps Forward by Mark Stone
Previous: SETI: The People's Super Computer
While SETI@Home is a brilliant use of otherwise wasted computing resources, it is also an ad hoc solution. It is one piece of software that distributes one very specific application across millions of computers. Suppose you wanted your computer to run dozens, or even hundreds of these distributed applications? How would they be prioritized? How would idle time be determined? Who would play traffic cop inside your computer?
This, of course, is what your computer operating system does. It allocates memory, disk space, and CPU cycles so that all the programs running concurrently on your PC can do so without constantly stumbling over each other. Once applications are distributed across multiple computers, though, you also need the "traffic cop" functions that an operating system performs to be distributed across multiple computers.
The problem is even worse. Not only do we need to distribute applications across multiple computers, but a single computer is really too big a unit to have just a single operating system instance. Today's personal computers have the power of earlier mainframes, and part of the reason they sit idle most of the time is that a single operating system frequently can't assign enough application work to use all the resources available.
These problems — distributing work across multiple computers and segmenting a single computer into multiple work units — are the problems that virtualization is intended to solve. Virtualization is a challenge that many companies and many researchers have tackled, but one of the early innovators in this area was VMware. VMware began operations in 1998 and has grown steadily since, a time frame that parallels SETI@Home's growth. VMware's software creates a virtual machine layer between applications and hardware. In other words, when an application thinks it is talking to a disk drive or network interface, it is actually talking to the virtual machine's imitation of a disk drive or network interface. The virtual machine, in turn, talks to the actual hardware and, acting as a go-between, relays the needed information and communications back to the application.
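The go-between idea can be made concrete with a toy sketch. This is not VMware's implementation — the class names and the block-device model are invented for illustration — but it shows the essential trick: the guest talks to what it believes is a whole disk, while the virtual layer quietly maps those requests onto a slice of the real hardware.

```python
class PhysicalDisk:
    """Stands in for real hardware: a simple block store."""
    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def read(self, block_no):
        return self.blocks[block_no]


class VirtualDisk:
    """The go-between: each guest sees a disk starting at block 0,
    but the virtual machine layer maps it onto a region of the real disk."""
    def __init__(self, physical, offset, size):
        self.physical = physical
        self.offset = offset
        self.size = size

    def write(self, block_no, data):
        if not 0 <= block_no < self.size:
            raise IndexError("guest wrote outside its virtual disk")
        self.physical.write(self.offset + block_no, data)

    def read(self, block_no):
        if not 0 <= block_no < self.size:
            raise IndexError("guest read outside its virtual disk")
        return self.physical.read(self.offset + block_no)


# Two guests share one physical disk without ever seeing each other.
disk = PhysicalDisk(100)
guest_a = VirtualDisk(disk, offset=0, size=50)
guest_b = VirtualDisk(disk, offset=50, size=50)
guest_a.write(0, b"A's data")
guest_b.write(0, b"B's data")
print(guest_a.read(0))  # each guest reads back only its own blocks
```

Note that neither guest knows its real position on the physical disk — which is exactly what later makes it possible to move a guest somewhere else without the guest noticing.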
In some sense this is an inefficient step backwards. By introducing a virtual machine layer as another piece of software, you are forcing the computer to dedicate some resources to running that layer, leaving less computing power available for applications. It turns out, though, that this small inefficiency yields huge benefits.
The operating system itself is a software application installed on top of the virtual machine layer. Operating systems installed in this way are called guest operating systems, and the virtual machine layer enables one computer to host multiple guest operating systems. This is advantageous because the virtual machine layer can allocate hardware resources to different guest operating systems based on current needs. One guest may be writing to disk, and need more disk I/O resources. Simultaneously another guest may be sending a file over the network, and need a lot of network bandwidth. Virtualization enables each guest to get what it needs without competing with the other.
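A minimal sketch of that allocation idea, with invented numbers and names: when the guests' combined demands fit within the hardware's capacity, everyone gets what they ask for; when they don't, a real virtual machine layer would apply some policy — here, simple proportional scaling — to divide the resource.

```python
def allocate(capacity, demands):
    """Give each guest what it asks for, scaled down proportionally
    if total demand exceeds the hardware's capacity."""
    total = sum(demands.values())
    if total <= capacity:
        return dict(demands)
    scale = capacity / total
    return {guest: demand * scale for guest, demand in demands.items()}

# One guest is busy writing to disk, another is sending a file over the
# network; each states its current demand on a shared resource (say,
# I/O bandwidth in MB/s) and the layer divides what the hardware has.
print(allocate(100, {"disk_guest": 30, "net_guest": 20}))   # both fully satisfied
print(allocate(100, {"disk_guest": 80, "net_guest": 120}))  # scaled to fit
```

Real hypervisors use far richer schemes (shares, reservations, limits), but the principle is the same: allocation tracks current need rather than a fixed, static split per guest.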
Furthermore, while computers may have slight hardware variances even when supplied by the same vendor and sitting side by side in the same data center, virtualization makes those differences disappear from the viewpoint of applications, so that all virtual machine instances look identical. This makes it much easier to move applications, or whole operating system instances, from one physical machine to another.
In other words, by making a small sacrifice in efficiency to run virtualization, we can create computer networks that can:
- Easily run multiple operating system instances per computer
- Dynamically allocate computer resources to operating system instances based on need
- More easily migrate operating system or application instances from one computer to another
This is the magic we need to create an environment in which most computers can be active all the time, rather than idle most of the time.
Next: Software as a Service