Saturday, 13 May 2006

Book Review: User Mode Linux

Now that processors are more powerful and memory is cheaper, interest in virtualization technologies has revived, be it a commercial product such as VMware, which does full virtualization, or an open source one such as Xen, which does paravirtualization. But besides these two, there is a very interesting project called User Mode Linux (UML) which has gained prominence in the Linux arena. UML is used to create Linux virtual machines which run within a Linux computer. What makes UML stand apart from the rest of the virtualization technologies is that support for it has been incorporated into the official Linux kernel tree. So anybody who has downloaded the kernel source from the official website can easily compile their own UML-enabled Linux kernel.

UML is the brainchild of Jeff Dike, who is well known within the Linux community. When the person who created a popular piece of software decides to write a book on the subject, the book gains a lot of prominence. So when I came across the book titled "User Mode Linux", authored by the very same Jeff Dike and released in the Bruce Perens' Open Source Series, I couldn't resist laying my hands on it.

This book is a relatively small one, spanning 330 pages and divided into 13 chapters and 2 appendices.

The author starts the narration with a short introduction to UML and how it differs from other virtualization technologies. In this chapter he shares with the readers the trials and tribulations he faced in getting Linus to incorporate the UML patch into the official Linux kernel tree. This chapter also gives a sound idea of how one can use UML to one's advantage in various practical situations.

In the second chapter, titled "A Quick Look at UML", one gets a taste of the inner workings of UML. For example, with the aid of detailed snippets of output and various commands, the author explains why UML is considered to be a process and a kernel at the same time.

What is really interesting about UML is that in most respects it is identical to any normal Linux distribution. A person working inside a UML instance will not be aware that he/she is in fact working in a virtual machine rather than within the host Linux OS. One can do anything in UML that one can accomplish in a normal Linux distribution. This includes tasks such as creating partitions, adding swap space, networking and so on. The third chapter of this really useful book, titled "Exploring UML", takes an in-depth look at carrying out some of these system administration tasks inside UML. The part where the author demonstrates, with the aid of examples, how one can plug any file on the host machine into UML and access it inside the UML instance as a block device truly brings out the flexibility of this virtualization technology.

Running a single UML instance is fine. But what happens when one needs to run multiple UML instances? Is it possible to share a filesystem simultaneously between multiple UML instances? Normally, if you try to share a filesystem between multiple UML instances, you run the risk of corrupting the filesystem. This problem is alleviated with the use of what are known as Copy On Write (COW) files. The fourth chapter, titled "A second UML instance", pursues this topic in greater detail. The author explains the concept of COW files and how one can use them to share a filesystem between multiple UML instances, thus considerably reducing memory and disk utilization.
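
To make the COW idea concrete, here is a toy sketch in Python. It is not how UML implements COW files, and the class and variable names are my own; it only shows the principle that every instance reads from one shared, unmodified image and keeps its own changes in a private overlay.

```python
class CowDisk:
    """Toy copy-on-write overlay over a shared, read-only sequence of blocks."""

    def __init__(self, backing):
        self.backing = backing   # shared between instances; never modified
        self.changed = {}        # private overlay: block index -> new data

    def _check(self, index):
        if not 0 <= index < len(self.backing):
            raise IndexError("block index out of range")

    def read(self, index):
        self._check(index)
        # Prefer the private copy; fall back to the shared backing block.
        return self.changed.get(index, self.backing[index])

    def write(self, index, data):
        self._check(index)
        # Only the overlay is touched, so other instances are unaffected.
        self.changed[index] = data


# Hypothetical three-block "filesystem image" shared by two instances.
backing = (b"boot", b"etc", b"home")
uml1 = CowDisk(backing)
uml2 = CowDisk(backing)

uml1.write(1, b"etc-modified-by-uml1")
print(uml1.read(1))   # the first instance sees its own change
print(uml2.read(1))   # the second instance still sees the original block
```

Because each overlay stores only the blocks that were actually written, the disk space used per instance stays proportional to its changes rather than to the size of the whole image.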

This book, which explains a niche subject, takes a hands-on practical approach which lays stress on getting things done rather than being a mere theoretical discourse. For example, in the fifth chapter, titled "Playing with a UML Instance", the author explains how to connect a tar archive residing on the host machine to a UML instance and access the archive from within it. He goes on to explain the basic steps needed to get networking going between the host machine and the UML instance. This chapter only scratches the surface of networking, as the advanced networking concepts have been given a dedicated chapter of their own.

The next chapter, titled "UML File Management", describes two ways of mounting a directory on the host machine as a UML directory. This is achieved via the virtual filesystems hostfs and humfs. The uniqueness of these virtual filesystems is that their data is not stored within a UML block device; rather, the files live in a directory on the host and the UML kernel accesses them there. In this chapter, the author goes into a detailed analysis of the hostfs and humfs virtual filesystems and explains how one can mount a directory on the host system into a UML instance using either of the methods.

One of the advantages of UML is that in all respects, it works as a complete Linux operating system, so one can do all the things in UML that one can achieve in a normal Linux distribution. UML is particularly strong in the networking area, as one can interconnect two or more UML instances to create a virtual network inside one's machine. But one cannot set up networking in UML the same way as one does in a normal Linux distribution. To enable networking in UML, one first has to configure a TUN/TAP device which forms an interface between the UML instance and the host machine. Enabling IP forwarding and routing on the host machine is also part of the job of network-enabling a UML instance. On this note, the seventh chapter, titled "UML Networking in Depth", is a very important chapter in this book, as it takes a step-by-step approach in explaining all these concepts to the reader. By going through this chapter, one acquires deep knowledge of various networking concepts like bridging and switching, and of various transports like TUN/TAP, Ethertap, SLIP, SLIRP and multicast. At the end of the chapter, the author ties up all the loose ends by giving a complete example of setting up a multicast network of UML virtual machines connected to form three two-node networks in which one UML instance acts as a switch.
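
The host-side steps described above (create a TUN/TAP device, give it an address, enable IP forwarding, add a route to the UML instance) can be sketched roughly as follows. This is my own illustration rather than the book's procedure: it uses the modern iproute2 and sysctl commands, the device name, user name and addresses are placeholders, and the script only prints the commands instead of running them, since they need root privileges.

```python
def host_tuntap_commands(tap, user, host_ip, uml_ip):
    """Return host-side commands (as argv lists) for one TUN/TAP link."""
    return [
        # Create a persistent TAP device owned by the unprivileged user.
        ["ip", "tuntap", "add", "dev", tap, "mode", "tap", "user", user],
        # Give the host end of the link an address and bring it up.
        ["ip", "addr", "add", host_ip, "dev", tap],
        ["ip", "link", "set", tap, "up"],
        # Let the host forward packets on behalf of the UML instance.
        ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        # Route traffic for the UML instance's address through the TAP device.
        ["ip", "route", "add", uml_ip, "dev", tap],
    ]


def uml_eth_option(tap, index=0):
    """UML command-line switch attaching ethN to a preconfigured TAP device."""
    return f"eth{index}=tuntap,{tap}"


# Placeholder values, chosen only for illustration.
for argv in host_tuntap_commands("tap0", "alice", "192.168.0.254", "192.168.0.253"):
    print(" ".join(argv))
print(uml_eth_option("tap0"))
```

Inside the UML instance, the matching interface would still have to be configured with the address the host routes to (192.168.0.253 in this sketch).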

There is a suite of tools available which help a system or network administrator control a UML instance from outside it. Using these tools, one can effectively control the system resources available to a particular UML instance. For example, one can allocate, say, 256MB of memory to one UML instance and just 64MB to another, all in real time. The chapter titled "Managing UML Instances from the Host" covers the full suite of UML management tools. In particular, one tool called uml_mconsole is explained in depth in this chapter.

Configuring and running UML instances in a non-production setup is fine. But when it comes to using UML on production servers, other aspects also have to be considered. In the next two chapters, the author discusses the various issues, including security ones, that have to be considered and rectified before users are allowed access to UML on the server. Traditionally, UML has had two modes of operation: one for unmodified hosts, called the "tt" mode, and a second one, called "skas" mode, for hosts that have been patched with what is known as the "skas" patch. Skas stands for Separate Kernel Address Space. Recently a third mode has been added that provides the same security as "skas", plus some of the performance benefits, on unmodified hosts. This third mode of operation is named skas0. All three modes of operation are covered in detail in the 9th chapter, titled "Host Setup for a Small UML Server". In this chapter the author also shares his views on topics as diverse as managing long-lived UML instances, networking in a server environment, the UML and host memory requirements and so on. In the 10th chapter, titled "Large UML Server Management", one gets to know the security issues faced when allowing UML access on high-traffic servers and the steps needed to overcome them.

Up to this point, the author explains things assuming that the reader is using a pre-compiled UML-enabled Linux kernel. But in the next chapter, titled "Compiling UML from Source", one gets to know which kernel configuration options need to be enabled to compile a UML kernel. Since the UML patch is already merged into the official Linux kernel tree, compiling a UML kernel is as simple as enabling the designated kernel configuration options. This chapter holds the reader's hand all the way from downloading the kernel source, to setting the kernel options, to the actual compilation. By the end of the chapter, the reader will have built his or her own UML kernel.

The 12th chapter covers a rather specialized topic and is aptly titled "Specialized UML configurations". Here one gets to know how UML can be used to explore the software limitations of one's machine, such as the hard limits in the Linux networking subsystem and the performance of large-memory UML instances, as well as how to set up a small UML cluster using Oracle's OCFS2.

In the final chapter of this rather well written book, the author shares with the reader the future road map of UML and the technologies that could influence its evolution.

The book also has two appendices, which list all the command line options that can be used while booting a UML instance as well as the various UML utilities that are used on the host side to control UML.

About the Author
Jeff Dike, the author and maintainer of User Mode Linux, is well known throughout the Linux community. He is currently working as an engineer at Intel. He has been active in Linux kernel development for more than five years. He holds a degree in Computer Science and Engineering from MIT.

Book Specifications
Name : User Mode Linux
ISBN No: 0-13-186505-6
Author : Jeff Dike
No of Pages : 330
Publisher : Prentice Hall
Price : Check at Amazon.com
Rating : Excellent

End Note
This is clearly a do-it-yourself book on UML, with step-by-step instructions for accomplishing each task. But that does not mean that the theoretical aspects of UML have been ignored. Rather, there is the right balance between theory and practice. And the fact that this book has been authored by the very same person who created UML lends it a lot of credibility. This book should meet all the requirements of a person who is interested in this niche area and wishes to gain more knowledge about UML and its workings.
