The Book of Xen - Part 9
    print "xm save $domain $destdir/${domain}.xmsave\n";
    system("xm save $domain $destdir/${domain}.xmsave");

    # Copy file-backed storage while the domain is saved.
    foreach (@files) {
        print "copying $_\n";
        system("cp $_ ${destdir}");
    }

    # Snapshot each LV so the domain can be restored right away.
    foreach $lv (@lvs) {
        system("lvcreate --size 1024m --snapshot --name ${lv}_snap $lv");
    }

    system("xm restore $destdir/${domain}.xmsave && gzip $destdir/${domain}.xmsave");

    # Back up each snapshot, then drop it.
    foreach $lv (@lvs) {
        $lvfile = $lv;
        $lvfile =~ s/\//_/g;
        print "backing up $lv\n";
        system("dd if=${lv}_snap | gzip -c > $destdir/${lvfile}.gz");
        system("lvremove ${lv}_snap");
    }

Save it as, say, /usr/sbin/backup_domains.sh and tell cron to execute the script at appropriate intervals.

This script works by saving each domain, copying file-based storage, and snapshotting LVs. When that's accomplished, it restores the domain, backs up the save file, and backs up the snapshots via dd.

Note that users will see a brief hiccup in service while the domain is paused and snapshotted. We measured downtime of less than three minutes to get a consistent backup of a domain with a gigabyte of RAM, well within acceptable parameters for most applications. However, doing a bit-for-bit copy of an entire disk may also degrade performance somewhat.[42] We suggest doing backups at off-peak hours.
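
As a rough illustration of running the backup from cron, here is a minimal wrapper sketch. It assumes the backup script takes one domain name as its argument and lives at /usr/sbin/backup_domains.sh, as suggested above; the function name backup_all and the BACKUP_CMD variable are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Hypothetical wrapper around the per-domain backup script.
# BACKUP_CMD is an assumption: override it if the script lives elsewhere.
BACKUP_CMD="${BACKUP_CMD:-/usr/sbin/backup_domains.sh}"

# Back up every domain named on the command line.
# Keeps going after a failure, but returns 1 if any backup failed.
backup_all() {
    failed=0
    for dom in "$@"; do
        if ! "$BACKUP_CMD" "$dom"; then
            echo "backup of $dom failed" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

A script that ends with a call such as backup_all "$@" could then be scheduled off-peak with a crontab entry along the lines of "30 3 * * * /usr/local/sbin/backup_all.sh horatio lsc", where the path and schedule are illustrative only.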

To view other scripts in use at prgmr.com, go to http://book.xen.prgmr.com/.

[42] Humorous understatement.

Remote Access to the DomU

The story on normal access for VPS users is deceptively simple: The Xen VM is exactly like a normal machine at the colocation facility. They can SSH into it (or, if you're providing Windows, rdesktop). However, when problems come up, the user is going to need some way of accessing the machine at a lower level, as if they were sitting at their VPS's console.

For that, we provide a console server that they can SSH into. The easiest thing to do is to use the dom0 as their console server and sharply limit their accounts.

Note: Analogously, we feel that any colocated machine should have a serial console attached to it.[43] We discuss our reasoning and the specifics of using Xen with a serial console in Chapter 14.

An Emulated Serial Console

Xen already provides basic serial console functionality via xm. You can access a guest's console by typing xm console <domain> within the dom0. Issue commands, then type ctrl-] to exit from the serial console when you're done.

The problem with this approach is that xm has to run from the dom0 with effective UID 0. While this is reasonable enough in an environment with trusted domU administrators, it's not a great idea when you're giving an account to anyone with $5. Dealing with untrusted domU admins, as in a VPS hosting situation, requires some additional work to limit access using ssh and sudo.

First, configure sudo. Edit /etc/sudoers and append, for each user:

    <username> ALL=NOPASSWD:/usr/sbin/xm console <domain>

Next, for each user, we create a ~/.ssh/authorized_keys file like this:

    no-agent-forwarding,no-X11-forwarding,no-port-forwarding,command="sudo xm console <domain>" ssh-rsa <key> [comment]

This line allows the user to log in with his key. Once he's logged in, sshd connects to the named domain console and automatically presents it to him, thus keeping domU administrators out of the dom0. Also, note the options that start with no. They're important. We're not in the business of providing shell accounts.
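
When there are many customers, the two per-user lines can be generated rather than typed. The sketch below assumes each sudoers entry starts with the user name and ends with the domain name, and that the forced command names the same domain; the function names sudoers_line and authorized_keys_line are hypothetical, and the output is only printed, never written to /etc/sudoers or to a user's home directory.

```shell
#!/bin/sh
# Hypothetical helpers that only print configuration lines.

# Print a sudoers entry letting one user attach to one domain's console.
# $1 = user name, $2 = domain name
sudoers_line() {
    printf '%s ALL=NOPASSWD:/usr/sbin/xm console %s\n' "$1" "$2"
}

# Print an authorized_keys entry that forces the console command.
# $1 = domain name, $2 = public key text (type, key, optional comment)
authorized_keys_line() {
    printf 'no-agent-forwarding,no-X11-forwarding,no-port-forwarding,command="sudo xm console %s" %s\n' "$1" "$2"
}
```

For example, sudoers_line alice alice_vm prints a line to review before adding it with visudo, and authorized_keys_line alice_vm "$(cat alice.pub)" prints the matching key entry (alice, alice_vm, and alice.pub are made-up names).
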
This is purely a console server; we want people to use their domUs rather than the dom0 for standard SSH stuff. These settings will allow users to access their domains' consoles via SSH in a way that keeps their access to the dom0 at a minimum.

A Menu for the Users

Of course, letting each user access his console is really just the beginning. By changing the command field in authorized_keys to a custom script, we can provide a menu with a startling array of features!

Here's a sample script that we call xencontrol. Put it somewhere in the filesystem (say, /usr/bin/xencontrol) and then set the line in authorized_keys to call xencontrol rather than xm console.

    #!/bin/bash
    DOM="$1"
    cat << EOF
    `sudo /usr/sbin/xm list $DOM`

    Options for $DOM
      1. console
      2. create/start
      3. shutdown
      4. destroy/hard shutdown
      5. reboot
      6. exit
    EOF
    printf "> "
    read X
    case "$X" in
        *1*) sudo /usr/sbin/xm console "$DOM" ;;
        *2*) sudo /usr/sbin/xm create -c "$DOM" ;;
        *3*) sudo /usr/sbin/xm shutdown "$DOM" ;;
        *4*) sudo /usr/sbin/xm destroy "$DOM" ;;
        *5*) sudo /usr/sbin/xm reboot "$DOM" ;;
    esac

When the user logs in via SSH, the SSH daemon runs this script in place of the user's login shell (which we recommend setting to /bin/false or its equivalent on your platform). The script then echoes some status information, an informative message, and a list of options. When the user enters a number, it runs the appropriate command (which we've allowed the user to run by configuring sudo).

[43] Our experience with other remote console tools has, overall, been unpleasant. Serial redirection systems work quite well.
IP KVMs are barely preferable to toggling in the code on the front panel. On a good day.

PyGRUB, a Bootloader for DomUs

Up until now, the configurations that we've described, by and large, have specified the domU's boot configuration in the config file, using the kernel, ramdisk, and extra lines. However, there is an alternative method, which specifies a bootloader line in the config file and in turn uses that to load a kernel from the domU's filesystem.

The bootloader most commonly used is PyGRUB, or Python GRUB. The best way to explain PyGRUB is probably to step back and examine the program it's based on, GRUB, the GRand Unified Bootloader. GRUB itself is a traditional bootloader: a program that sits in a location on the hard drive where the BIOS can load and execute it, which then itself loads and executes a kernel.

PyGRUB, therefore, is like GRUB for a domU. The Xen domain builder usually loads an OS kernel directly from the dom0 filesystem when the virtual machine is started (therefore acting like a bootloader itself). Instead, it can load PyGRUB, which then acts as a bootloader and loads the kernel from the domU filesystem.[44]

PyGRUB is useful because it allows a more perfect separation between the administrative duties of the dom0 and the domU. When virtualizing the data center, you want to hand off virtual hardware to the customer. PyGRUB more effectively virtualizes the hardware.
In particular, this means the customer can change his own kernel without the intervention of the dom0 administrator.

Note: PyGRUB has been mentioned as a possible security risk because it reads an untrusted filesystem directly from the dom0. PV-GRUB (see "PV-GRUB: A Safer Alternative to PyGRUB?"), which loads a trusted paravirtualized kernel from the dom0 then uses that to load and jump to the domU kernel, should improve this situation.

PV-GRUB: A SAFER ALTERNATIVE TO PYGRUB?

PV-GRUB is an excellent reason to upgrade to Xen 3.3. The problem with PyGRUB is that while it's a good simulation of a bootloader, it has to mount the domU partition in the dom0, and it interacts with the domU filesystem. This has led to at least one remote-execution exploit. PV-GRUB avoids the problem by loading an executable that is, quite literally, a paravirtualized version of the GRUB bootloader, which then runs entirely within the domU.

This also has some other advantages. You can actually load the PV-GRUB binary from within the domU, meaning that you can load your first menu.lst from a read-only partition and have it fall through to a user partition, which then means that unlike my PyGRUB setup, users can never mess up their menu.lst to the point where they can't get into their rescue image.

Note that Xen creates a domain in either 32- or 64-bit mode, and it can't switch later on.
This means that a 64-bit PV-GRUB can't load 32-bit Linux kernels, and vice versa.

Our PV-GRUB setup at prgmr.com starts with a normal xm config file, but with no bootloader and a kernel= line that points to PV-GRUB, instead of the domU kernel.

    kernel = "/usr/lib/xen/boot/pv-grub-x86_64.gz"
    extra = "(hd0,0)/boot/grub/menu.lst"
    disk = ['phy:/dev/denmark/horatio,xvda,w','phy:/dev/denmark/rescue,xvde,r']

Note that we call the architecture-specific binary for PV-GRUB. The 32-bit (PAE) version is pv-grub-x86_32.

This is enough to load a regular menu.lst, but what about this indestructible rescue image of which I spoke? Here's how we do it on the new prgmr.com Xen 3.3 servers. In the xm config file:

    kernel = "/usr/lib/xen/boot/pv-grub-x86_64.gz"
    extra = "(hd1,0)/boot/grub/menu.lst"
    disk = ['phy:/dev/denmark/horatio,xvda,w','phy:/dev/denmark/rescue,xvde,r']

Then, in /boot/grub/menu.lst on the rescue disk:

    default=0
    timeout=5

    title Xen domain boot
        root (hd1)
        kernel /boot/pv-grub-x86_64.gz (hd0,0)/boot/grub/menu.lst

    title CentOS-rescue (2.6.18-53.1.14.el5xen)
        root (hd1)
        kernel /boot/vmlinuz-2.6.18-53.1.14.el5xen ro root=LABEL=RESCUE
        initrd /boot/initrd-2.6.18-53.1.14.el5xen.img

    title CentOS installer
        root (hd1)
        kernel /boot/centos-5.1-installer-vmlinuz
        initrd /boot/centos-5.1-installer-initrd.img

    title NetBSD installer
        root (hd1)
        kernel /boot/netbsd-INSTALL_XEN3_DOMU.gz

The first entry is the normal boot, with 64-bit PV-GRUB. The rest are various types of rescue and install boots. Note that we specify (hd1) for the rescue entries; in this case, the second disk is the rescue disk.
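
Because a 64-bit PV-GRUB can't load 32-bit kernels (and vice versa), the kernel= line has to match the guest's architecture. The following is a minimal sketch of choosing it in a provisioning script. The function names are hypothetical, and the 32-bit path is an assumption: the text above names the binary pv-grub-x86_32 but not its directory or .gz suffix, so check where your Xen packages install it.

```shell
#!/bin/sh
# Hypothetical helpers for emitting PV-GRUB lines for an xm config file.

# Print the PV-GRUB binary matching the guest architecture.
# The x86_32 path is assumed by analogy with the x86_64 one.
pvgrub_kernel() {
    case "$1" in
        x86_64)      echo "/usr/lib/xen/boot/pv-grub-x86_64.gz" ;;
        x86_32|i?86) echo "/usr/lib/xen/boot/pv-grub-x86_32.gz" ;;
        *)           echo "unknown architecture: $1" >&2; return 1 ;;
    esac
}

# Print the kernel= and extra= lines.
# $1 = architecture, $2 = GRUB device holding the first menu.lst, e.g. "(hd1,0)"
pvgrub_config_lines() {
    k=$(pvgrub_kernel "$1") || return 1
    printf 'kernel = "%s"\nextra = "%s/boot/grub/menu.lst"\n' "$k" "$2"
}
```

Calling pvgrub_config_lines x86_64 "(hd1,0)" prints the two lines used in the rescue-first setup described above; the disk= line still has to be written per customer.
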
The normal boot loads PV-GRUB and the user's /boot/grub/menu.lst from (hd0,0). Our default user-editable menu.lst looks like this:

    default=0
    timeout=5

    title CentOS (2.6.18-92.1.6.el5xen)
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-92.1.6.el5xen console=xvc0 root=LABEL=PRGMRDISK1 ro
        initrd /boot/initrd-2.6.18-92.1.6.el5xen.img

PV-GRUB only runs on Xen 3.3 and above, and it seems that Red Hat has no plans to backport PV-GRUB to the version of Xen that is used by RHEL 5.x.

Making PyGRUB Work

The domain's filesystem will need to include a /boot directory with the appropriate files, just like a regular GRUB setup. We usually make a separate block device for /boot, which we present to the domU as the first disk entry in its config file.

To try PyGRUB, add a bootloader= line to the domU config file:

    bootloader = "/usr/bin/pygrub"

Of course, this being Xen, it may not be as simple as that. If you're using Debian, make sure that you have libgrub, e2fslibs-dev, and reiserfslibs-dev installed. (Red Hat Enterprise Linux and related distros use PyGRUB with their default Xen setup, and they include the necessary libraries with the Xen packages.)

Even with these libraries installed, it may fail to work without some manual intervention. Older versions of PyGRUB expect the virtual disk to have a partition table rather than a raw filesystem.
If you have trouble, this may be the culprit.

With modern versions of PyGRUB, it is unnecessary to have a partition table on the domU's virtual disk.

Self-Support with PyGRUB

At prgmr.com, we give domU administrators the ability to repair and customize their own systems, which also saves us a lot of effort installing and supporting different distros. To accomplish this, we use PyGRUB and see to it that every customer has a bootable read-only rescue image they can boot into if their OS install goes awry. The domain config file for a customer who doesn't want us to do mirroring looks something like the following.

    bootloader = "/usr/bin/pygrub"
    memory = 512
    name = "lsc"
    vif = [ 'vifname=lsc,ip=38.99.2.47,mac=aa:00:00:50:20:2f,bridge=xenbr0' ]
    disk = [
        'phy:/dev/verona/lsc_boot,sda,w',
        'phy:/dev/verona_left/lsc,sdb,w',
        'phy:/dev/verona_right/lsc,sdc,w',
        'file://var/images/centos_ro_rescue.img,sdd,r'
    ]

Note that we're now exporting four disks to the virtual host: a /boot partition on virtual sda, reserved for PyGRUB; two disks for user data, sdb and sdc; and a read-only CentOS install as sdd.

A sufficiently technical user, with this setup and console access, needs almost no help from the dom0 administrator. He or she can change the operating system, boot a custom kernel, set up a software RAID, and boot the CentOS install to fix the setup if anything goes wrong.

Setting Up the DomU for PyGRUB

The only other important bit to make this work is a valid /grub/menu.lst, which looks remarkably like the menu.lst in a regular Linux install.
Our default looks like this and is stored on the disk exported as sda:

    default=0
    timeout=15

    title centos
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-53.1.6.el5xen console=xvc0 root=/dev/sdb ro
        initrd /boot/initrd-2.6.18-53.1.6.el5xen.XenU.img

    title generic kernels
        root (hd0,0)
        kernel /boot/vmlinuz-2.6-xen root=/dev/sdb
        module /boot/initrd-2.6-xen

    title rescue-disk
        root (hd0,0)
        kernel /boot/vmlinuz-2.6.18-53.1.6.el5xen console=xvc0 root=LABEL=RESCUE ro
        initrd /boot/initrd-2.6.18-53.1.6.el5xen.XenU.img

Note: /boot/grub/menu.lst is frequently symlinked to either /boot/grub/grub.conf or /etc/grub.conf. /boot/grub/menu.lst is still the file that matters.

As with native Linux, if you use a separate partition for /boot, you'll need to either make a symlink at the root of /boot that points boot back to . or make your kernel names relative to /boot.

Here, the first and default entry is the CentOS distro kernel. The second entry is a generic Xen kernel, and the third choice is a read-only rescue image. Just like with native Linux, you can also specify devices by label rather than disk number.

WORKING WITH PARTITIONS ON VIRTUAL DISKS

In a standard configuration, partition 1 may be /boot, with partition 2 as /. In that case, partition 1 would have the configuration files and kernels in the same format as for normal GRUB.

It's straightforward to create these partitions on an LVM device using fdisk. Doing so for a file is a bit harder.
First, attach the file to a loop, using losetup:

    # losetup /dev/loop1 claudius.img

Then create two partitions in the usual way, using your favorite partition editor:

    # fdisk /dev/loop1

Then, whether you're using an LVM device or loop file, use kpartx to create device nodes from the partition table in that device:

    # kpartx -av /dev/loop1

Device nodes will be created under /dev/mapper in the format devnamep#. Make a filesystem of your preferred type on the new partitions, then mount them (the root filesystem is freshly created, so the /boot mount point has to be made first):

    # mke2fs /dev/mapper/loop1p1
    # mke2fs -j /dev/mapper/loop1p2
    # mount /dev/mapper/loop1p2 /mnt
    # mkdir /mnt/boot
    # mount /dev/mapper/loop1p1 /mnt/boot

Copy your filesystem image into /mnt, make sure valid GRUB support files are in /mnt/boot (just like a regular GRUB setup), and you are done.

[44] This is an oversimplification. What actually happens is that PyGRUB copies a kernel from the domU filesystem, puts it in /tmp, and then writes an appropriate domain config so that the domain builder can do its job. But the distinction is usually unimportant, so we've opted to approach PyGRUB as the bootloader it pretends to be.

Wrap-Up

This chapter discussed things that we've learned from our years of relying on Xen. Mostly, that relates to how to partition and allocate resources between independent, uncooperative virtual machines, with a particular slant toward VPS hosting. We've described why you might host VPSs on Xen; specific allocation issues for CPU, disk, memory, and network access; backup methods; and letting customers perform self-service with scripts and PyGRUB.

Note that there's some overlap between this chapter and some of the others.
For example, we mention a bit about network configuration, but we go into far more detail on networking in Chapter 5, Networking. We describe xm save in the context of backups, but we talk a good deal more about it and how it relates to migration in Chapter 9.

Xen hosting's been a lot of fun. It hasn't made us rich, but it's presented a bunch of challenges and given us a chance to do some neat stuff.

Chapter 8. BEYOND LINUX: USING XEN WITH OTHER UNIX-LIKE OSS

One major benefit of paravirtualization which we've thus far ignored is the ability to run multiple operating systems on a single paravirtualized physical machine. Although Linux is the most popular OS to run under Xen, it's not the only option available. Several other Unix-like OSs can run as a dom0, and rather more have been modified to run as paravirtualized domUs.

Apart from Linux, only Solaris and NetBSD are capable of functioning as a dom0 with current versions of Xen. Some work has been done with the other BSDs and with Plan 9, but these OSs either can only work as a domU or can only work with older Xen versions. Support is evolving rapidly, however. (FreeBSD seems especially close to having functional Xen bits.)

In this chapter, we'll focus on Solaris and NetBSD. Partially this is because they have mature Xen support, with active community involvement and ongoing development. Most importantly, though, it's because we have run them in production. In a later chapter, we'll discuss Windows.

Solaris

Sun has been pushing Xen virtualization heavily in recent community releases of OpenSolaris, and their effort shows. Solaris works well as both a dom0 and a domU, with closely integrated Xen support. The only caveat is that, as of this writing, OpenSolaris does not support Xen 3.3 and paravirt_ops domUs.

Note: Sun doesn't actually call their shipping version of Xen "Xen."
They use the term xVM for marketing purposes, and include the unrelated VirtualBox under the xVM label. We're going to continue to call it Xen, however, because it's the name we're used to.

Only the x86 version of Solaris supports Xen; Solaris/SPARC uses alternate virtualization technologies.

VIRTUALIZATION WITH SOLARIS

Sun, being traditionally a "medium iron" company, has emphasized virtualization for a long time, with a few different, complementary technologies to implement virtualization at different levels. Here's an overview of their non-Xen virtualization offerings.

On new UltraSparc Niagara-based systems, pure hardware virtualization is provided by means of Logical Domains, or LDoms. These are a successor to the Dynamic System Domains found on earlier Sun Enterprise platforms, which allowed you to devote CPU and memory boards to independent OS instances. Similarly, on a reasonably new SPARC box, you can partition the CPU and memory to run multiple, independent operating systems, using the processor's hardware virtualization support.

On x86, Sun addresses full virtualization by way of their VirtualBox product. VirtualBox executes guest code directly where possible and emulates when necessary, much like VMware.

Finally, Sun addresses OS-level virtualization through Solaris Zones,[45] which are themselves an interesting, lightweight virtualization option.
Like other OS-level virtualization platforms, Zones provide a fair amount of separation between operating environments with very little overhead.

Sun even offers the option to run Linux binaries under Solaris on x86_64, via lx branded Zones. (These lx branded Zones provide a thin compatibility layer between the Solaris kernel and Linux userspace. Pretty cool.) However, the Linux emulation isn't perfect. For example, since lx branded Zones use the same Solaris kernel that's running on the actual hardware, you can't load Linux device drivers.

Getting Started with Solaris

To run Solaris under Xen, you'll need to get a copy of Solaris. There are several versions, so make sure that you pick the right one.

You do not want Solaris 10, which is the current Sun version of Solaris. Although it's a fine OS, it doesn't have Xen support because its development lags substantially behind the bleeding edge. (In this it caters to its market segment. We are personally acquainted with people who are running Solaris 8, a welcome contrast to the prevailing Linux view that software more than six months old is some sort of historical curiosity.)

Fortunately, Solaris 10 isn't the only option. Solaris Express acts as a preview of the next official Solaris version, and it's a perfectly capable OS for Xen in its own right. It incorporates Xen, but is still a bit behind the latest development.
It's also not as popular as OpenSolaris.

Finally, there's OpenSolaris. Sun released huge tracts of Solaris source code a while ago under the Common Development and Distribution License[46] (CDDL), and the community's been pounding on it ever since. OpenSolaris is the result: it's much like Sun's release of Solaris but with new technology and a much faster release cycle. Think of the relationship between the two as like Red Hat Enterprise Linux and Fedora, only more so.