Why did I switch?
Quite simply, I had two problems with my existing Xen setup, neither of which is the fault of Xen. I want to make that clear: I love Xen, I love the concept, it's a great system.
But I'm also an Ubuntu guy, and (problem #1) frankly, Xen is one of those things Canonical has never taken seriously. Getting a Xen-compatible kernel usually means downloading it from somewhere obscure, and the repositories are far from ideal. You can't, in an 8.04 VM, upgrade to a more recent Ubuntu simply by using the usual upgrade tools. And I needed to upgrade. Everything I was running, from Ubuntu Hardy (8.04) to the version of Xen, was old, had little support, and didn't really work the way I wanted. In fact, bugs in the versions of Xen and Linux I was running meant that if any VM had to do a lot of disk activity, the chances were that one or more VMs would crash.
The second issue was my CPU. It's a 64-bit Intel contraption, but unfortunately it doesn't have hardware virtualization support. This is fine for Xen, if you have operating system support (but I'm running Ubuntu, so I don't), but it's not fine for any of Ubuntu's supported virtualization platforms. KVM, as provided by Ubuntu, requires hardware virtualization support in the CPU. Other options such as VirtualBox likewise require CPU support. I still find it ludicrous that Canonical (and, to be fair, the organizations Canonical relies upon) decided to support KVM over Xen when Xen is clearly more efficient and has a much better architecture for this kind of thing. And yes, I know that KVM pays lip service to paravirtualization, but in practice you can't use it.
This doesn't leave many options without spending money, and right now I didn't want to do that if it was avoidable. That meant looking at other technologies supported (enough) by Ubuntu, and frankly, there aren't many. After reading about OpenVZ (a common technology deployed by VPS vendors), I decided to give LXC a try.
LXC is essentially a "supported" version of OpenVZ - the latter requiring kernel patches, just like Xen. LXC is pseudo-virtualization. Rather than actually emulate a full computer (virtualization) or provide an infrastructure for multiple operating systems to share a computer and allocate resources (para-virtualization), LXC's approach is to have a single operating system kernel run multiple operating system userlands.
LXC's approach is interesting. Services provided by the kernel - file systems, networking, process scheduling, memory allocation, etc. - exist once. The kernel hides (or tries to hide, see later) anything that doesn't belong to a process's userland. chroot is used to provide a completely distinct part of the core file system (of course, the administrator can still give a "VM" - called a container - a disk or partition of its own by mounting a disk and chrooting to it). Each container is given certain rights, such as the devices it can access and memory and disk usage quotas (which can be unlimited).
This approach leads to advantages and disadvantages. The primary advantage is efficiency. If there's one kernel running, there's no need to have a layer arbitrating between competing systems or, worse, emulating hardware so that operating systems "think" they have the run of a system. Better still, resources not in use by one environment aren't wasted as they might be in a virtualization or paravirtualization system if the latter has no specific strategy to handle them.
The major disadvantages are:
- Each container's "operating system" must support the provided kernel. In practice, this just means "run something recent, and don't try running an operating system that has wild requirements." The standard Ubuntu kernel is able to host all the major distributions: other Ubuntus (including older versions, see below), CentOS, etc. And yes, you can have different containers run different operating systems. It's just that they all have to run Linux.
- LXC is unfinished. As an example, go into a container and type "ls /sys/class/net" and compare it to the output of ifconfig -a in both the container and the "host" system. Both sysfs and procfs have problems with containers, and in some cases, there are actually real security holes - as in you can have a container execute a local script in the host environment. Also there are other little things that don't work, like rebooting or halting containers using the reboot and halt commands.
- LXC doesn't have the more advanced features offered by virtualizers. For example, you can't take snapshots or migrate running VMs from one computer to another.
The latter is considered a major issue by LXC's developers and is being worked on, but it takes time.
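The /sys/class/net comparison mentioned in the list above can be scripted. This is only a minimal sketch, not an LXC tool: the helper name only_in_first is made up for illustration, and it assumes a POSIX shell with sort, comm and mktemp available.

```shell
#!/bin/sh
# only_in_first A B: print names that appear in list A but not in list B.
# Both arguments are newline-separated lists of interface names.
# (Hypothetical helper name, for illustration only.)
only_in_first() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    printf '%s\n' "$1" | sort -u > "$a"
    printf '%s\n' "$2" | sort -u > "$b"
    comm -23 "$a" "$b"   # keep only lines unique to the first file
    rm -f "$a" "$b"
}

# Inside a container, something like the following lists interfaces that
# sysfs exposes but ifconfig does not show. It is left commented out so
# the snippet has no side effects when sourced:
#   sysfs=$(ls /sys/class/net)
#   shown=$(ifconfig -a | awk '/^[^ \t]/ { sub(/:$/, "", $1); print $1 }')
#   only_in_first "$sysfs" "$shown"
```

Any name it prints is an interface the container can see through sysfs even though it doesn't belong to that container.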
The concept behind LXC isn't new, by the way. LXC is Linux's answer to BSD's jails system, and jails is often seen as a "fixed" version of chroot, a technology that appeared in Unix a long, long time ago.
Migrating to LXC
I'm bothered by the security aspects of LXC, but for the most part I'm OK using the system, at least in my own environment. There's not a lot worth hacking about my own computer network. Still, I'm looking forward to LXC being finished.
To set up LXC in an Ubuntu 10.04 environment, this is what I did:
1. Installed the latest version
LXC doesn't actually work in the official Ubuntu 10.04 release. You heard that right. It ships with a major bug that causes problems starting up a container if you have multiple volumes mounted. As my /boot is on another partition (2 TB drive on a BIOS that doesn't support disks that big), mine failed every time with an error about not being able to unmount the root file system.
So the first thing to do is add a third party repository that provides a more recent LXC:
# add-apt-repository ppa:ubuntu-lxc/daily
# apt-get update
Networking needs to be manually configured; you don't want Network Manager getting in the way. The easiest way to fix that is to uninstall it:
# apt-get purge network-manager network-manager-gnome
And then there's the installation of lxc and some other important tools:
# apt-get install lxc bridge-utils debootstrap cgroup-bin
2. Configured networking
Networking requires configuring a bridge, which fortunately isn't that hard. The key thing to understand is that the bridge replaces your existing networking configuration. When you configure eth0, for example, you have to leave as few options (IP addresses, etc.) configured on it as possible. Here's what my /etc/network/interfaces looks like:
# The loopback network interface
iface lo inet loopback
iface eth0 inet manual
iface br0 inet static
post-up /usr/sbin/brctl setfd br0 0
At this point you should reboot to make sure everything is working as wanted.
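The interfaces file above is abbreviated. For reference, a fuller static bridge setup generally has the shape printed by the sketch below. The function name is made up, and every address is a placeholder from the 192.0.2.0/24 documentation range rather than a real setting; bridge_ports is the option the bridge-utils package adds to /etc/network/interfaces.

```shell
#!/bin/sh
# gen_bridge_stanza PHYS BRIDGE ADDRESS NETMASK GATEWAY
# Prints an /etc/network/interfaces fragment that enslaves PHYS to BRIDGE.
# (Hypothetical helper; all values passed in are placeholders.)
gen_bridge_stanza() {
    cat <<EOF
auto lo
iface lo inet loopback

# The physical NIC carries no address of its own
iface $1 inet manual

auto $2
iface $2 inet static
    address $3
    netmask $4
    gateway $5
    bridge_ports $1
    post-up /usr/sbin/brctl setfd $2 0
EOF
}

# Example: print a stanza for review, then merge it by hand.
gen_bridge_stanza eth0 br0 192.0.2.10 255.255.255.0 192.0.2.1
```

Printing the stanza rather than writing the file directly is deliberate: a mistake in /etc/network/interfaces can leave a remote machine unreachable after the reboot.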
3. Made a space for the containers to live
I decided to create a user "lxc", which I did using the adduser command in the usual way. Under /home/lxc I put my containers. Each container is a directory, and each directory contains the configuration file, file system mounts, and root directory of the container itself, like so:
/home/lxc/container-name/lxc.conf
/home/lxc/container-name/fstab
/home/lxc/container-name/root/
I'll explain how to create those files and directories shortly, the important bit right now though is that /home/lxc/container-name exists.
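As a minimal sketch, the container directory and its two (initially empty) files can be created up front with a helper like the one below. The function name is made up for illustration. The root directory is deliberately left out, because it is produced later by copying the old VM's file system and renaming the copy.

```shell
#!/bin/sh
# make_container_skeleton BASE NAME
# Creates BASE/NAME containing an empty lxc.conf and an empty fstab.
# root/ is NOT created here: the migration step copies the old file
# system in and renames it to root, which would go wrong if root/
# already existed. (Hypothetical helper name.)
make_container_skeleton() {
    dir="$1/$2"
    mkdir -p "$dir" || return 1
    # Create the two files only if they do not exist yet
    [ -e "$dir/lxc.conf" ] || : > "$dir/lxc.conf"
    [ -e "$dir/fstab" ] || : > "$dir/fstab"
    echo "$dir"
}

# Example, as root:
#   make_container_skeleton /home/lxc endothelial
```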
4. I was migrating my existing Ubuntu 8.04 Xen systems. To do this:
4.1 Mount the VM's file system
LOOPDEV=$(losetup -f --show /path/to/disk.img)
mount -r "$LOOPDEV" /media
4.2 Copy the contents making sure permissions etc remain unchanged
cp -a /media /home/lxc/endothelial/ ; mv /home/lxc/endothelial/media /home/lxc/endothelial/root
4.3 Modify the VMs to remove anything that'll interfere with the new environment
mv checkfs.sh checkroot.sh udev udev-finish DISABLE/
4.4 One more modification - Xen uses /dev/xvc0 for the console, change it to /dev/console
Change the last line to:
exec /sbin/getty 38400 console
5. Create the configuration files mentioned above.
lxc.conf looks like this:
lxc.utsname = endothelial
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.mtu = 1400
lxc.network.name = eth0
lxc.network.hwaddr = (my VM's mac address)
lxc.network.ipv4 = (my VM's IPv4 address)
lxc.network.ipv6 = (my VM's IPv6 address, I'm sure this is necessary but...)
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 254:0 rwm
lxc.tty = 4
lxc.rootfs = /home/lxc/endothelial/root
lxc.mount = /home/lxc/endothelial/fstab
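The major:minor pairs in the allow rules above are standard Linux character device numbers. As a reading aid, this sketch maps each one to its conventional device node. The function names are made up for illustration, and 254:0 sits in the dynamically assigned range, so treat that entry as an assumption to verify on your own host (for example with ls -l /dev/rtc*).

```shell
#!/bin/sh
# devname MAJOR:MINOR
# Prints the conventional device node for the character devices allowed
# in the configuration above. (Hypothetical helper name.)
devname() {
    case "$1" in
        1:3)   echo /dev/null ;;
        1:5)   echo /dev/zero ;;
        1:8)   echo /dev/random ;;
        1:9)   echo /dev/urandom ;;
        4:0)   echo /dev/tty0 ;;
        4:1)   echo /dev/tty1 ;;
        5:0)   echo /dev/tty ;;
        5:1)   echo /dev/console ;;
        5:2)   echo /dev/ptmx ;;
        136:*) echo "/dev/pts/*" ;;
        254:0) echo "/dev/rtc0 (dynamic major; verify on your host)" ;;
        *)     echo unknown; return 1 ;;
    esac
}

# annotate_allows: read an lxc.conf on stdin and print each allowed
# device number next to its name, separated by a tab.
annotate_allows() {
    awk '$1 == "lxc.cgroup.devices.allow" { print $4 }' |
    while read -r dev; do
        printf '%s\t%s\n' "$dev" "$(devname "$dev")"
    done
}

# Example:
#   annotate_allows < /home/lxc/endothelial/lxc.conf
```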
fstab looks like this:
none /home/lxc/endothelial/root/dev/pts devpts defaults 0 0
none /home/lxc/endothelial/root/proc proc defaults 0 0
none /home/lxc/endothelial/root/sys sysfs defaults 0 0
6. We want the server to start automatically, so create symlinks in /etc/lxc/auto
# ln -s /home/lxc/endothelial/lxc.conf /etc/lxc/auto/endothelial.conf
7. Start the session
For now, we'll start it interactively. We can always shut it down and start afresh once we know it's working.
# lxc-create -n endothelial -f /home/lxc/endothelial/lxc.conf
(Note: each time you change lxc.conf you need to run lxc-destroy -n container-name and then repeat the lxc-create command above.)
# lxc-start -n endothelial
If everything's set up correctly, your console session should become the container's console and you should be able to log in and all that jazz. To shut down the container, open a different console on the host, and type lxc-stop -n container-name.
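Since every change to lxc.conf means a destroy and a create, the cycle is worth wrapping. The following is a minimal sketch under the workflow described above: the wrapper names run and lxc_recreate and the DRY_RUN switch are made up, and only lxc-stop, lxc-destroy and lxc-create are real commands.

```shell
#!/bin/sh
# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# lxc_recreate NAME CONF
# Re-registers container NAME from CONF after the config file changed:
# stop it (if running), destroy the old definition, create it again.
# (Hypothetical wrapper name.)
lxc_recreate() {
    name=$1 conf=$2
    run lxc-stop -n "$name" || true       # ignore "not running" errors
    run lxc-destroy -n "$name" || return 1
    run lxc-create -n "$name" -f "$conf"
}

# Example: preview the commands first, then run them for real as root.
#   DRY_RUN=1; lxc_recreate endothelial /home/lxc/endothelial/lxc.conf
```

One caution: this matches the LXC version used here, where the root file system lives outside LXC's own directory. Newer LXC releases may also delete the container's root file system on lxc-destroy, so check the man page before using anything like this elsewhere.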
The above works for me and I was somewhat surprised by how well it works. Despite the 8.04 images I migrated being completely unaware of the LXC system's existence, they run well with no noticeable problems.
Everything's much faster. I'm sure part of this is that I never had a particularly optimal Xen environment to begin with, but, well, the LXC environment is much more efficient anyway, and it really shows.
Things I'm happy about:
- Fast, efficient, and reliable
- Power management is reliable. Xen had a habit of running everything at full blast
- Migrating back to Xen should be easy. At worst, I can run LXC within Xen VMs without penalties.
Things I'm uncomfortable with:
- Not 100% transparent. I don't know what will break as a result. Nothing as yet, but it doesn't help my confidence in the platform. Supposedly Oracle will not install because the LXC APIs tell it there's no virtual memory available - but my view is that's a bug in Oracle, who the hell codes that kind of logic into their systems anyway?
- Security issues
- No scope for experimentation with operating systems other than those based on Linux. No Solaris or BSDs for example.
- I'd like to make containers that have no virtual memory for those few applications that absolutely definitely must be available at a moment's notice - Asterisk for example. This isn't possible in LXC, to the best of my knowledge. Still, Asterisk works pretty well in the new environment, while it crashed a lot under Xen.
The list of things I'm uncomfortable with is longer than the "things I'm happy about", but that's a little unfair. Most of the uncomfortable issues are theoretical. The platform is working very well at the moment, and I hope the developers can get the kinks worked out so I can feel more comfortable about the security of my system while running it.