Saturday, November 26, 2011

Why I switched from Xen to LXC and other regrettable decisions

After playing around a lot with Asterisk I found my closet server was straining under the load. The issue wasn't the hardware; it was just that I was running a fairly ancient version of Xen, and I was running that because I couldn't upgrade from Ubuntu 8.04, which was because later versions of Ubuntu didn't properly support Xen (as it was, 8.04 barely supported it).

But it was getting to be a problem. My wife was, understandably, getting upset that all the cool stuff I'd set up with the Asterisk server was somewhat undermined by the fact that after a few hours you couldn't make any outgoing calls. When you did, the entire VM hosting the Asterisk server would freeze for a few seconds, preventing anything from happening.

As an experiment, I made a new VM, and tried to upgrade it to 10.04 using the usual Ubuntu command line tools. This did not work.

I also upgraded the server's memory, but that didn't have any effect either.

So, finally, I bought a big ass drive for the server, and got ready to migrate all the VMs to something that Ubuntu 10.04 would be happy with. And, well, that meant no Xen.

But as it happened, this turned out to be more work than I'd hoped.

A boot up the rear

First problem. Now, I think we can all agree that the PC drive partitioning system sucks. It dates back to the original PC XT, and has nothing going for it. Several efforts have been made to fix the issues with it, or to make something better, including OpenFirmware and EFI, not to mention hacks built upon the existing system like Extended Partitions. The latest attempt to fix the problems is called GPT. It started as a part of EFI, but has started to be supported by more ordinary BIOSes recently because the standard PC system doesn't really do disks over 2T in size terribly well.

So, anyway, the geniuses at Canonical decided to make GPT the default under certain circumstances, including circumstances where it really isn't necessary. If you tell Ubuntu 10.04 to wipe a 2T drive (not a 2.5T drive, that would need it, but an ordinary 2T drive that currently fits within the 2T limit) and put on a fresh Ubuntu install, it will install a GPT partitioning system, and not install a standard partition system.
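For the curious, the 2T limit falls straight out of the old partition table format: it stores sector numbers as 32-bit values, and disks have traditionally used 512-byte sectors. Here's a quick sketch of the arithmetic (the function name is made up for illustration):

```shell
#!/bin/sh
# Largest size an old-style (MBR) partition table can describe:
# 2^32 sectors times the sector size (512 bytes unless told otherwise).
mbr_limit_bytes() {
    sector_size=${1:-512}
    echo $(( (1 << 32) * $sector_size ))
}

mbr_limit_bytes    # addressable bytes with 512-byte sectors, i.e. 2 TiB
```

So a drive sold as "2T" fits under the limit, and anything meaningfully bigger doesn't.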

I found this out the hard way. Now, there is a way around it, for those of us with motherboards more than three years old (I know! We're so behind!) What you do is cat /dev/urandom > /dev/sda (because I couldn't find a command to wipe out the GPT partition table, and if you just try to install a normal one using fdisk, it'll simply be ignored by Ubuntu's installer if there's a GPT thing there too). Then you fire up fdisk from the command line, create a new partition table, and create three new partitions: one smallish one (a few gigs) at the beginning, one for your swap partition, and then one big one for the rest of your system.

You then fire up the Ubuntu installer, tell it to format #1 and #3 as ext4, as /boot and / respectively, making #1 bootable, and #2 your swap.
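If you'd rather script the layout than click through fdisk, something along these lines might do it. This is only a sketch: it uses parted instead of the fdisk route I took, it leans on the mklabel msdos trick suggested in the comments below (which I haven't verified clears every trace of GPT), and /dev/sdX and the partition sizes are placeholders. The function only prints the commands so nothing gets wiped by accident.

```shell
#!/bin/sh
# Print (not run) parted commands for the three-partition layout:
# a small /boot, a swap partition, and one big root partition.
# The disk name and sizes are placeholders; check them before running anything.
print_partition_plan() {
    disk=${1:-/dev/sdX}
    echo "parted -s $disk mklabel msdos"
    # The fs-type argument only sets the partition type; the installer
    # does the actual formatting later.
    echo "parted -s $disk mkpart primary ext2 1MiB 4GiB"
    echo "parted -s $disk mkpart primary linux-swap 4GiB 8GiB"
    echo "parted -s $disk mkpart primary ext2 8GiB 100%"
    echo "parted -s $disk set 1 boot on"
}

print_partition_plan /dev/sdX
```

Read the output, swap in the real device name, and only then paste the commands into a root shell.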

You'll note there's no GUI for this. If you tell Ubuntu to wipe the disk and start fresh, you can't tell it "Oh by the way, please make sure my computer is able to boot from this disk, please?"

Urgh.


KVM is not Xen

What's Xen? Well, Xen is a hypervisor. Remember that User Mode Linux thing a long time ago that still exists but nobody uses it? It's a special Linux kernel that's been designed to behave itself so it can run within another operating system. Well, the Xen people went one better and said "Let's make all operating systems run like that, and we'll create a special operating system that's really lightweight in which they can all live." And, well, it works. It's a great idea. And it's what I had on my server.

But Xen isn't universal. Well, actually, it is, because the Xen people recognized early on that not every operating system vendor was going to modify their OSes to play well with others, so they created a special mode for such operating systems that made use of special CPU features, but for Linux, you didn't have to use it. But, nonetheless, people kinda assumed Xen wasn't the way to go because it encouraged rivals to work together, and KVM was born.

KVM simply runs other operating systems under Linux. It's not Xen because those other operating systems don't cooperate; instead, KVM simply makes use of features in slightly higher end CPUs to keep the operating systems in line.

Canonical decided to go with KVM and avoided supporting Xen shortly after Ubuntu 8.04 (actually, it didn't work very well under 8.04 either). Unfortunately, however, the very fact that KVM needs hardware support means it doesn't replace Xen, even if you ignore the major differences in the way the two work.

Unfortunately the cheap CPU in my server doesn't have the hardware virtualization support KVM needs.
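You can check this before wasting an afternoon. On Linux the CPU's feature flags are listed in /proc/cpuinfo, and the ones KVM cares about are vmx (Intel) and svm (AMD). A minimal sketch, with a function name I made up:

```shell
#!/bin/sh
# Succeeds if the given cpuinfo file (default /proc/cpuinfo) lists the
# Intel (vmx) or AMD (svm) hardware virtualization flag.
has_hw_virt() {
    grep '^flags' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -Eqw 'vmx|svm'
}

if has_hw_virt; then
    echo "CPU advertises hardware virtualization; KVM should be possible"
else
    echo "no vmx/svm flag found; KVM won't work on this CPU"
fi
```

A flag showing up doesn't guarantee the feature is switched on in the BIOS, so treat a "yes" as necessary rather than sufficient.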

Even if it did, I was already wondering what it would take to migrate my existing VMs. Apparently Red Hat is working on a solution, but thus far the nearest thing I found to a "solution" was a tool called Xenner. I immediately hit a snag (before I realized that my CPU was a piece of crap): it doesn't seem to take Xen disk images as is; you have to make them more "hard disk" like. Before I had a chance to look for solutions, I found the problem with the CPU.


Other options

At this point, the options were:
  • Try installing Xen server and find some way to make that work after all.
  • Look at one of the simpler virtualization solutions like OpenVZ.
OpenVZ isn't directly supported by Ubuntu, but LXC, a close cousin that takes the same approach using features in the mainline kernel, is. OpenVZ/LXC takes a third approach to "virtualization". A fairly common scenario (one that happens to be mine) is to simply run a large number of Linux-based operating systems on a single box, largely to keep different environments from standing on each other's toes and to make it easier to experiment. Given that, OpenVZ simply runs a single operating system (single pool of processes, single file system, etc), but has the kernel hide this from running processes, which are presented with a sub-view of the running system. Each sub-view appears to each process and user as an independent operating system instance. And the kernel can use quotas and other security tools to limit processes running within a sub-view so they can't take over the entire computer.

Despite the fact Ubuntu nominally "supports" LXC, it's a little messed up under 10.04. To begin with, the 10.04 incarnation actually comes with a major bug that makes the system unusable if you have any major system partitions (such as *cough* /boot) separated from /. Installing from a PPA fixes that issue.

Another is that the documentation is pretty awful. Essentially you're pointed at other people's HOWTOs, which may or may not cover Ubuntu, and which tend to gloss over important details like networking.

What I did was fairly simple:
  • I created directories for each VM under /home/lxc.
  • Each directory contains an lxc.conf file, fstab file, and "root" directory.
  • The root directory was the original VM file system. (I mounted it using losetup, and used cp -a to copy it. Nothing special.)
  • I moved /etc/init.d/udev* and /etc/init.d/check* out of the way as these would cause trouble in the new environment.
  • The lxc.conf and fstabs were cribbed from various blogs. I'd post them, but I'm not sure they're right yet.
  • Finally, I set up bridged networking in /etc/network/interfaces. This essentially means everything you'd normally assign to eth0 gets assigned, instead, to br0, and you specify br0 is connected to eth0. 
This, surprisingly, worked. My old Xen 8.04 images are working under a modern kernel, using LXC instead, and actually the entire system feels rather smoother - probably in part because LXC is extremely lightweight.
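To give a rough idea of the shape of the config and networking steps above, here's a sketch. To be clear, these are not the files from my server: the key names are the ones I'd expect from the LXC releases of this era, the container name "asterisk" is just an example, and using DHCP on the bridge is an assumption (a static setup would carry over the address, netmask and so on). The functions only print the text; they don't write anything.

```shell
#!/bin/sh
# Print a minimal lxc.conf for one VM directory under /home/lxc.
# A starting point under the assumptions above, not a tested config.
lxc_conf() {
    name=$1
    cat <<EOF
lxc.utsname = $name
lxc.rootfs = /home/lxc/$name/root
lxc.mount = /home/lxc/$name/fstab
lxc.network.type = veth
lxc.network.link = br0
lxc.network.flags = up
EOF
}

# Print an /etc/network/interfaces fragment that hands eth0's job to br0.
# The bridge_* options come from the bridge-utils package.
bridge_stanza() {
    cat <<EOF
auto eth0
iface eth0 inet manual

auto br0
iface br0 inet dhcp
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
EOF
}

lxc_conf asterisk
bridge_stanza
```

Key names vary between LXC versions, so check the lxc.conf man page for whatever you have installed before trusting any of this.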

As if to tell me not to stray too far, Xen gave me a final surprise while I was setting this up. I had to reboot multiple times, switching between the working "old" system and the new system. At one point I had a shell open on one of my VMs. I then spent fifteen minutes in the new system, before rebooting into the old system so I could access the Internet for a bit.

And to my absolute amazement, the shell session was still alive. When I'd shut down the old system, it had saved the state of the VM, and restored it fifteen minutes later when I rebooted back into the old system.

I love Xen. I really do. Apparently Canonical is rethinking their lack of support for the system. It would be nice to switch back. Perhaps 12.04 will properly support Xen. It's about time they did.

1 comment:

  1. Great article thanks, I am looking to move xen to lxc

    To remove a gpt partition try

    parted /dev/sdX mklabel msdos

