gentoo

Gentoo, the OSL, and Xen

Quite a few of the projects hosted at the OSU Open Source Lab are using Xen virtual machines. If you are associated with one of those projects you may be interested to know what exactly our current setup is and what my future plans for it are. If you are not currently hosted by us maybe you will be some day. :-)

Since last fall I have been running a Xen cluster at the OSL, which is slowly replacing our original independent Xen hosts. We currently host a total of 41 Xen virtual machines, which include projects like Busybox, Inkscape, an OFTC IRC node, the Freenode website, the OLPC user support forums, and many others. Currently 17 of those are on the new cluster, split between 3 of the 6 available host nodes. The other 24 virtual machines are still on two of our older independent Xen hosts.

The older Xen hosts are just boxes loaded with lots of disk and RAM, with the virtual machines running off of the local disk space. The problem with this setup is that Xen and Linux kernel upgrades are incredibly difficult. Since the virtual machines cannot easily move to another host, upgrading Xen requires taking an outage for all 8 to 12 virtual machines running on that host. To complicate matters, Xen can be a bit troublesome to install or upgrade, so it is not uncommon for such an upgrade to take much longer than expected. To improve this situation I built out our Xen cluster.

The cluster currently consists of 6 Xen hosts which are part of a 14-blade IBM BladeCenter that was donated to us by Intel. The 6 hosts each have 4GB of RAM and dual Pentium 4 processors and can typically run between 6 and 8 virtual machines depending on RAM and CPU needs. The remaining 8 blades will eventually be built out as more hosts but are currently waiting on RAM. (Anyone have a pile of 1GB PC2100 sticks lying around?) All of the disk space is hosted via iSCSI on a separate disk node. The current disk node is a Dell 2650 with 260GB of disk for virtual machines and is serving up that space with ietd since we don't have a hardware-based iSCSI target card.

The good thing about this new setup is that I can migrate virtual machines between host nodes on demand while they are running, so I can easily upgrade the host nodes as needed. Maybe some day I will get better monitoring set up so I can move virtual machines around to balance CPU load, but that's not planned for the near future. The bad thing is that I still have a single point of failure in the single disk node. The disk node also doesn't have very much disk space and we have nearly filled it up, which is why the cluster is only running 17 virtual machines. So the setup is not perfect, but it's a pretty good start using hardware that was either donated or that we already had.
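As a rough sketch of what emptying a host node before an upgrade looks like, the helper below prints one `xm migrate --live` command per virtual machine. The function name and the host and domain names in the example are made up for illustration; only the `xm migrate --live <domain> <host>` form comes from the Xen 3 tools, and this is not necessarily how it is scripted at the OSL.

```shell
# Print the xm commands that would live-migrate each named domain to TARGET.
# Nothing is executed here; review the output, then run it on the source host.
# drain_cmds is a hypothetical helper name, not part of Xen.
drain_cmds() {
    target=$1
    shift
    for dom in "$@"; do
        printf 'xm migrate --live %s %s\n' "$dom" "$target"
    done
}

# Example with made-up names:
#   drain_cmds xen-host2 inkscape busybox
```

Printing the commands instead of running them keeps the sketch safe to try on a machine without Xen installed.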

Down the road I want to replace the current disk node with two boxes replicating the data using DRBD and set up graceful fail-over between the two using heartbeat. The current plan is to upgrade the disk space on our mirror servers and use some of the old disk arrays for the Xen cluster. This will give us about 3TB total for 1.5TB of redundant disk space between the two disk nodes. That will give us enough space to move all of our existing virtual machines over to the cluster with room for 30-40 more for a total of 54-64. That won't quite fill up the Xen host nodes, which can probably host 80-90 virtual machines while keeping one host node as a hot spare. It will be enough room for about a year and a half worth of growth and should enable us to provide great uptime for the hosted projects. :-) Unfortunately with this plan the Xen upgrade is waiting on the mirror upgrade, which is waiting on money to buy the new disks, and I have no idea when that is going to happen. Hopefully something will pull through soon; the mirrors have been needing this upgrade for nearly a year now.

And how does Gentoo fit into all of this? All of the Xen and disk hosts run Gentoo and are managed by our central cfengine system. I have been maintaining the Xen packages for Gentoo to keep them in working order for use at the OSL and the whole setup seems to work pretty well now. Hopefully later today I'll have a chance to start rolling packages for Xen 3.1.3 and 3.2.0.

Giving a chroot its own hostname with chname

Since 2.6.19 Linux has supported a really nifty little feature: utsname namespaces. This is meant for use in fancy container systems but can be useful for simple chroots as well. By creating a chroot in a new namespace the chroot can be given its own hostname. This can be useful for managing a chroot as an independent host or simply making it easy to see if you are in the chroot or not.

A while back I wrote a little tool called chname to make use of the feature when not using a fancy container system. It will start a new process, such as chroot, with a new hostname:

chname newhost chroot /chroots/newhost /bin/bash

And poof!

Hopefully someone else will also find this useful. For Gentoo users it is already in the Portage tree so just emerge it. :-)
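For comparison, roughly the same effect can be had on systems where util-linux provides unshare(1) with the `--uts` option. This is a sketch of the idea, not how chname itself is implemented; `with_hostname` is a name invented here, and like chname it needs root.

```shell
# with_hostname NAME CMD [ARGS...]
# Run CMD in a new UTS namespace whose hostname is NAME.
# with_hostname is a hypothetical helper; it assumes unshare(1) and
# hostname(1) are installed and that the caller is root.
with_hostname() {
    name=$1
    shift
    # Inside the new namespace: set the hostname, then replace the
    # shell with the requested command.
    unshare --uts sh -c 'hostname "$1" && shift && exec "$@"' sh "$name" "$@"
}

# Equivalent of the chname example above:
#   with_hostname newhost chroot /chroots/newhost /bin/bash
```

The hostname change is only visible to processes started inside the new namespace; the host keeps its own name.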

Be sure to compile your kernel with CONFIG_UTS_NS=y

General setup  --->
  [*] UTS Namespaces
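
To check an existing kernel config for the option without opening menuconfig, something like the following works. The function name is made up, and the path in the example assumes the usual Gentoo kernel source location.

```shell
# has_uts_ns CONFIG_FILE
# Succeed if the given kernel config file enables UTS namespaces.
# has_uts_ns is a hypothetical helper name.
has_uts_ns() {
    grep -q '^CONFIG_UTS_NS=y$' "$1"
}

# Example (path is an assumption):
#   has_uts_ns /usr/src/linux/.config && echo "UTS namespaces enabled"
```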

Update:
For those wondering "Why bother?" I originally wrote this to make running cfengine in a chroot easier. Our cfengine setup at the OSL configures systems based on hostname. Without changing the hostname cfagent must be run as:

cfagent -q -D newhost -D newhost_osuosl_org -N oldhost_osuosl_org

Which is kind of annoying sometimes. Also, thanks to a bug/feature in cfengine, if a system hosts a chroot it must always be referred to in the cfengine config by the full oldhost_osuosl_org class instead of the nicer and shorter oldhost class. It is impossible to unset the class oldhost, but at least undefining oldhost_osuosl_org works. Maybe I'll fix cfengine some day so that undefining oldhost works, but I kinda like the chname method better.
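
To make the naming in that command line concrete, here is a small sketch that builds it from the short host names. It assumes the class names are simply the fully qualified host names with every non-alphanumeric character turned into an underscore, and that the domain is osuosl.org (inferred from the class names above). Both function names are invented for this example.

```shell
# cf_class FQDN
# Print FQDN as a class-style name: anything that is not a letter or
# digit becomes an underscore (assumed to match cfengine's naming).
cf_class() {
    printf '%s\n' "$1" | tr -c 'A-Za-z0-9\n' '_'
}

# cfagent_cmd NEWHOST OLDHOST DOMAIN
# Print the cfagent invocation for a chroot NEWHOST living on OLDHOST.
cfagent_cmd() {
    new=$1
    old=$2
    domain=$3
    printf 'cfagent -q -D %s -D %s -N %s\n' \
        "$new" "$(cf_class "$new.$domain")" "$(cf_class "$old.$domain")"
}

# Example:
#   cfagent_cmd newhost oldhost osuosl.org
```

With chname none of this is needed, since cfagent just sees the chroot's own hostname.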

Cake, Xen, and other such things

Basically, I need a smaller laptop. My current one is breaking, heavy to carry around, and most importantly the screen blocks too much of the fire that I am sitting in front of at the moment. Go go backyard fire pits! :-)

Over the past month I have been working full time at the OSL and have been getting quite a bit done, which is a very welcome improvement over the school year. I think I accomplished more in the first week of the summer than I did over the entire previous term. I can hardly remember everything I did, but the biggest things have been rewriting our internal inventory app and Xen stuff.

Our original app was written in Ruby on Rails, was slow as heck, and adding stuff required figuring out a new and strange looking language. (And after our experience with trying to run a busy RoR site, none of us sys-admins wanted anything to do with anything in RoR.) Over a couple of weeks I managed to plop a new CakePHP based interface on top of the old database and add a handful of new features like automatic database schema upgrades and a simple visual view of each rack, and the whole thing runs a lot faster than the old one did. The app mostly does what we need it to, so I've stopped development; any fancier features can wait until RAIV is done.

On the Xen front, I spent part of last week getting the 3.1.0 ebuilds ready to go along with 2.6.18 and 2.6.20 kernels! Finally no more 2.6.16! :-) The only significant issue that I know of at this point is that the 2.6.20 kernel will not run as an x86_32p guest on an x86_64 Xen, but since support for that is new to 3.1.0 and the 2.6.18 kernel is working, I'm not going to worry about it too much. So unless something significant crops up I'm going to push it all into the portage tree early next week.

Finally! Xen 3.0.4 in Gentoo!

After a much too long wait in my dev overlay, Xen 3.0.4 is finally in the main Gentoo tree! I managed to lay claim to a Core 2 Duo laptop at the OSL last week so I could finally test it on amd64 and try out fully virtualized guests. Hopefully I will be able to hold onto that machine for a while for all my future Xen testing needs. :-)

The next stage will be to resume work on the new OSL Xen cluster and hopefully get that operational before too long, but we'll see. It is hard to make progress when only working on Tuesdays and Thursdays.
