Xen and iscsi

June 10, 2009 by taskme

Why Xen and ISCSI?

One word, migration.

Xen has this capability of being able to move a virtual image from one machine to another. Usually this is so quick that users of the virtual image do not even notice that anything has changed.

There are a few caveats:

  • The two machines must be on the same LAN segment.
  • The two host machines must be the same architecture.
    I am not sure exactly how far this extends, but my experience is that migrating from Intel-based hardware to AMD hardware doesn’t work, and neither does the more obvious transition between 32-bit and 64-bit architectures. This is a shame because many other virtualisation systems can do this. And, being a home user, I have a mixture of ages, types and machine architectures. See here too.
  • The two host machines must both be running xend.
  • The target machine must have enough physical resources (RAM) to run the image.

Another pain is that the XEN dom0 patch is incompatible with the nvidia hardware 3d acceleration driver. This is an issue because my newest and fastest machines both run the nvidia driver. They also have more memory than my servers. It would be nice, if I needed to run a long job on a virtual machine, to pop it onto my fastest machine, and let it burble away there for a while. But this is not possible if you are using the nvidia driver.

In order for XEN to migrate, the domU has to be able to access its disk drives from any of the servers it might be migrated to. This can be achieved with fancy SAN based infrastructure, such as fiber channel or similar, or by using the network. There are three ways the network can be used:

  1. Root on nfs
  2. network block devices
  3. iscsi

Because root on NFS has performance drawbacks, and NFS-mounted filesystems are unsuitable for certain tasks, my choice was between NBD and iscsi. NBDs have been around for ages, and although mature, they are specific to linux, and have pretty much been superseded by iscsi.

Iscsi is fairly new, but the latest debian has support for both initiators (the equivalent of a SCSI card) and targets (the equivalent of a SCSI disk). ISCSI brings the kind of flexibility of fiber channel systems to any system with a network card, though not the performance.

These instructions are based on the debian lenny distribution, with extensive references to here and here. They assume that the XEN server will also be the ISCSI target server. If this is not the case, an extra step will need to be inserted to copy the raw disk image onto the ISCSI target. The Lenny installer does not currently support installation directly onto an ISCSI target. However, I am sure this will come soon. The stock kernel supports ISCSI fully; it would just need some tweaks in the installer.

On the dom0 server, create an LVM object big enough to represent the whole disk space needed by your virtual machine. This will be the virtual disk for the domU.

lvcreate -L10240 -n iscsitest system

Download the xm-debian.cfg into /etc/xen. Copy it to a new name, say xm-iscsitestinstall.cfg, and edit it for your needs, particularly:

memory=128
name="iscsitest"
vif=['mac=00:16:3e:1a:2a:2a']

(If you’re using dhcp to provide IP addresses to your hosts, fixing the MAC addresses means that it is more likely that your virtual server will receive the same IP address each time it boots. The first half of the MAC address, 00:16:3e, identifies the address as belonging to a Xen virtual machine. You can put whatever you like in the second half, so long as it doesn’t clash with any other MAC addresses on your LAN. This example assumes you are running dhcp.)

disk = ['phy:/dev/mapper/system-iscsitest,xvda,w']

Identify your logical volume as the main disk for your new machine. We will ultimately remove this line, but for the install, it is needed.
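If you would rather not invent the second half of the MAC address in the vif line by hand, something like the following could pick one at random. It is only a sketch: gen_xen_mac is a made-up helper name, not part of the Xen tools, and it does not check for clashes with addresses already on your LAN.

```shell
# Generate a random MAC address with the Xen 00:16:3e prefix.
# gen_xen_mac is a hypothetical helper name, not a Xen command.
gen_xen_mac() {
    # Read three random bytes as hex and use them as the second half.
    od -An -N3 -tx1 /dev/urandom |
        awk '{ printf "00:16:3e:%s:%s:%s\n", $1, $2, $3 }'
}

gen_xen_mac
```

The result can be pasted into the vif line in place of the example address.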

Now run xm create with the installer to create a bootable xen virtual server:

xm create -c xm-iscsitestinstall.cfg install=true \
   install-mirror=http://ftp.uk.debian.org/debian \
   install-extra="clocksource=jiffies priority=low" install-suite=lenny

See the debian install guide for how to install lenny.

Some notes to bear in mind when installing:

  • When you get to partitioning the disks, I would recommend using a single LVM partition. This is because we will change the underlying physical device name of the disk. LVM will identify the volume group from either the iscsi target, or the xvda drive, meaning that we don’t have to modify /etc/fstab each time we flip between the two devices. The same behaviour could be achieved using volume labels, if you are not so inclined to use LVM.
  • Make sure you select a kernel image with “xen” in the name. And choose the “generic” option for initramfs building.
  • When given the tasksel option, you can choose any configuration you want. If you select the minimum (ie no groups selected) then it will work, but you will need to add some specific packages to get everything to work.  I will identify any special packages you need during the process.
  • There is no benefit in installing a boot loader (grub or lilo). So choose the “do without a bootloader” option.
  • You will need to add some additional packages by hand, before you finish the installation:

Important: Before you “Finish the installation”, choose Execute a shell:

chroot /target
aptitude install openssh-client rsync open-iscsi libc6-xen
cd /boot

Copy the kernel and ram disk onto the xen dom0 server. Otherwise it is not possible to boot the virtual machine.

rsync vmlinuz-2.6.26-2-xen-686 server:/etc/xen

(replace “vmlinuz-….” with the kernel name and “server” with the name of your xen server)

rsync initrd.img-2.6.26-2-xen-686 server:/etc/xen/

Now exit the shell

exit
exit

And choose “Finish the installation”

Now, copy the xm-iscsitestinstall.cfg file to a new name, say xm-iscsitest.cfg and make the following edits:

Remove all the setup sections: pretty much everything before “memory = 128” and everything from “# Debian Installer specific variables” onwards is redundant.

Add the following lines:

kernel="/etc/xen/vmlinuz-2.6.26-2-xen-686"
ramdisk="/etc/xen/initrd.img-2.6.26-2-xen-686"
extra="root=/dev/mapper/itsystem-root ro console=hvc0 clocksource=jiffies"

The “clocksource=jiffies” is very important. I will explain it later. Leave everything else as is.

The following should modify your virtual machine so that it will run with iscsi as the filesystem.

On the xen dom0/iscsi server, install the iscsi target modules and management binary packages:

aptitude install iscsitarget-modules-2.6.686 iscsitarget

(Remembering to replace the iscsitarget-modules with one relevant for your kernel)

Edit the file /etc/ietd.conf, add the following lines:

Target iqn.2001-04.com.example:storage.lun1
 Lun 0 Path=/dev/mapper/system-iscsitest,Type=fileio
 Alias LUN1

And restart the iscsi target daemon:

/etc/init.d/iscsitarget restart

Run up the virtual machine:

xm create -c xm-iscsitest.cfg

Now your virtual machine is running, but it is still using the /dev/xvda device for its drive. From within the virtual machine, discover the iscsi target:

iscsiadm -m discovery -t st -p <ip address of server>

And connect to the device:

iscsiadm -m node --targetname "iqn.2001-04.com.example:storage.lun1" \
  --portal "<ip address of server>:3260" --login
ls /dev

This should show /dev/sda and /dev/sda1 … devices. Don’t try to access them, however, as you already have them open via the xvda device, and changing them in any way could cause data corruption.

Copy the file:

cp /usr/share/initramfs-tools/scripts/local-top/lvm2 /etc/initramfs-tools/scripts/local-top/lvm2

(The /etc version of this file will not be overwritten by updates to lvm, whereas the /usr version will)

Edit /etc/initramfs-tools/scripts/local-top/lvm2

Add a line, towards the end of the file, between the lines as shown:

...
modprobe -q dm-mirror

sleep 5 # added by taskme lvm needs to settle before mounting volumes on iscsi

activate_vg "$ROOT"
...

Modify /etc/init.d/open-iscsi

In the stop() function, comment out the stoptargets line:

...
stop() {
 # stoptargets     # commented out by taskme see bug #501580
 log_daemon_msg "Stopping iSCSI initiator service"
...

This is referred to on debian bug #501580

Now, create a file /etc/iscsi/iscsi.initramfs

With contents:

ISCSI_TARGET_NAME=iqn.2001-04.com.example:storage.lun1
ISCSI_TARGET_IP=<server ip address>

These lines tell the initramfs tools to include, in the initial ram disk, the iscsi code needed to reach a root file system on that target.

Now, with this setting added, recreate the initial ram disk. This will also add our modification to /etc/initramfs-tools/scripts/local-top/lvm2

dpkg-reconfigure linux-image-2.6.26-2-xen-686

Copy the new initial ram disk onto the server

rsync /boot/initrd.img-2.6.26-2-xen-686 server:/etc/xen/

Finally, the virtual machine now configures the interface from the kernel. Any settings in /etc/network/interfaces seem to muck up the network connectivity of the machine, so comment out all lines in this file that refer to eth0:

# The primary network interface
#allow-hotplug eth0
#iface eth0 inet dhcp

Shutdown the virtual machine:

shutdown -h now

If your dom0 server and iscsi target server are different machines, now is the time to copy the /dev/mapper/system-iscsitest onto the iscsi target, ensuring it is the correct size etc. In theory it would be possible to have the iscsi target made available to the virtual machine using an iscsi initiator on the dom0 machine. An exercise for another day, perhaps.

On the dom0 server, modify /etc/xen/xm-iscsitest.cfg, comment out the disk line:

#disk = ['phy:/dev/mapper/system-iscsitest,xvda,w']

And restart the virtual machine:

xm create -c xm-iscsitest.cfg

And hopefully, your server will be running. Look in /dev to ensure that xvda does not exist.
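The check for xvda can be scripted if you do this often. This is a small sketch rather than anything official: check_no_xvda is a made-up helper name, and the optional directory argument exists only so that it can be tried out somewhere other than /dev.

```shell
# Confirm that no xvda device nodes are left after the switch to iscsi.
# check_no_xvda is a hypothetical helper; pass a directory to look
# somewhere other than /dev.
check_no_xvda() {
    dev_dir="${1:-/dev}"
    # An unmatched glob is left unexpanded, so test each candidate.
    for node in "$dev_dir"/xvda*; do
        if [ -e "$node" ]; then
            echo "still present: $node"
            return 1
        fi
    done
    echo "no xvda devices in $dev_dir"
}

check_no_xvda || true
```

Run inside the virtual machine, it should report that no xvda devices remain once the disk line has been commented out.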

To allow your server to migrate, it is necessary to modify /etc/xen/xend-config.sxp on the dom0 server. Uncomment the following line:

(xend-relocation-server yes)

Do this on both the main dom0 server and the dom0 server to which you wish to migrate the machine. But be careful, as this may potentially open up a security hole on your servers.

Restart xend on both machines.

/etc/init.d/xend restart

Now, you can migrate your virtual machine. The second server has to be on the same LAN segment as the master server.

xm migrate iscsitest <ip address or hostname of second server>

Users of the virtual machine should notice a very brief pause as the server migrates (basically the time it takes to copy the virtual RAM of the server across your network). The -l (--live) option should eliminate any noticeable pause, though I found this didn’t work as reliably.

The reason behind clocksource=jiffies

This setting is vitally important on debian systems running with root filesystems on iscsi. This is because the TCP sequence numbers are determined by the system clock. A bug in the lenny xen implementation means that the domU clocks can get out of sync with the dom0 clock and go backwards. This means that your TCP sequence numbers are invalid, and the TCP connection fails. You therefore lose your root filesystem. This setting de-couples the virtual machine system clock from the dom0 sufficiently that the clock does not run backwards, and your TCP connections keep working.
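Since losing the root filesystem is so unpleasant, it may be worth checking that a running domU really did pick up the setting. The sketch below reads the kernel's current clock source from sysfs. check_clocksource is a made-up helper name, and the optional file argument is only there so the function can be tried against a test file.

```shell
# Report whether the running kernel is using the jiffies clock source.
# check_clocksource is a hypothetical helper; the default path is the
# kernel's sysfs entry for the current clock source.
check_clocksource() {
    file="${1:-/sys/devices/system/clocksource/clocksource0/current_clocksource}"
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 2
    fi
    current=$(cat "$file")
    if [ "$current" = "jiffies" ]; then
        echo "ok: jiffies"
    else
        echo "warning: $current"
        return 1
    fi
}

check_clocksource || true
```

Anything other than "ok: jiffies" on an iscsi-rooted domU suggests the kernel option did not make it onto the command line.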

New home server: Part 3, Layer 2 firewall

June 10, 2009 by taskme

Since learning of the ethernet bridging capability of linux, the brctl(8) and related management utilities, I have imagined that the ultimate firewall could be one that runs at layer 2, but understands layer 3 network protocols.

This strikes me as a very elegant solution. The firewall would be almost un-hackable, as anyone attacking from the internet would only be able to attack at layer 3, but as the firewall is essentially invisible at layer 3, it would be difficult to compromise it, without first compromising a machine on the section of ethernet. The machine doesn’t even have to have an IP address, and it should still work.

I am aware that a few people tout such a system as a good idea, but I have been unable to find much detail on any solutions. Still fewer people are using virtual hosts for firewalling.

But, would it even be possible to run a layer two machine within a virtual environment? Enter the twisty turny world of ethernet bridging, virtual interfaces and arp packet mangling.

Would it be possible to do NAT? This would be ideal as I can have a clean break across the firewall, with most of my LAN hosts unaware that they have internet routable addresses. We shall see.

This example is based on a xen virtual server, but the details apply equally to any machine with two network interfaces (be they physical or virtual)

Perform a basic debian install. Only install the base packages plus ebtables and arptables. I also installed tshark for diagnostics and less for my own sanity.

Modify /etc/network/interfaces.

The debian mechanism for managing the network interfaces has always worked very well for me. It is a constant pain that although the underlying tools are the same, i.e. ifconfig(8), and now ip(8), each linux distribution uses a completely different method of setting permanent values for the network settings. The redhat distributions use a script in /etc/sysconfig/network-scripts for each of the permanent interfaces. Slackware just has an init script with a load of ifconfig lines, although slackware is always about 10 linux years behind everything else. Even solaris is so different from anything else that trying to move between the different systems is a headache. Debian’s system, although it is unique to debian (and presumably debian based distributions), seems to sit at a nice point on the complexity/usability curve where other systems end up too far one way or the other. I particularly like how everything is in a single file, and how you can fix an interface to a particular MAC address even if the interface isn’t brought up automatically by the system. This is very useful for virtual interfaces such as those used by openvpn. Whenever I have thought that interfaces(5) wouldn’t be able to achieve what I need, and I would have to start messing with rc scripts and hard coded calls to /sbin/ifconfig, it has always pleasantly surprised me. Today was no exception.

After the install, the stanza for the active interface will probably look like:

auto eth0
iface eth0 inet static
 address XXX.XXX.XXX.65
 netmask 255.255.255.0
 broadcast XXX.XXX.XXX.255
 gateway XXX.XXX.XXX.1

or

allow-hotplug eth0
iface eth0 inet dhcp

Replace the static or dhcp with “manual”. Now, manual doesn’t mean “ignore the interface during the boot sequence, I will bring it up by hand”, as you might think, but it means that you will supply suitable “up” directives to configure the interface “manually”. The former explanation would be redundant within the context of the interfaces mechanism.

For the two interfaces apply the following settings:

auto eth0
iface eth0 inet manual
 up /sbin/ip link set eth0 up
 up /sbin/ip link set eth0 arp off
 down /sbin/ip link set eth0 down

auto eth1
iface eth1 inet manual
 up /sbin/ip link set eth1 up
 up /sbin/ip link set eth1 arp off
 down /sbin/ip link set eth1 down

The interface in an up state will allow traffic to cross it, even if the interface doesn’t have an IP address. Remember that ethernet is just a layer two protocol, and that IPX, decnet and other layer three protocols can all share the same network with IP.

ip(8) is the new all singing all dancing version of ifconfig/route. It is supposed to be the way forward in configuring your linux interfaces. It is very different from ifconfig, but is supposed to be a single executable that can be used to configure network interfaces, addresses, masks, routes etc. It is certainly more precise than ifconfig, which tends to leave interfaces in unexpected states (does ifconfig down not work for anyone else?). In any case, everyone is supposed to start using it, as one day ifconfig might just stop working!

In addition, add a bridge interface. This causes the bridged ethernet interfaces to work like two ports on an ethernet switch but, since the etch release of debian, you can of course apply iptables based rules to the traffic.

auto br0
iface br0 inet manual
 pre-up /usr/sbin/brctl addbr br0
 pre-up /usr/sbin/brctl addif br0 eth0
 pre-up /usr/sbin/brctl addif br0 eth1
 up /sbin/ip link set br0 up
 down /sbin/ip link set br0 down
 post-down /usr/sbin/brctl delif br0 eth1
 post-down /usr/sbin/brctl delif br0 eth0
 post-down /usr/sbin/brctl delbr br0
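Once the interfaces are up, it is worth confirming that both ethernet ports really did join the bridge. brctl show will tell you, but the kernel also lists a bridge's ports under /sys/class/net/<bridge>/brif, which is easy to script. The sketch below uses that; bridge_members is a made-up helper name, and the second argument only exists so the function can be pointed at a test directory.

```shell
# List the interfaces attached to a bridge by reading sysfs.
# bridge_members is a hypothetical helper; the optional second argument
# overrides the sysfs root and is only useful for testing.
bridge_members() {
    bridge="$1"
    sysfs="${2:-/sys/class/net}"
    if [ -d "$sysfs/$bridge/brif" ]; then
        ls "$sysfs/$bridge/brif"
    else
        echo "$bridge is not a bridge" >&2
        return 1
    fi
}

bridge_members br0 || true
```

On the firewall described here, this should list eth0 and eth1.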

With this configuration, the virtual machine successfully forwards ethernet frames between the two interfaces, and I can use iptables to block or permit certain host/port combinations. But I am unsure of the next step:

Now, although I have successfully used a bridging firewall, I have no idea whether a bridging firewall is capable of performing NAT. In theory it should be possible, but theory and practice are often far removed.

I perceive two steps:

  1. The layer three packet manipulation, which is using the SNAT and DNAT rules in iptables -t nat.
  2. The ARP manipulation. ARP is an interesting protocol, as it sits somewhere between layers two and three. In fact, I have seen some call it layer two, and others call it layer three. In truth, it is neither, and both. It sits between the two layers, and mediates information between layer three and layer two.

The layer three issue is fairly well established. I have used NAT for a number of years. The ARP layer, I am going to have to learn!

As I understand ARP, it allows devices on a LAN to correlate layer three addresses (IP addresses) with layer two addresses (MAC addresses). Each device maintains a look up table of IP/MAC addresses, and each address stays valid on that table until about 30 seconds after the last packet was exchanged with that host.

There are two basic types of arp traffic, the “Who has” packet, and the “is at”.

The “Who has” packet is targeted at the layer two broadcast address. It basically says: I have IP address X.X.X.X and am looking for a host with IP address Y.Y.Y.Y.

The host with address Y.Y.Y.Y then responds with a directed response to say Y.Y.Y.Y is at MM:MM:MM:MM:MM:MM (layer two address). An ARP exchange as recorded by tshark:

163045.629003 4a:5a:6a:1a:2a:3a -> Broadcast    ARP Who has Y.Y.Y.Y?  Tell X.X.X.X
163045.629003 4b:5b:6b:1b:2b:3b ->  4a:5a:6a:1a:2a:3a ARP Y.Y.Y.Y is at 4b:5b:6b:1b:2b:3b

Crossing a bridging/NATting firewall, the layer three addresses are different on one side to the other. Therefore, the Y.Y.Y.Y and X.X.X.X need to be translated when they cross the firewall.

This, it turns out, is fairly straightforward.

arptables -A FORWARD --source-ip IX.IX.IX.1 --destination-ip IX.IX.IX.2 \
    -j mangle --mangle-ip-s NX.NX.NX.254 --mangle-ip-d NX.NX.NX.1
arptables -A FORWARD --source-ip NX.NX.NX.1 --destination-ip NX.NX.NX.254 \
    -j mangle --mangle-ip-s IX.IX.IX.2 --mangle-ip-d IX.IX.IX.1

Using this, I successfully see incoming ARP requests being correctly translated. The router’s arp table contains the correct MAC address from the host, inside the firewall.

This, however, is where it goes wrong.

The Layer three part of netfilter seems not to be able to translate the address correctly. The following rule is based on a rule that I have used for many years to modify an incoming packet to make it appear to be destined for a private address.

iptables -t nat -A PREROUTING -d IX.IX.IX.1 -i eth1 -j DNAT --to-destination 1.2.3.4

As the packet crosses eth1, the external interface, it should be translated to appear as destined for 1.2.3.4. Unfortunately this does not happen, and you see packets on the LAN within the firewall still carrying the external destination address:

163045.629003 IY.IY.IY.IY -> IX.IX.IX.1 TCP 42417 > ssh [SYN] Seq=0 Win=5840
    Len=0 MSS=1380 TSV=75548755 TSER=0 WS=7

Of course, there are no machines with that IP address on that LAN segment, so the target machine doesn’t respond. One clue might be the chain name “PREROUTING”. The kernel may do the address translation as part of the routing stack. Being a layer two firewall, it is not responsible for routing.

To find out what is wrong, and if it can be fixed, I will have to look at the kernel source and talk to the netfilter people.

New home server: Part 2, Xen and the art of virtualisation

June 5, 2009 by taskme

Or should that be virtualization? Well the wordpress spellchecker likes neither. It doesn’t even like wordpress!

The problem with my current server has basically been that it has been added to, layers upon layers of software, java, stuff for viewing and processing digital photos, mplayer, and pretty much anything that I felt was a good idea at the time. There is a whole load of stuff on there, that will probably never again be used, but is occupying disk space, presenting security vulnerabilities, and generally making a messy working environment.

Ideally I would want a series of servers that I could leave well alone, save for security updates and a few tweaks. One server that manages the user accounts, one that has all the user files. An internet facing server, that can run email and web, and just that. A separate firewall. These machines would not provide shell access. Then in contrast, one or more general servers, that I can process digital pictures on, add software on to evaluate, or batch process a load of mp3s or whatever. The general purpose server can almost be throwaway, in that it could be re-created every so often, maybe even run from an LVM snapshot. I am less worried about keeping the throwaway servers clean and secure. They may not even be running most of the time, so would not present a security hazard.

However, the overhead of running a whole lot of machines for each of these tasks would take up a lot of space, use a lot of electricity and be expensive to buy.

Step in Xen.

I have wanted to try XEN out in a “production” situation for a while. But my 24/7 server had a paltry 640Mbytes of memory, and the one thing you really need for virtualisation is a shed load of memory.

On acquiring a new server with 3 Gibibytes of memory, the vast possibilities of all that memory opened out seemingly endless vistas of virtual servers. I could create nearly 4 virtual machines with the equivalent of my old server plus one with 512M.
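The "nearly 4" can be checked with a little shell arithmetic. This is only a sketch of the sum, using the 3 GiB, 640 MiB and 512 MiB figures from above; the variable names are illustrative, and it ignores whatever memory the dom0 itself needs, which is why the count is "nearly" rather than exactly four.

```shell
# How many guests the size of the old server fit into the new one?
total_mib=$((3 * 1024))      # 3 GiB expressed in MiB
old_server_mib=640           # memory of the old 24/7 server
small_guest_mib=512          # the one extra, smaller guest

full_guests=$(( (total_mib - small_guest_mib) / old_server_mib ))
left_for_dom0=$(( total_mib - small_guest_mib - full_guests * old_server_mib ))

# prints: guests of 640 MiB: 4, left for dom0: 0 MiB
echo "guests of ${old_server_mib} MiB: $full_guests, left for dom0: ${left_for_dom0} MiB"
```

With nothing left over for the dom0, four full-size guests plus the small one is the ceiling rather than a practical layout.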

Fortunately, some kind debian kernel developer managed to port the XEN hypervisor code to the 2.6.26 debian kernel in lenny. It was a late addition to the code base. Xen only officially supports 2.6.18 or 19, which is old by any standards. Except for some issues with system clocks, the port has worked well for me. Well done debian.

But most of the virtual images that I am likely to run would fit into significantly less memory than the 640 Mebibytes of my old server. On my old home server the largest executable running is a mere 72 Mebibytes. That is squid, and it is only using that much memory because I told it it could.

Even apache v2’s various threads are only using a total of 130 Mebibytes, and, again, that is only because I said it could.

The more heavyweight users, that are less easy to deal with are:

  • clamd 68 Mebibytes (ouch)
  • Gnome-terminal uses 11 Mebibytes, compared with xterm’s 2.6 (!)
  • Iceweasel (firefox for debian users) weighs in at 26 Mebibytes
  • etc.

In short, if I were to be selective about what to run on my virtual machines, I could probably get away with much less than the 640 Mebibytes that my old server uses. Even if I made a bad call, a huge advantage that Xen has over other similar virtualisation solutions is that you can dynamically re-assign memory, and even processors. The virtualised servers will hardly notice!

Although this description does concern the building of a firewall virtual server, I have left most of the firewall configuration detail for the next part of the series. Most of the instruction and information concerns what would need to be done to build any virtual server. All that would need to change is the config file, and the virtual disk.

The new server has two onboard network interfaces. One is Gigabit, suitable for connecting to the LAN, which at least partially runs at gigabit speed. The second is a Fast Ethernet interface, which is suitable for the WAN connection.

When Xen starts on the dom0 host (the one with the physical ethernet cards), the default behaviour is to take the default ethernet interface, eth0, and rename it to peth0 (the p standing for physical). Then it creates a virtual interface vif0, and creates a bridge consisting of peth0 and vif0, which it calls eth0. This is done for a few reasons:

  1. The user doesn’t see the default network interface name change.
  2. Any firewall rules that apply to eth0 do not affect the virtual machine.
  3. If the user is not terribly familiar with ethernet networking, bridging etc. it is a path of least resistance. Most people will not come unstuck unless they know what they are doing.

This article on the xen.org website has a lot of helpful information on how networking works with Xen. In fact, it seems to suggest that you can create just about any kind of virtual network within a set of xen virtual machines that you like. And it will just work. This gave me hope that what I was trying to do would work.

At this point, I feel I should point out that this is not a high performance configuration. If you are working with an enterprise, running a Class-A subnet from multiple gigabit internet connections, the number of layers and bottlenecks will almost certainly cause all your expensive bandwidth to ebb away. For a home or small business configuration with a fairly small 8 Megabit ADSL connection, the extra delays and processing effort are negligible. I would say that if you are trying to serve more than one broadcast domain (and cisco say a layer 2 broadcast domain should be smaller than 500 machines; I say probably smaller than that), then this solution isn’t for you.

Working from a minimal debian install, with plenty of available disk space, install the xen-linux-system package for your architecture. This pulls in the hypervisor, the relevant linux image and the xen-utils package. It is also worth getting the libc6-xen version of glibc, as it is optimised to work better on a xen system. In the file /etc/xen/xend-config.sxp, add the following line (somewhere around the network configuration part):

(network-script network-bridge)

Reboot, and your system should restart with the xen hypervisor lurking in the background.

First up, create a logical volume on your storage. I have a single volume group, generally called “system”. Perhaps I am odd. I put everything except /boot into a logical volume. It means you can almost completely forget about partitions, and using some LVM jiggery pokery, resize any partition if you find you made some bad calls early on.

Because this machine, which will be the firewall, is basically a very small debian install, I have allocated it 2 Gigabytes of disk space:

lvcreate -L2048M -n firewall system

This should be more than enough for the base install, and a few firewalling applications.

Building the virtual server.

A really handy guide to doing this is the debian xen wiki. The debian installer now supports installing into a virtual server, and it is pretty much as easy as installing onto a standalone machine.

You basically download an example xm-debian.cfg file, edit it a bit to identify block devices, networks, memory etc. and run it with the install options.

Of interest, the changes made to the .cfg file:

memory = 128
name = "firewall"
vcpus=1
vif = ['mac=00:16:3e:42:39:18, bridge=eth1', 'mac=00:16:3e:42:39:19, bridge=eth0']
disk = ['phy:/dev/mapper/system-firewall,xvda,w']

I also modified the bootloader line to have the full path to pygrub, otherwise the system doesn’t properly find pygrub, and it is impossible to boot your new instance.

bootloader="/usr/lib/xen-3.2-1/bin/pygrub"

Then, simply invoke the xm create command with the appropriate switches:

xm create -c xm-debian.cfg install=true install-mirror=http://ftp.uk.debian.org/debian install-extra="clocksource=jiffies \
  priority=low" install-suite=lenny

Of importance is the “clocksource” setting, which is fed to the kernel. Without this you may get the dreaded “clocksource/0: Time went backwards” error. Although the debian wiki says that you need to add some other bits, I didn’t find them necessary. In fact, independent_wallclock means that if you reboot the dom0, or pause your domU, it will come back with a slow system clock. Most unsatisfactory. If the error exceeds an hour, then ntp will not correct the clock. The clocksource jiffies setting seemed to solve the problem for me, and it can even be applied at run time.

Priority=low is the expert install mode. You may not wish to use this.

The standard debian installer starts, and you can configure your debian in the usual way.

If you are not familiar with installing debian, have a look at the reams of documentation available on the debian.org website.

If you use LVM in the domU, make sure that you use a different volume group name than that in use by the dom0 (I used fwsystem). This allows you to perform low level maintenance on the domU filesystem from dom0. Maybe more on how to do this in a future different post.

Make sure you install grub. Although you can’t actually boot from the virtual machine using grub, pygrub uses the menu.lst file that grub creates within the domU filesystem. This is elegant, as kernel updates can be applied within the virtual server, and not need manual intervention to copy kernel images and initial ram disk images onto dom0.

For this installation, you have, of course, to allow the firewall access to the interweb. This is obtained by simply configuring one of the interfaces and allowing it to retrieve the necessary install files.

In addition to the base install, I installed tshark, bridge-utils, ebtables and arptables. No other optional packages were installed. Once the installation has completed, the xen instance will stop. It can be restarted in normal mode with the following command:

xm create -c xm-debian.cfg

You can, of course, trim the file to suit your needs. There is no need for the install cruft that is in there to stay there. However it will work on a day to day basis.

You should see the boot process and end up with a console prompt.

Add the text “clocksource=jiffies” to the default kernel options in /boot/grub/menu.lst, and either reboot or:

echo "jiffies"> /sys/devices/system/clocksource/clocksource0/current_clocksource

Debian, by default, will start any virtual machines that were running when the dom0 was shutdown. It saves a state in /var/lib/xen/save (make sure your /var partition is big enough for all your virtual machine’s memory images) and will restart any images that it finds in there.

To ensure that the virtual machine starts every time the dom0 boots, even if the dom0 was shutdown ungracefully, copy or (more properly) link the .cfg file into /etc/xen/auto/
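The linking step might look something like the sketch below. link_autostart is a made-up helper name; the optional second argument is only there so it can be tried against a scratch directory instead of /etc/xen/auto.

```shell
# Symlink a domU config file into the Xen autostart directory.
# link_autostart is a hypothetical helper; the second argument overrides
# the autostart directory and is only useful for testing.
link_autostart() {
    cfg="$1"
    auto_dir="${2:-/etc/xen/auto}"
    ln -s "$cfg" "$auto_dir/$(basename "$cfg")"
}

# Example (run as root on the dom0):
#   link_autostart /etc/xen/xm-debian.cfg
```

A link rather than a copy means later edits to the .cfg file are picked up without having to remember the second copy.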

You can of course reboot the dom0 and have it back before many network connections to the domU instances are dropped. For example a top(1) process can be running in a terminal window, via a ssh session. The display output will pause while the dom0 is rebooted, and then resume exactly where it left off. Very cool.

Next time, part 3: Configuring a layer two internet firewall.

Obsessive monitoring

June 4, 2009 by taskme

A few years ago, a colleague introduced me to mrtg. Mrtg was originally designed to query SNMP routers to establish the bandwidth usage. With a few tweaks it can be configured to monitor anything that can be converted to a numeric value. Disk space, number of processes running, temperatures, voltages etc.

Mrtg has two major drawbacks:

  1. Vanilla mrtg can only monitor integer values
  2. It is designed to work with two values only – in and out.

cacti can do all that mrtg can, and much more. It is an absolute pig to configure, lots of non-intuitive settings, little logic, poor defaults. But it can make some nice graphs, with floating point values, and with colours, with a very high level of customisation.

With lm-sensors and apcupsd, not only can you monitor your network, but you can also monitor voltages, temperatures and much more.

Cacti can be extended by using your own scripts. If you can write a script for it, you can monitor it.

Eg:

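#!/usr/bin/perl
# Display selected UPS values from apcupsd as "NAME:value" pairs for cacti.
use strict;
use warnings;

# Fields to report, named as they appear in the output of "apcaccess status".
my @collect = ("LINEV","LOADPCT","BCHARGE","TIMELEFT","MAXLINEV","MINLINEV",
               "OUTPUTV","ITEMP","BATTV","LINEFREQ","LOTRANS","HITRANS");

foreach my $row (`/sbin/apcaccess status`) {
  # Each row looks like "LINEV    : 240.5 Volts".
  my ($line, $value) = split(/:/, $row, 2);
  next unless defined $value;
  chomp($value);
  foreach my $val (@collect) {
    if (index($line, $val) == 0) {
      # The value starts with a space, so element 1 is the number itself.
      my @number = split / +/, $value;
      print "$val:$number[1] ";
    }
  }
}
print "\n";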

Typical output is:

LINEV:240.5 LOADPCT:18.7 BCHARGE:100.0 TIMELEFT:73.0 MAXLINEV:241.8 MINLINEV:239.2 OUTPUTV:240.5
   LOTRANS:196.0 HITRANS:253.0 ITEMP:37.8 BATTV:55.6 LINEFREQ:50.0

which is basically a load of name value pairs. Cacti calls this script every five minutes, and extracts the values from the string, and stores them in a round-robin database.
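
To make the format concrete, here is a rough Python sketch of how a line of that output can be split back into named values. It is not how cacti does the job internally; the function name parse_pairs is made up for the illustration, and the sample line is a shortened copy of the output above.

```python
def parse_pairs(line):
    """Split 'NAME:value NAME:value ...' into a dict of floats."""
    values = {}
    for field in line.split():
        # Everything before the first colon is the name, the rest the value.
        name, _, raw = field.partition(":")
        values[name] = float(raw)
    return values


sample = "LINEV:240.5 LOADPCT:18.7 BCHARGE:100.0 TIMELEFT:73.0"
readings = parse_pairs(sample)
print(readings["LINEV"])  # prints 240.5
```

Because each field carries its own name, the order of the pairs on the line does not matter to whatever reads them.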

The different graphs are generally interesting over different time periods. Temperatures show daily and yearly fluctuations. Network usage seems pretty much random, although you can identify big downloads months afterwards.

Some interesting graphs are included below…

For example, free space on my /home drive:

home space usage

As you can see, I keep my home space small. This allows me to back it up relatively easily, although it is quite difficult to keep it so low. About a month ago, I gave up and added the rest of the available space on my MD RAID partition. It seems to bob along at about 5 gigabytes free, just enough space to download a knoppix DVD image.

Another interesting one is the mains voltage.

Mains voltage over two years

Daily fluctuations in mains

The first graph is moderately interesting because of the step at the end of January. The second is less interesting, but you can see how much the voltage varies in a day.

For a few years, my UPS kept tripping out in the winter with “over voltage”. After some research I discovered that the electricity board has to provide electricity at 230v +10% or -6%. The over-voltage switch-over point for my UPS was exactly the point at which the incoming electricity became illegally high: 253 volts (the top red line on the graph).
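
The 253 volt figure follows directly from the tolerance band. A small Python calculation, using only the 230v nominal value and the +10% / -6% limits quoted above (the variable names are mine):

```python
NOMINAL_VOLTS = 230.0
UPPER_TOLERANCE = 0.10  # +10%
LOWER_TOLERANCE = 0.06  # -6%

upper_limit = NOMINAL_VOLTS * (1 + UPPER_TOLERANCE)
lower_limit = NOMINAL_VOLTS * (1 - LOWER_TOLERANCE)

print(f"upper limit: {upper_limit:.1f} V")  # upper limit: 253.0 V
print(f"lower limit: {lower_limit:.1f} V")  # lower limit: 216.2 V
```

The upper limit matches the HITRANS value of 253.0 that the UPS reports in the script output earlier in this post.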

It may have been that my UPS was a bit over sensitive. The graphs only show average voltage for the time, not peak voltage, which is why they don’t appear to cross the upper red line limit.
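
A toy Python example shows how an average can stay under the limit while a single reading goes over it. The five readings are invented for the illustration, and I am assuming that several readings end up averaged into one plotted point; they are not real measurements from my UPS.

```python
from statistics import mean

UPPER_LIMIT = 253.0  # volts, the upper red line

# Hypothetical readings that get averaged into one point on the graph.
samples = [249.0, 250.5, 254.2, 251.0, 249.8]

average = mean(samples)
peak = max(samples)

print(f"average {average:.1f} V, over limit: {average > UPPER_LIMIT}")
print(f"peak    {peak:.1f} V, over limit: {peak > UPPER_LIMIT}")
```

Here the average works out at about 250.9 V, comfortably under the line, even though one reading of 254.2 V would have been enough to trip the UPS.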

Although the UPS was protecting my IT hardware, and some other bits and bobs, I felt that the constant tripping of my UPS would be reducing its life, as it was going onto battery several times a day. Any equipment not protected by the UPS would also be vulnerable to over voltages, so I contacted the electricity company. They installed a line monitor for a week, and confirmed that the voltage had gone over 252v twice in that period. So my UPS, despite complaining 4-6 times a day, did have a valid reason for complaint.

The downward step was caused when the electricity company moved the supply tap one loop on the sub-station transformer, thus reducing the local voltage. Although I thought electricity in the UK had to be supplied at 230v, most places, it seems, are still configured to run at the traditional 240v, despite the official change being made over 15 years ago.

As you can see, the voltage dropped by nearly 8 volts, and my UPS stopped complaining, so a positive result, and proof that you can get things changed for the better. Complaining works!

From my CPU fan speed monitor, you can see when I clean out the case.

Fan speed over two years

In December, a year and a half ago, the fan was so choked up that it was starting to fail. I was expecting to have to replace it, but the clean out revitalised its fortunes. Next time, the following October, you can see a step as the fan turned more easily with less dust in it. I try to clean out the machine at least once a year.

The final graph is the 12 v graph. It is quite boringly flat. I find it impressive that a power supply that must be over 8 years old, in use 24/7, has managed to supply such a consistent voltage for at least 2 years.

Constantly boring

New home server: Part 1 hardware

June 4, 2009 by taskme

As a true techy, I find that the best way to manage my IT is to have a home server. This is a machine that I leave on 24/7, which provides a central place for emails and files that I can use on my desktop, laptop, other desktop, media PC and from work (if necessary).

I have had a home server since before 1998. This particular one has survived, more or less unmodified, since 2001, when it replaced a Pentium 90. My home servers always run a version of Debian, and this one has been upgraded each time a new stable release arrived: Woody, Sarge, Etch. I expect the root file system is a very old version of ext3. The hardware was pretty much End of Life when it was bought, so I think I got my money’s worth. Its main responsibilities are:

  1. Internet firewall and transparent squid proxy.
  2. DHCP, DNS services for the LAN.
  3. SMTP/IMAP email – IMAP is great because you get the same view of your email from any machine running any operating system.
  4. NFS and SMB for file serving. Again, it is really useful that you can access the same files from any machine and any operating system.
  5. Apache for miscellaneous files at work, and the fantastic squirrelmail.
  6. Recently, openvpn, for secure access from my laptop.
  7. Any generic shell/processing that needs to be done.

As it has seemed to get slower and slower, I have decided to upgrade it. Retrofitting LVM and additional memory has, to a certain extent, extended its life, but certain things still run very slowly on it, particularly virus/spam scanning. Its memory is of the most obscure kind, RIMM, and it is beginning to seem short of it. Backing RIMM was intel’s biggest gaffe of the late 1990s.

Specification:

  1. Early 1.4 GHz P4 (PGA423)
  2. Asus P4T motherboard
  3. Memory – 2 x 64Mbyte RIMM plus 2 x 256Mbyte RIMM (total 640 Mbytes)
  4. Disks – 1 x 160Gbyte IDE, 1 x 160Gbyte SATA, 2 x 250 Gbyte SATA (SATA drives on an IDE controller)
  5. Video Nvidia TNT 2 with S-Video output, particularly useful as it means I can plug it into the TV rather than dragging a monitor around.

With my budget of £0 I had to rely on alternative means to obtain an updated system. As a member of a local linux user group I recently acquired a couple of old servers. These are 1U supermicro Xeon based systems, about 5 years old. Not the latest and greatest but certainly a reasonable upgrade from before. Specification:

  1. 2 x 2.4 GHz hyperthreading Xeons
  2. Supermicro P4DPL motherboard
  3. Memory 6 x 512Mbyte ECC SDRAM – total 3GBytes (wow, so much memory)
  4. Drives 180 GByte IDE (but will also get the current server’s drives)
  5. Onboard RAGE XL Video, not so convenient as a card with a S-Video output. But with monitors being smaller and easier to lug around, I think I can live with it.

Unfortunately, being 1U and without dynamic cooling management, these boys are loud. Think along the lines of a 747 cranking up. Although they would be right at home in a server room, they are not well suited to home use. First job, therefore: find a quieter solution.

This means an alternative case and replacement CPU coolers. (The stock ones are simple finned heat sinks and centrifugal fans that look a bit like a snail. They are stuck in front of the CPUs, separate from the motherboard, to keep the height within the confines of a 1U case, but they are noisy!)

Being servers, the motherboards are E-ATX, and are nearly 13″ wide. This is an issue for most midi cases, as the CPU coolers would not fit behind the drive bays. However, another member of my local linux group was giving away an old full tower case, also supermicro, which had enough clearance to fit the motherboard in. And a quick visit to ebay found me some PGA603 coolers for £10 each. OK, so that is £20 over budget, but a small sacrifice.

A spare 450W ATX PSU and we’re in business. And considerably quieter to boot.

Next time: Part 2, XEN and the art of virtualization.