Port-xen archive


Re: maintaining many Xen VMs



Steven M. Bellovin wrote:
I recently brought up 3 DomUs; the result, of course, is that I now have 3
more machines to administer.  This is known as a bad tradeoff...  I'm
curious, though, how other people are solving this.

My (NetBSD) DomUs are going to be mostly identical.  I was thinking of
having a shared, read-only /usr and separate /var.  I probably need
separate roots, if only to have separate /etc/rc.conf files.  /usr will be
a real partition, probably shared with the Dom0.  The Dom0 would also have
a separate partition that held the vnds for the DomUs.

The problem is updating pkgsrc -- compilations on the Dom0 (with the DomUs
shut down) would be slow, since I'm not allocating much physical memory to
Dom0.

Anyway -- how are other people handling this?  I thought about NFS, but I
suspect it's too slow.

I use disk images mounted on vnd devices for my domUs. My domUs are mostly the same, so what I do is:

   1) Have one read-only root filesystem image that has a full NetBSD
      installation (about 400 MB).

   2) Create a separate disk image for the /usr/pkg filesystem per domU.

   3) Create a separate disk image for the /local filesystem per domU.
      These vary in size depending on the domU and provide local
      storage space.

   4) Null-mount /local/etc and /local/var over /etc and /var during
      startup.

   5) Use MFS for the /dev mount.
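
As a rough sketch of steps 2 and 3, the per-domU image files could be
created along these lines.  The function names, the directory layout
and the sizes are assumptions made for illustration; only the pkg.img
and local.img names come from this message.

```sh
#!/bin/sh
# Sketch only: create empty per-domU image files for /usr/pkg and /local.
# The layout DIR/NAME/{pkg,local}.img is hypothetical.

# make_image FILE SIZE_MB
# Create a zero-filled image of SIZE_MB megabytes, refusing to
# overwrite an existing file.
make_image() {
        if [ -e "$1" ]; then
                echo 1>&2 "make_image: $1 already exists"
                return 1
        fi
        dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null
}

# make_domu_images DIR NAME PKG_MB LOCAL_MB
# Create DIR/NAME/pkg.img and DIR/NAME/local.img.
make_domu_images() {
        mkdir -p "$1/$2" &&
        make_image "$1/$2/pkg.img" "$3" &&
        make_image "$1/$2/local.img" "$4"
}
```

Each image still needs a disklabel and a filesystem before a domU can
use it, for example with vnconfig(8), disklabel(8) and newfs(8) on the
dom0.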

This lets you:

   * Update the base install of NetBSD across all your domUs at once by
     swapping in a new root.img.  This is mostly only useful if you're
     planning on tracking a minor or teeny release, not a major upgrade.

   * Swap out the pkg.img with one with newer packages on it, and
     quickly swap back when stuff breaks.

   * Creatively make /local into a cgd filesystem, so only the local
     data is encrypted while the packages and root filesystem aren't.
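
The swap-and-swap-back idea can be sketched as a pair of small helpers.
This is only an illustration: the function names and the ".prev" and
".bad" suffixes are assumptions, and the domU using the image is
assumed to be shut down while the files are moved.

```sh
#!/bin/sh
# Sketch only: swap a disk image for a newer one while keeping the
# previous image around so that the change can be undone.

# swap_image CURRENT NEW
# Keep CURRENT as CURRENT.prev and move NEW into its place.
swap_image() {
        if [ ! -f "$1" ] || [ ! -f "$2" ]; then
                echo 1>&2 "swap_image: missing image file"
                return 1
        fi
        mv -f "$1" "$1.prev" && mv -f "$2" "$1"
}

# rollback_image CURRENT
# Set the broken image aside as CURRENT.bad and restore CURRENT.prev.
rollback_image() {
        if [ ! -f "$1.prev" ]; then
                echo 1>&2 "rollback_image: no previous image for $1"
                return 1
        fi
        mv -f "$1" "$1.bad" && mv -f "$1.prev" "$1"
}
```

For example, "swap_image pkg.img pkg-new.img" followed later by
"rollback_image pkg.img" leaves the old packages back in place and the
newer image saved as pkg.img.bad for inspection.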

There are some modifications that need to be made to /etc in the root.img file and also to /local/etc in the local.img files. I've attached some quick-and-dirty scripts that I use to set up my domUs. I offer them without much explanation, so you'll have to read through them to figure out what's going on.
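
One of those modifications, taken from the first attached script, is
hooking /etc/rc so that it sources /etc/rc.pre-hooks before anything
else.  The filter below isolates that step so it can be tried on a
sample file; the add_pre_hook name is made up for this sketch.

```sh
#!/bin/sh
# Sketch only: read an rc script on stdin and write it to stdout with a
# line sourcing /etc/rc.pre-hooks inserted just before the line that
# sources /etc/rc.subr.  All other lines pass through unchanged.
add_pre_hook() {
        awk '/^\. \/etc\/rc\.subr$/ { print ". /etc/rc.pre-hooks" }
             { print }'
}
```

Running "add_pre_hook < rc.orig > rc" on a copy of /etc/rc shows the
effect without touching a real image.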

        Cheers,

        -- Johnny Lam <jlam%pkgsrc.org@localhost>
#!/bin/sh

block_file=/usr/pkg/etc/xen/block-file
img=
kernel=netbsd-XENU
tmpdir=/tmp/$$

# Parse options
while [ $# -gt 0 ]; do
        case "$1" in
        -k)     kernel=$2; shift 2 ;;
        -*)     echo 1>&2 "$0: unknown option \`\`$1''"; exit 1 ;;
        *)      break ;;
        esac
done

if [ $# -lt 1 ]; then
        echo 1>&2 "$0: missing image file"
        exit 1
fi

img="$1"

###
### Check for necessary files before starting.
###
if [ ! -x "$block_file" ]; then
        echo 1>&2 "$0: $block_file cannot be executed"
        exit 1
fi
if [ ! -f "$img" ]; then
        echo 1>&2 "$0: missing $img";
        exit 1
fi
if [ ! -f "$kernel" ]; then
        echo 1>&2 "$0: missing $kernel";
        exit 1
fi

###
### Mount root.img onto $tmpdir/mnt for manipulation.
###
echo "Mounting $img"
dev=`$block_file bind $img`
vnd="${dev#/dev/}"; vnd="${vnd%[a-z]}"
mnt=$tmpdir/mnt
mkdir -p $mnt
mount /dev/${vnd}a $mnt

###
### Put the XENU kernel into place.
###
echo "Copying XENU kernel into $img"
kname="${kernel##*/}"           # $kernel may be given as a path
rm -f $mnt/netbsd
cp -f $kernel $mnt/$kname
case "$kname" in
netbsd) ;;
*)      ln -f $mnt/$kname $mnt/netbsd ;;
esac

###
### Prepare for MFS /dev.  Modify the "init" target of the MAKEDEV script
### to create a few more virtual disk devices and also the power devices.
###
echo "Preparing MFS /dev"
makedev=$mnt/dev/MAKEDEV
if [ -f $makedev ]; then
        rm -rf $mnt/dev/[a-z]*
        [ -f $makedev.orig ] || mv -f $makedev $makedev.orig
        awk '/makedev xbd0 xbd1 xencons/ {
                sub("xencons", "xbd2 xbd3 xbd4 xencons");
                print $0;
                print " makedev sysmon";
                print " makedev clockctl";
                print " makedev ipl pf crypto systrace";
                print " makedev tun0 tun1 tun2 tun3";
                print " makedev tap tap0 tap1 tap2 tap3";
                print " makedev kttcp";
                next;
             }
             /makedev st0/ {
                print " makedev vnd0"
                next
             }
             /makedev iop0/ { next }
             /makedev ed0/ { next }
             /makedev ld0/ { next }
             { print }' \
                $makedev.orig > $makedev
        [ ! -x $makedev.orig ] || chmod +x $makedev
fi

###
### Create directories on which we intend to mount additional filesystems.
###     emul    emulation shadow directories
###     local   filesystem mounted on cgd(4) device     
###     usr/pkg mount point for pkgsrc-installed software
###
echo "Populating filesystem directories"
( cd $mnt && mkdir -p emul local usr/pkg )

###
### Add hook to mount /local/etc onto /etc so that our system-specific
### configuration is used.  We insert the hook at the start of the rc
### script just before we source any other files.
###
echo "Creating local rc hook"
cat > $mnt/etc/rc.pre-hooks << 'EOF'
# rc.pre-hooks

# Mount /local to get the local configuration data.  We must fsck the
# partition beforehand to ensure that it's clean because we won't be
# able to fix any problems later on after other filesystems are mounted.
#
/sbin/fsck -p /dev/rxbd2a
case $? in
0)      ;;
*)      if [ "$1" = autoboot ]; then
                kill -TERM $$
        fi
        exit 1
        ;;
esac
if ! mount -t ffs /dev/xbd2a /local; then
        echo "Unable to mount /local.  Multiuser boot aborted."
        exit 1
fi

# Mount /etc to get the real configuration files.
if ! mount -t null /local/etc /etc; then
        echo "Unable to mount /etc.  Multiuser boot aborted."
        exit 1
fi

# Re-exec /etc/rc so that we use the correct configuration information,
# e.g., /etc/rc.conf settings, etc.  Pass the original arguments along
# so that "autoboot" is not lost.
#
exec /bin/sh $0 "$@"
EOF

rc=$mnt/etc/rc
if [ -f $rc ] && ! grep -q "/etc/rc.pre-hooks" $rc; then
        mv -f $rc $rc.orig
        awk '/^\. \/etc\/rc\.subr$/ {
                print ". /etc/rc.pre-hooks"
             }
             { print }' \
                $rc.orig > $rc
        [ ! -x $rc.orig ] || chmod +x $rc
fi

###
### Cleanup
###
echo "Unmounting $img"
umount $mnt
rmdir $mnt
rmdir $tmpdir
$block_file unbind $dev
#!/bin/sh

block_file=/usr/pkg/etc/xen/block-file
block_cgd=/usr/pkg/etc/xen/block-cgd
img_root=
img_local=
keyfile=
tmpdir=/tmp/$$

# Parse options
while [ $# -gt 0 ]; do
        case "$1" in
        -c)     keyfile="$2"; shift 2 ;;
        -*)     echo 1>&2 "$0: unknown option \`\`$1''"; exit 1 ;;
        *)      break ;;
        esac
done

if [ $# -lt 2 ]; then
        echo 1>&2 "$0: missing image files"
        exit 1
fi

img_root="$1"
img_local="$2"

###
### Check for necessary files before starting.
###
if [ ! -x "$block_file" ]; then
        echo 1>&2 "$0: $block_file cannot be executed"
        exit 1
fi
if [ -n "$keyfile" -a ! -x "$block_cgd" ]; then
        echo 1>&2 "$0: $block_cgd cannot be executed"
        exit 1
fi

###
### Check that the image files exist.
###
if [ ! -f "$img_root" ]; then
        echo 1>&2 "$0: missing $img_root";
        exit 1
fi
if [ ! -f "$img_local" ]; then
        echo 1>&2 "$0: missing $img_local";
        exit 1
fi

###
### Mount root.img read-only onto $tmpdir/mnt.
###
echo "Mounting $img_root"
dev_root=`$block_file bind $img_root`
vnd_root="${dev_root#/dev/}"; vnd_root="${vnd_root%[a-z]}"
mnt_root=$tmpdir/mnt
mkdir -p $mnt_root
mount -r /dev/${vnd_root}a $mnt_root

###
### Mount local image onto $tmpdir/local.
###
echo "Mounting $img_local"
dev_local=`$block_file bind $img_local`
vnd_local="${dev_local#/dev/}"; vnd_local="${vnd_local%[a-z]}"
if [ -n "$keyfile" ]; then
        dev_cgd=`$block_cgd bind /dev/${vnd_local}a $keyfile`
        cgd="${dev_cgd#/dev/}"; cgd="${cgd%[a-z]}"
fi

mnt_local=$tmpdir/local
mkdir -p $mnt_local
if [ -n "$keyfile" ]; then
        mount /dev/${cgd}a $mnt_local
else
        mount /dev/${vnd_local}a $mnt_local
fi

###
### Copy "etc", "tmp" and "var" into place.
###
echo "Copying mutable directories from $img_root to $img_local"
( cd $mnt_root && pax -rwpe etc tmp var $mnt_local/. )

###
### Restore the original /etc/rc script.
###
echo "Restoring vanilla rc script"
if [ -f $mnt_local/etc/rc.orig ]; then
        rm -f $mnt_local/etc/rc.pre-hooks
        mv -f $mnt_local/etc/rc.orig $mnt_local/etc/rc
fi

###
### Add configuration bits to re-mount /local as read-write from
### /etc/rc.d/root.
###
echo "Add bit to re-mount /local as read-write"
cat > $mnt_local/etc/rc.conf.d/root << 'EOF'
start_postcmd="root_poststart"

root_poststart()
{
        # Re-mount /local as a read-write filesystem.
        mount -uw /local
}
EOF

###
### Add additional filesystems into /local/etc/fstab.
###
echo "Add our filesystems to /etc/fstab"
fstab=$mnt_local/etc/fstab
[ -f $fstab.orig ] || mv -f $fstab $fstab.orig
cat > $fstab << 'EOF'
/dev/xbd0a / ffs rw 1 1
/dev/cgd0b none swap sw 0 0
/dev/xbd2a /local ffs rw 0 0
/dev/xbd3a /usr/pkg ffs rw 1 2
/local/etc /etc null rw
/local/tmp /tmp null rw
/local/var /var null rw
kernfs /kern kernfs rw
procfs /proc procfs rw,noauto
EOF

###
### Edit the vanilla rc.conf script to add the extra bits for the domU setup.
###
echo "Modify rc.conf for domU setup"
rc_conf=$mnt_local/etc/rc.conf
if [ -f $rc_conf ] && ! grep -q "critical_filesystems_local" $rc_conf; then
        mv -f $rc_conf $rc_conf.orig
        awk '/^wscons=/ {
                print "critical_filesystems_local=\"/local /etc /tmp /var\"";
                print "powerd=YES";
                print "savecore=NO";
                print "sendmail=NO";
                print "wscons=NO";
                next;
             }
             { print }' \
                $rc_conf.orig > $rc_conf
fi

###
### Set up encrypted swap.
###
echo "Set up encrypted swap for the domU"
cat > $mnt_local/etc/cgd/cgd.conf << 'EOF'
cgd0    /dev/xbd1a
EOF

cat > $mnt_local/etc/cgd/xbd1a << 'EOF'
algorithm blowfish-cbc;
iv-method encblkno;
keylength 128;
verify_method none;
keygen urandomkey;
EOF

cat > $mnt_local/etc/rc.conf.d/cgd << 'EOF'
swap_device="cgd0"
swap_disklabel="/etc/cgd/xbd1a.disklabel"
start_postcmd="cgd_swap"

cgd_swap()
{
        # Convert this dedicated swap device to contain one big swap
        # partition.
        #
        disklabel $swap_device 2>/dev/null |
        while read line; do
                case "$line" in
                d:*)    bline=" b:${line#d:}"
                        bline="${bline%%unused*}  swap${bline##*unused}"
                        echo "$bline"
                        echo "$line"
                        ;;
                [a-z]:*) ;;
                *)      echo "$line"
                        ;;
                esac
        done > $swap_disklabel 
        if ! disklabel -R -r $swap_device $swap_disklabel 2>/dev/null; then
                echo 1>&2 "Could not write $swap_disklabel to $swap_device"
        fi
}
EOF

###
### Only turn on the console tty.
###
echo "Disable non-console ttys"
ttys=$mnt_local/etc/ttys
[ -f $ttys.orig ] || mv -f $ttys $ttys.orig
sed -e "/^tty/{s,on secure,off secure,;}" $ttys.orig > $ttys

###
### Disable the daily and weekly checks in root's crontab.
###
echo "Disable daily and weekly checks in root's crontab"
crontab=$mnt_local/var/cron/tabs/root
if [ -f $crontab ]; then
        # Comment out any active line that runs /etc/daily or /etc/weekly.
        awk '/^[^#].*\/etc\/(daily|weekly)/ { print "#" $0; next } { print }' \
                $crontab > $crontab.new
        mv -f $crontab.new $crontab
fi

###
### Cleanup
###
echo "Unmounting $img_root and $img_local"
umount $mnt_root
umount $mnt_local
rmdir $mnt_root
rmdir $mnt_local
rmdir $tmpdir
$block_file unbind $dev_root
[ -z "$keyfile" ] || $block_cgd unbind $dev_cgd
$block_file unbind $dev_local
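
The least obvious part of the second script is probably the cgd_swap
hook, which rewrites the disklabel of the swap device.  The filter
below is the same loop pulled out so that it can be fed sample
disklabel(8) output; the label_to_swap name is an assumption, and like
the script it assumes that "d" is the whole-disk partition and is
marked "unused".

```sh
#!/bin/sh
# Sketch only: read disklabel output on stdin and write a label in which
# the other partitions are dropped and a "b" swap partition covering the
# same range as "d" is added.
label_to_swap() {
        while read -r line; do
                case "$line" in
                d:*)    # Clone the d: entry as b: and retype it as swap.
                        bline=" b:${line#d:}"
                        bline="${bline%%unused*}  swap${bline##*unused}"
                        echo "$bline"
                        echo "$line"
                        ;;
                [a-z]:*) ;;             # drop every other partition
                *)      echo "$line" ;; # keep header lines as they are
                esac
        done
}
```

Piping "disklabel cgd0" through label_to_swap on a test system shows
the label that the hook would write back with "disklabel -R -r".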

