Subject: Re: maintaining many Xen VMs
To: Steven M. Bellovin <smb@cs.columbia.edu>
From: Johnny Lam <jlam@pkgsrc.org>
List: port-xen
Date: 10/23/2006 12:01:53
--------------050006080904080207080905
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Steven M. Bellovin wrote:
> I recently brought up 3 DomUs; the result, of course, is that I now have 3
> more machines to administer. This is known as a bad tradeoff... I'm
> curious, though, how other people are solving this.
>
> My (NetBSD) DomUs are going to be mostly identical. I was thinking of
> having a shared, read-only /usr and separate /var. I probably need
> separate roots, if only to have separate /etc/rc.conf files. /usr will be
> a real partition, probably shared with the Dom0. The Dom0 would also have
> a separate partition that held the vnds for the DomUs.
>
> The problem is updating pkgsrc -- compilations on the Dom0 (with the DomUs
> shut down) would be slow, since I'm not allocating much physical memory to
> Dom0.
>
> Anyway -- how are other people handling this? I thought about NFS, but I
> suspect it's too slow.
I use disk images mounted on vnd devices for my domUs. My domUs are
mostly the same, so what I do is:
1) Have one read-only root filesystem image that holds a full NetBSD
installation (about 400 MB).
2) Create a separate disk image for the /usr/pkg filesystem per domU.
3) Create a separate disk image for the /local filesystem per domU.
These vary in size depending on the domU and provide local
storage space.
4) Null-mount /local/etc, /local/tmp and /local/var over /etc, /tmp and
/var during startup.
5) Use MFS for the /dev mount.
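As a rough sketch of how this layout maps onto a domU's disk list: the
attached scripts expect the root image on xbd0, a swap image on xbd1,
/local on xbd2 and /usr/pkg on xbd3. The helper name domu_disks, the
image file names, the /xen directory and the 0x1..0x4 device ids below
are assumptions for illustration, as is the premise that the domU
numbers its xbd devices in the order they are listed.

```shell
#!/bin/sh
# Hypothetical helper: print a Xen "disk" stanza for one domU.
# Assumed layout (file names are illustrative, not taken from the scripts):
#   DIR/root.img        shared root image, exported read-only -> xbd0
#   DIR/NAME-swap.img   per-domU swap                         -> xbd1
#   DIR/NAME-local.img  per-domU /local                       -> xbd2
#   DIR/NAME-pkg.img    per-domU /usr/pkg                     -> xbd3
domu_disks()
{
	name=$1
	dir=$2
	printf "disk = [ 'file:%s/root.img,0x1,r',\n" "$dir"
	printf "         'file:%s/%s-swap.img,0x2,w',\n" "$dir" "$name"
	printf "         'file:%s/%s-local.img,0x3,w',\n" "$dir" "$name"
	printf "         'file:%s/%s-pkg.img,0x4,w' ]\n" "$dir" "$name"
}

# Example: the stanza for a domU called "www" with images under /xen.
domu_disks www /xen
```

Mode r for the shared root follows step 1. The fstab written by
setup_local lists / as rw, so one of the two may need adjusting for
your setup.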
This lets you:
* Update the base install of NetBSD across all your domUs at once by
swapping in a new root.img. This is mostly useful for tracking a minor
or teeny release, not for a major upgrade.
* Swap out the pkg.img with one with newer packages on it, and
quickly swap back when stuff breaks.
* Creatively make /local into a cgd filesystem, so only the local
data is encrypted while the packages and root filesystem aren't.
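One possible way to make the swap-and-swap-back cheap is to keep dated
image files and let the name in the domU config be a symlink. This is
only a sketch under that assumption: the function names swap_image and
rollback_image, the ".prev" bookkeeping file and the dated file names
are all made up for illustration, and the domU is assumed to be shut
down while the link is changed.

```shell
#!/bin/sh
# swap_image LINK NEW_IMAGE
# Point LINK (the path named in the domU config) at NEW_IMAGE and record
# the previous target in LINK.prev. Use absolute paths for NEW_IMAGE.
swap_image()
{
	link=$1
	new=$2
	if [ ! -f "$new" ]; then
		echo 1>&2 "swap_image: missing $new"
		return 1
	fi
	if [ -L "$link" ]; then
		readlink "$link" > "$link.prev"
	fi
	ln -sf "$new" "$link"
}

# rollback_image LINK
# Point LINK back at whatever it referred to before the last swap.
rollback_image()
{
	link=$1
	if [ ! -s "$link.prev" ]; then
		echo 1>&2 "rollback_image: nothing recorded for $link"
		return 1
	fi
	prev=`cat "$link.prev"`
	swap_image "$link" "$prev"
}
```

Usage would look like "swap_image /xen/www-pkg.img /xen/www-pkg-new.img",
followed by "rollback_image /xen/www-pkg.img" when stuff breaks.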
There are some modifications that need to be made to /etc in the
root.img file and also to /local/etc in the local.img files. I've
attached some quick-and-dirty scripts that I use to set up my domUs. I
offer them without much explanation, so you'll have to read through them
to figure out what's going on.
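Going by the option parsing in the two scripts, setup_root takes an
optional "-k kernel" plus the root image, and setup_local takes an
optional "-c keyfile" plus the root and local images. setup_local
copies etc, tmp and var out of the prepared root image and restores
rc.orig, so setup_root has to run first. A dry-run sketch of that
order follows; the wrapper names run and setup_all, the DRY_RUN
variable and the NAME-local.img naming are assumptions.

```shell
#!/bin/sh
# Print each command; only execute it when DRY_RUN is empty.
DRY_RUN=${DRY_RUN-yes}
run()
{
	echo "+ $*"
	[ -n "$DRY_RUN" ] || "$@"
}

# setup_all ROOT_IMAGE NAME...
# Prepare the shared root image once, then one local image per domU.
setup_all()
{
	root=$1
	shift
	run ./setup_root "$root"
	for name in "$@"; do
		run ./setup_local "$root" "$name-local.img"
	done
}

# Example: show what would be run for two domUs.
setup_all root.img www mail
```

Running it for real (DRY_RUN set to the empty string) needs root on the
Dom0, since the scripts bind vnd devices and mount the images.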
Cheers,
-- Johnny Lam <jlam@pkgsrc.org>
--------------050006080904080207080905
Content-Type: text/plain;
name="setup_root"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="setup_root"
#!/bin/sh
block_file=/usr/pkg/etc/xen/block-file
img=
kernel=netbsd-XENU
tmpdir=/tmp/$$
# Parse options
while [ $# -gt 0 ]; do
case "$1" in
-k) kernel=$2; shift 2 ;;
-*) echo 1>&2 "$0: unknown option \`\`$1''"; exit 1 ;;
*) break ;;
esac
done
if [ $# -lt 1 ]; then
echo 1>&2 "$0: missing image file"
exit 1
fi
img="$1"
###
### Check for necessary files before starting.
###
if [ ! -x "$block_file" ]; then
echo 1>&2 "$0: $block_file cannot be executed"
exit 1
fi
if [ ! -f "$img" ]; then
echo 1>&2 "$0: missing $img";
exit 1
fi
if [ ! -f "$kernel" ]; then
echo 1>&2 "$0: missing $kernel";
exit 1
fi
###
### Mount root.img onto $tmpdir/mnt for manipulation.
###
echo "Mounting $img"
dev=`$block_file bind $img`
vnd="${dev#/dev/}"; vnd="${vnd%[a-z]}"
mnt=$tmpdir/mnt
mkdir -p $mnt
mount /dev/${vnd}a $mnt
###
### Put the XENU kernel into place.
###
echo "Copying XENU kernel into $img"
rm -f $mnt/netbsd
cp -f "$kernel" $mnt
# Use the basename so that -k also accepts a path to the kernel.
kname=`basename "$kernel"`
case "$kname" in
netbsd) ;;
*) ln -f $mnt/$kname $mnt/netbsd ;;
esac
###
### Prepare for MFS /dev. Modify the "init" target of the MAKEDEV script
### to create a few more virtual disk devices and also the power devices.
###
echo "Preparing MFS /dev"
makedev=$mnt/dev/MAKEDEV
if [ -f $makedev ]; then
rm -rf $mnt/dev/[a-z]*
[ -f $makedev.orig ] || mv -f $makedev $makedev.orig
awk '/makedev xbd0 xbd1 xencons/ {
sub("xencons", "xbd2 xbd3 xbd4 xencons");
print $0;
print " makedev sysmon";
print " makedev clockctl";
print " makedev ipl pf crypto systrace";
print " makedev tun0 tun1 tun2 tun3";
print " makedev tap tap0 tap1 tap2 tap3";
print " makedev kttcp";
next;
}
/makedev st0/ {
print " makedev vnd0"
next
}
/makedev iop0/ { next }
/makedev ed0/ { next }
/makedev ld0/ { next }
{ print }' \
$makedev.orig > $makedev
[ ! -x $makedev.orig ] || chmod +x $makedev
fi
###
### Create directories on which we intend to mount additional filesystems.
### emul emulation shadow directories
### local filesystem mounted on cgd(4) device
### usr/pkg mount point for pkgsrc-installed software
###
echo "Populating filesystem directories"
( cd $mnt && mkdir -p emul local usr/pkg )
###
### Add hook to mount /local/etc onto /etc so that our system-specific
### configuration is used. We insert the hook at the start of the rc
### script just before we source any other files.
###
echo "Creating local rc hook"
cat > $mnt/etc/rc.pre-hooks << 'EOF'
# rc.pre-hooks
# Mount /local to get the local configuration data. We must fsck the
# partition beforehand to ensure that it's clean because we won't be
# able to fix any problems later on after other filesystems are mounted.
#
/sbin/fsck -p /dev/rxbd2a
case $? in
0) ;;
*) if [ "$1" = autoboot ]; then
kill -TERM $$
fi
exit 1
;;
esac
if ! mount -t ffs /dev/xbd2a /local; then
echo "Unable to mount /local. Multiuser boot aborted."
exit 1
fi
# Mount /etc to get the real configuration files.
if ! mount -t null /local/etc /etc; then
echo "Unable to mount /etc. Multiuser boot aborted."
exit 1
fi
# Re-exec /etc/rc so that we use the correct configuration information,
# e.g., /etc/rc.conf settings. Pass the original arguments along so that
# "autoboot" is not lost.
#
exec /bin/sh $0 "$@"
EOF
rc=$mnt/etc/rc
if [ -f $rc ] && ! grep -q "/etc/rc.pre-hooks" $rc; then
mv -f $rc $rc.orig
awk '/^\. \/etc\/rc.subr$/ {
print ". /etc/rc.pre-hooks"
}
{ print }' \
$rc.orig > $rc
[ ! -x $rc.orig ] || chmod +x $rc
fi
###
### Cleanup
###
echo "Unmounting $img"
umount $mnt
rmdir $mnt
rmdir $tmpdir
$block_file unbind $dev
--------------050006080904080207080905
Content-Type: text/plain;
name="setup_local"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="setup_local"
#!/bin/sh
block_file=/usr/pkg/etc/xen/block-file
block_cgd=/usr/pkg/etc/xen/block-cgd
img_root=
img_local=
keyfile=
tmpdir=/tmp/$$
# Parse options
while [ $# -gt 0 ]; do
case "$1" in
-c) keyfile="$2"; shift 2 ;;
-*) echo 1>&2 "$0: unknown option \`\`$1''"; exit 1 ;;
*) break ;;
esac
done
if [ $# -lt 2 ]; then
echo 1>&2 "$0: missing image files"
exit 1
fi
img_root="$1"
img_local="$2"
###
### Check for necessary files before starting.
###
if [ ! -x "$block_file" ]; then
echo 1>&2 "$0: $block_file cannot be executed"
exit 1
fi
if [ -n "$keyfile" -a ! -x "$block_cgd" ]; then
echo 1>&2 "$0: $block_cgd cannot be executed"
exit 1
fi
###
### Check that both image files exist.
###
if [ ! -f "$img_root" ]; then
echo 1>&2 "$0: missing $img_root";
exit 1
fi
if [ ! -f "$img_local" ]; then
echo 1>&2 "$0: missing $img_local";
exit 1
fi
###
### Mount root.img read-only onto $tmpdir/mnt.
###
echo "Mounting $img_root"
dev_root=`$block_file bind $img_root`
vnd_root="${dev_root#/dev/}"; vnd_root="${vnd_root%[a-z]}"
mnt_root=$tmpdir/mnt
mkdir -p $mnt_root
mount -r /dev/${vnd_root}a $mnt_root
###
### Mount local image onto $tmpdir/local (via cgd if a keyfile was given).
###
echo "Mounting $img_local"
dev_local=`$block_file bind $img_local`
vnd_local="${dev_local#/dev/}"; vnd_local="${vnd_local%[a-z]}"
if [ -n "$keyfile" ]; then
dev_cgd=`$block_cgd bind /dev/${vnd_local}a $keyfile`
cgd="${dev_cgd#/dev/}"; cgd="${cgd%[a-z]}"
fi
mnt_local=$tmpdir/local
mkdir -p $mnt_local
if [ -n "$keyfile" ]; then
mount /dev/${cgd}a $mnt_local
else
mount /dev/${vnd_local}a $mnt_local
fi
###
### Copy "etc", "tmp" and "var" into place.
###
echo "Copying mutable directories from $img_root to $img_local"
( cd $mnt_root && pax -rwpe etc tmp var $mnt_local/. )
###
### Restore the original /etc/rc script.
###
echo "Restoring vanilla rc script"
if [ -f $mnt_local/etc/rc.orig ]; then
rm -f $mnt_local/etc/rc.pre-hooks
mv -f $mnt_local/etc/rc.orig $mnt_local/etc/rc
fi
###
### Add configuration bits to re-mount /local as read-write from
### /etc/rc.d/root.
###
echo "Add bit to re-mount /local as read-write"
cat > $mnt_local/etc/rc.conf.d/root << 'EOF'
start_postcmd="root_poststart"
root_poststart()
{
# Re-mount /local as a read-write filesystem.
mount -uw /local
}
EOF
###
### Add additional filesystems into /local/etc/fstab.
###
echo "Add our filesystems to /etc/fstab"
fstab=$mnt_local/etc/fstab
[ -f $fstab.orig ] || mv -f $fstab $fstab.orig
cat > $fstab << 'EOF'
/dev/xbd0a / ffs rw 1 1
/dev/cgd0b none swap sw 0 0
/dev/xbd2a /local ffs rw 0 0
/dev/xbd3a /usr/pkg ffs rw 1 2
/local/etc /etc null rw
/local/tmp /tmp null rw
/local/var /var null rw
kernfs /kern kernfs rw
procfs /proc procfs rw,noauto
EOF
###
### Edit the vanilla rc.conf script to add the extra bits for the domU setup.
###
echo "Modify rc.conf for domU setup"
rc_conf=$mnt_local/etc/rc.conf
if [ -f $rc_conf ] && ! grep -q "critical_filesystems_local" $rc_conf; then
mv -f $rc_conf $rc_conf.orig
awk '/^wscons=/ {
print "critical_filesystems_local=\"/local /etc /tmp /var\"";
print "powerd=YES";
print "savecore=NO";
print "sendmail=NO";
print "wscons=NO";
next;
}
{ print }' \
$rc_conf.orig > $rc_conf
fi
###
### Set up encrypted swap.
###
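echo "Set up encrypted swap for the domU"
# Make sure the cgd configuration directory exists before writing to it.
mkdir -p $mnt_local/etc/cgd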
cat > $mnt_local/etc/cgd/cgd.conf << 'EOF'
cgd0 /dev/xbd1a
EOF
cat > $mnt_local/etc/cgd/xbd1a << 'EOF'
algorithm blowfish-cbc;
iv-method encblkno;
keylength 128;
verify_method none;
keygen urandomkey;
EOF
cat > $mnt_local/etc/rc.conf.d/cgd << 'EOF'
swap_device="cgd0"
swap_disklabel="/etc/cgd/xbd1a.disklabel"
start_postcmd="cgd_swap"
cgd_swap()
{
# Convert this dedicated swap device to contain one big swap
# partition.
#
disklabel $swap_device 2>/dev/null |
while read line; do
case "$line" in
d:*) bline=" b:${line#d:}"
bline="${bline%%unused*} swap${bline##*unused}"
echo "$bline"
echo "$line"
;;
[a-z]:*) ;;
*) echo "$line"
;;
esac
done > $swap_disklabel
if ! disklabel -R -r $swap_device $swap_disklabel 2>/dev/null; then
echo 1>&2 "Could not write $swap_disklabel to $swap_device"
fi
}
EOF
###
### Only turn on the console tty.
###
echo "Disable non-console ttys"
ttys=$mnt_local/etc/ttys
[ -f $ttys.orig ] || mv -f $ttys $ttys.orig
sed -e "/^tty/{s,on secure,off secure,;}" $ttys.orig > $ttys
###
### Disable the daily and weekly checks in root's crontab.
###
echo "Disable daily and weekly checks in root's crontab"
crontab=$mnt_local/var/cron/tabs/root
awk '/^[^#].*\/etc\/(daily|weekly)/ { print "#" $0; next } { print }' \
$crontab > $crontab.new
mv -f $crontab.new $crontab
###
### Cleanup
###
echo "Unmounting $img_root and $img_local"
umount $mnt_root
umount $mnt_local
rmdir $mnt_root
rmdir $mnt_local
rmdir $tmpdir
$block_file unbind $dev_root
[ -z "$keyfile" ] || $block_cgd unbind $dev_cgd
$block_file unbind $dev_local
--------------050006080904080207080905--