Subject: Re: cluster install
To: None <tech-cluster@NetBSD.org>
From: Jan Schaumann <jschauma@netbsd.org>
List: tech-cluster
Date: 10/21/2003 22:28:45
--Pd0ReVV5GZGQvF3a
Content-Type: multipart/mixed; boundary="6c2NcOVqGQ03X4Wi"
Content-Disposition: inline


--6c2NcOVqGQ03X4Wi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

MLH <mlh@goathill.org> wrote:
> Jan Schaumann wrote:

> > Actually, we're using rsync

> What might be the chances of letting us see/work on what you have?

Sure, why not.  If you find anything odd or obviously stupid, I'll be
happy to take suggestions. :)

The basic setup is as follows:

Main file server (doppelbock):

(doppelbock) pwd
/usr/local/cluster-maint
(doppelbock) ls
NASTY_HACKS               etc-c009                  etc-c023
aftersync                 etc-c010                  etc-c024
aftersync.not             etc-c011                  etc-c025
aftersync.old             etc-c012                  etc-c026
aftersync.really.old      etc-c013                  etc-c027
beforesync.old            etc-c014                  etc-c028
bin                       etc-c015                  etc-c029
etc-c002                  etc-c016                  etc-c030
etc-c003                  etc-c017                  etc-c031
etc-c004                  etc-c018                  firstboot
etc-c005                  etc-c019                  logs
etc-c006                  etc-c020                  motd
etc-c007                  etc-c021                  rsync-excludes
etc-c008                  etc-c022
(doppelbock) cat NASTY_HACKS
/etc/rc.d/mountcritlocal paxes /var-ro over mfs mounted /var

/usr/local/node needs an up-to-date MAKEDEV
Problem: nodes mfs mount /dev/, so we can't update that version
Hack: hack /sbin/init to not use /dev/MAKEDEV and /dev/MAKEDEV.local
         but fetch it from /etc/ where we placed it (ie /usr/local/node/etc
        on doppelbock!)  In addition, /dev is created too small for our
        purposes, so we increased the size.
        See /usr/src/sbin/init/init.c around line 1405 and 1446
(doppelbock) ls etc-c002
inetd.conf rc.conf
(doppelbock) cat rsync-excludes
boot
kern
proc
share
var
dev
(doppelbock) ls bin
push.sh tool.sh
(doppelbock)
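
The etc-c002 ... etc-c031 directories in the listing above look like
per-node overlays for /etc.  A minimal sketch, assuming that naming
convention holds, of deriving the node list from the directory names.
list_nodes is a hypothetical helper, not something from the
cluster-maint tree:

```shell
#!/bin/sh
# Sketch: derive node names from etc-<node> overlay directories.
# list_nodes is a hypothetical helper; MAINTDIR defaults to the path
# shown in the listing above.
MAINTDIR=${MAINTDIR:-/usr/local/cluster-maint}

list_nodes()
{
	for dir in "$1"/etc-*; do
		# skip the unexpanded pattern when nothing matches
		[ -d "$dir" ] || continue
		# strip everything up to and including "etc-"
		echo "${dir##*/etc-}"
	done
}

list_nodes "${MAINTDIR}"
```

On a machine without that directory the script simply prints nothing.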

push.sh and tool.sh are attached; they are used to push out an update
to all nodes.  The node image lives in /usr/local/node on the file
server.  Various shared directories are kept on the file server and
symlinked from the node's base system (users' home directories are set
to /share/home/<username> in /etc/passwd et al).  Installing new
packages is thus as easy as on a single system:  simply go into
/usr/pkgsrc/category/package on doppelbock and run 'make install'.
Done.
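
The shared-directory layout described above could be set up along
these lines.  This is only a sketch: the directory names are the
usr/* entries from the ls -l output in this message, while link_shared
is a hypothetical helper and the var-ro/db/pkg link is left out for
brevity.

```shell
#!/bin/sh
# Sketch: point directories in the node image at their shared
# counterparts.  link_shared is a hypothetical helper.  The /share/*
# targets are absolute on purpose, so they resolve on the node once
# /share is available there.
link_shared()
{
	root=$1
	mkdir -p "${root}/usr" || return 1
	for name in X11R6 obj pkg pkgsrc src; do
		# -f replaces a stale link; -n keeps ln from descending
		# into an existing link that points at a directory
		ln -sfn "/share/${name}" "${root}/usr/${name}" || return 1
	done
}

# usage (as root on the file server): link_shared /usr/local/node
```

Running it twice is harmless, since existing links are replaced rather
than followed.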


(doppelbock) cd /usr/local/node
(doppelbock) ls -l usr/X11R6 usr/obj usr/pkg* usr/src var-ro/db/pkg
lrwxr-xr-x  1 root  wheel  12 Dec 15  2002 usr/X11R6 -> /share/X11R6
lrwxr-xr-x  1 root  wheel  10 Dec 15  2002 usr/obj -> /share/obj
lrwxr-xr-x  1 root  wheel  10 Dec 15  2002 usr/pkg -> /share/pkg
lrwxr-xr-x  1 root  wheel  13 Dec 15  2002 usr/pkgsrc -> /share/pkgsrc
lrwxr-xr-x  1 root  wheel  10 Dec 15  2002 usr/src -> /share/src
lrwxr-xr-x  1 root  wheel  13 Dec 15  2002 var-ro/db/pkg -> /share/db/pkg
(doppelbock) diff -bu etc/rc.d/mountcritlocal /etc/rc.d/mountcritlocal
--- etc/rc.d/mountcritlocal     2003-08-27 18:04:57.000000000 -0400
+++ /etc/rc.d/mountcritlocal    2002-09-08 15:33:33.000000000 -0400
@@ -20,10 +20,6 @@
        #
        mount_critical_filesystems local
 
-       # XXX: local change
-       # /var is mfs mounted over a read-only /var-ro
-       cd /var-ro; pax -rwp e . /var
-
        #       clean up left-over files.
        #       this could include the cleanup of lock files and
        #       /var/run, etc.
        #
(doppelbock) cd /usr/src/sbin/init
(doppelbock) cvs diff -bu
cvs server: Diffing .
Index: init.c
===================================================================
RCS file: /cvsroot/src/sbin/init/init.c,v
retrieving revision 1.61
diff -b -u -r1.61 init.c
--- init.c      2003/08/07 10:04:25     1.61
+++ init.c      2003/10/22 02:08:11
@@ -1401,8 +1401,8 @@
        /* Mount an mfs over /dev so we can create devices */
        switch ((pid = fork())) {
        case 0:
-               (void)execl(INIT_MOUNT_MFS, "mount_mfs", "-i", "192",
-                   "-s", "768", "-b", "4096", "-f", "512", "swap", "/dev",
+               (void)execl(INIT_MOUNT_MFS, "mount_mfs", "-i", "128",
+                   "-s", "10240", "-b", "4096", "-f", "512", "swap", "/dev",
                    NULL);
                _exit(1);
                /*NOTREACHED*/
@@ -1443,7 +1443,11 @@
        switch ((pid = fork())) {
        case 0:
                if (chdir("/dev") == 0)
+#if 0
                        (void)execl(INIT_BSHELL, "sh", "./MAKEDEV", "init",
+#else
+                       (void)execl(INIT_BSHELL, "sh", "/etc/MAKEDEV", "all",
+#endif
                            NULL);
                _exit(1);
 
(doppelbock)
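
The mountcritlocal change in the first diff copies the read-only
/var-ro template into the freshly mounted mfs /var at boot.  A minimal
sketch of that copy step as a standalone function, for trying it out
on scratch directories.  populate_var is a hypothetical name, and the
cp -Rp fallback is an assumption for systems without pax:

```shell
#!/bin/sh
# Sketch: copy a read-only template tree into an empty, writable
# directory, as the local change to mountcritlocal does for /var-ro
# and /var.  Both arguments must be absolute paths, because the copy
# runs from inside the source directory.
populate_var()
{
	src=$1
	dst=$2
	[ -d "$src" ] && [ -d "$dst" ] || return 1
	if command -v pax >/dev/null 2>&1; then
		# same invocation as in the diff: copy mode, preserve everything
		(cd "$src" && pax -rwp e . "$dst")
	else
		# fallback (assumption): cp -Rp keeps modes and times
		(cd "$src" && cp -Rp . "$dst")
	fi
}

# usage: populate_var /var-ro /var
```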


That's all I can think of right now.  We currently don't have a decent
automatic procedure for dropping in new nodes.  The last few times I had
to reinstall a node, I actually opened it up, added a CD drive, booted
off an install CD, ran disklabel and newfs, NFS-mounted the image from
doppelbock and pax'ed things over.  Obviously, I, too, would be
interested in a decent automatic-install tool.
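
As a starting point for such a tool, the manual steps listed above can
be written down as a script that only prints what it would do unless
DOIT=yes is set.  This is a rough sketch, not a tested installer: the
disk name, partition letter, prototype label file and mount points are
all assumptions, it handles a single root partition only, and it leaves
out anything not mentioned here, such as installing boot blocks.

```shell
#!/bin/sh
# Sketch: the manual reinstall steps, dry-run by default.
# DISK, PROTO, TARGET, IMAGE and SERVER are assumptions for illustration.
DISK=${DISK:-wd0}
PROTO=${PROTO:-/tmp/disklabel.proto}
TARGET=${TARGET:-/mnt}
IMAGE=${IMAGE:-/mnt2}
SERVER=${SERVER:-doppelbock}

# run the command only when DOIT=yes, otherwise just print it
run()
{
	if [ "${DOIT:-no}" = "yes" ]; then
		"$@"
	else
		echo "would run: $*"
	fi
}

reinstall_node()
{
	run disklabel -R -r "${DISK}" "${PROTO}"
	run newfs "/dev/r${DISK}a"
	run mount "/dev/${DISK}a" "${TARGET}"
	run mount -t nfs "${SERVER}:/usr/local/node" "${IMAGE}"
	run sh -c "cd ${IMAGE} && pax -rwp e . ${TARGET}"
}

# usage: reinstall_node            (prints the five steps)
#        DOIT=yes reinstall_node   (actually runs them, as root)
```

The run wrapper plays the same role as the --dont switch in the
attached push.sh, which prefixes every command with echo.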

Hope you find this information useful.  If you have ideas on how to do
things more elegantly, please let me know.

-Jan

-- 
When I'm dead, just let someone try coming to me with resurrection or
the like; I'll punch him one! (Anonymous)

--6c2NcOVqGQ03X4Wi
Content-Type: application/x-sh
Content-Disposition: attachment; filename="push.sh"
Content-Transfer-Encoding: quoted-printable

#!/bin/sh
#
# rsync the nodes with /usr/local/node
#set -x

RSYNC_RSH=rsh
BASEDIR=/usr/local
MAINTDIR=${BASEDIR}/cluster-maint
NODESROOT=${BASEDIR}/node

RSYNC="rsync -aS -c"
RSYNC_ROOTFLAGS="-H --delete --exclude-from=${MAINTDIR}/rsync-excludes"

RSYNC_PASSWORD=XXXXXXXXXXXXXXXXXXXXXX
export RSYNC_PASSWORD


ETCFILES="inetd.conf rc.conf"

usage="Usage: $0: "'
	[--help] [--dont] [--keep-as] [--myexcludes=<excludes>] <hostnames|IPs>'

die()
{
	echo "$0: Error $@"
	exit 1
}

while [ $# -gt 0 ]; do
	case $1 in
	--dont)		DONTDOIT="echo" ;;
	--keep-as)	KEEP_AFTERSYNC="YES" ;;
	--myexcludes=*)	MYEXCLUDES="`echo $1 | sed -e 's|--myexcludes=||'`" ;;
	--help)		echo "$usage"; exit ;;
	-h)		echo "$usage"; exit ;;
	-*)		die "$usage" ;;
	*)		HOSTS="$1 ${HOSTS}" ;;
	esac
	shift
done

if [ "${MYEXCLUDES}" != "" ]; then
	for exclude in ${MYEXCLUDES}; do
		EXCLUDES="--exclude ${exclude} ${EXCLUDES}"
	done
fi

for i in ${HOSTS}; do

	# first things first
	#
	echo "New host: $i" >> ${MAINTDIR}/logs/sync.$$.log

	if [ -e ${MAINTDIR}/beforesync ]; then
		${DONTDOIT} ${RSYNC} ${MAINTDIR}/beforesync $i:/tmp >> 	\
			${MAINTDIR}/logs/sync.$$.log 2>&1
		${DONTDOIT} ${RSYNC_RSH} -n $i chmod u+x /tmp/beforesync >> \
			${MAINTDIR}/logs/sync.$$.log 2>&1
		${DONTDOIT} ${RSYNC_RSH} -n $i /tmp/beforesync >>		\
			${MAINTDIR}/logs/sync.$$.log 2>&1
	fi

	# otherwise we can't write anything!
	#
	${DONTDOIT} ${RSYNC_RSH} -n $i /sbin/mount -u -o rw /
	${DONTDOIT} ${RSYNC_RSH} -n $i /sbin/mount -u -o rw /usr
	${DONTDOIT} ${RSYNC_RSH} -n $i /sbin/mount -u -o rw /var-ro

	${DONTDOIT} ${RSYNC} ${RSYNC_ROOTFLAGS}				\
			${EXCLUDES} ${NODESROOT}/ push@$i::slash >>	\
			${MAINTDIR}/logs/sync.$$.log 2>&1

	for file in ${MAINTDIR}/etc-${i}/*; do
		${DONTDOIT} ${RSYNC} ${file} push@$i::slash/etc/${file##*/} >>	\
			${MAINTDIR}/logs/sync.$$.log 2>&1
	done

	# anything else?
	#
	if [ -e ${MAINTDIR}/aftersync ]; then
		${DONTDOIT} ${RSYNC} ${MAINTDIR}/aftersync push@$i::slash/tmp >> \
			${MAINTDIR}/logs/sync.$$.log 2>&1
		${DONTDOIT} ${RSYNC_RSH} -n $i chmod u+x /tmp/aftersync >>	\
			${MAINTDIR}/logs/sync.$$.log 2>&1
		${DONTDOIT} ${RSYNC_RSH} -n $i /tmp/aftersync >>		\
			${MAINTDIR}/logs/sync.$$.log 2>&1 &

		if [ x"${KEEP_AFTERSYNC}" != x"YES" ]; then
			if [ -f ${MAINTDIR}/aftersync.old ]; then
				${DONTDOIT} mv -f ${MAINTDIR}/aftersync.old \
					${MAINTDIR}/aftersync.really.old
			fi
			${DONTDOIT} mv ${MAINTDIR}/aftersync ${MAINTDIR}/aftersync.old
		fi
	fi

	# Update motd
	if [ -e ${MAINTDIR}/motd ]; then
		${DONTDOIT} ${RSYNC} ${MAINTDIR}/motd push@$i::slash/tmp >> \
			${MAINTDIR}/logs/sync.$$.log 2>&1
		${DONTDOIT} ${RSYNC_RSH} -n $i chmod u+x /tmp/motd >>	\
			${MAINTDIR}/logs/sync.$$.log 2>&1
		${DONTDOIT} ${RSYNC_RSH} -n $i /tmp/motd >>		\
			${MAINTDIR}/logs/sync.$$.log 2>&1 &
	fi

	# and back to read-only!
	#
	${DONTDOIT} ${RSYNC_RSH} -n $i /sbin/mount -u -o ro /var-ro
	${DONTDOIT} ${RSYNC_RSH} -n $i /sbin/mount -u -o ro /usr
	${DONTDOIT} ${RSYNC_RSH} -n $i /sbin/mount -u -o ro /

	# Finally, in case we made any changes to /var-ro, synch the mfs
	# mounted /var
	${DONTDOIT} ${RSYNC_RSH} -n $i rsync -aS /var-ro/ /var

done
--6c2NcOVqGQ03X4Wi
Content-Type: application/x-sh
Content-Disposition: attachment; filename="tool.sh"
Content-Transfer-Encoding: quoted-printable

#!/bin/sh
#
# loop through the available IPs and perform some of the usual
# tasks
#
#set -x

#defaults
BASEIP="172.16.4."
START=2
END=31
CMD="/sbin/ping -c 1 -w 1 @IP@"
QUIET=no
SYNCALL=no
MAINTDIR=/usr/local/cluster-maint

usage="Usage: $0: "'
	[--start=<num>] [--end=<num>] [--cmd=<cmd>]
	[-h|-help] [--ipbase=<x.x.x.>] [--quiet]
	[--syncall]'

die()
{
	echo "$0: Error: $@"
	exit 1
}

while [ $# -gt 0 ]; do
	case $1 in
	--start=*)	START=`echo $1 | sed -e 's|--start=||'` ;;
	--end=*)	END=`echo $1 | sed -e 's|--end=||'` ;;
	--cmd=*)	CMD="`echo $1 | sed -e 's|^--cmd=||'`" ;;
	--ipbase=*)	BASEIP=`echo $1 | sed -e 's|--ipbase=||' | \
				awk --re-interval '/^((([0-9]{1,2})|((1[0-9]{2})|(2([0-4][0-9]|5[0-5]))))\.){3}/ { print $1; }'`
			if [ -z ${BASEIP} ]; then
				die "Not a valid base-ip of the format: " \
					"'[0-255].[0-255].[0-255].'"
			fi
			;;
	--help)		echo "$usage"; exit ;;
	-h)		echo "$usage"; exit ;;
	--quiet)	QUIET=yes ;;
	--syncall)	SYNCALL=yes ;;
	-*)		echo "$usage"; exit 1 ;;
	esac
	shift
done

if [ "${SYNCALL}" = "yes" ]; then
	MACHINES="`ls -d ${MAINTDIR}/etc-* | sed -e s/.*etc-*//`"
	i=0
	for m in ${MACHINES}; do
		i=$(($i+1))
		if [ $i -lt 10 ]; then
			${MAINTDIR}/bin/push.sh --keep-as $m &
		else
			${MAINTDIR}/bin/push.sh --keep-as $m
			i=0
		fi
	done
else

i=${START}
while [ $i -lt ${END} ]; do
	IP=${BASEIP}${i}
	CMD1=${CMD%@IP@*}
	CMD2=${CMD#*@IP@}
	LOCALCMD="${CMD1} ${IP} ${CMD2}"
	if [ "${QUIET}" = "yes" ]; then
		${LOCALCMD} >/dev/null
	else
		${LOCALCMD}
	fi
	if [ $? -gt 0 ]; then
		RET="Failed!"
	else
		RET="Success!"
	fi
	if [ "${QUIET}" != "yes" ] || [ "${RET}" = "Failed!" ]; then
		echo "${LOCALCMD} : ${RET}"
	fi
	i=$(( $i + 1 ))
done

fi # syncall
--6c2NcOVqGQ03X4Wi--

--Pd0ReVV5GZGQvF3a
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (NetBSD)

iD8DBQE/letdfFtkr68iakwRAhbQAKDVV5N4boI/io2ii3T5glKeZ6eGCwCgo3ww
AOmsgLrdfXA4XquI9CmLMNY=
=agUp
-----END PGP SIGNATURE-----

--Pd0ReVV5GZGQvF3a--