Subject: Re: proposed changes to RAIDframe related rc.d scripts
To: None <current-users@netbsd.org>
From: Jukka Salmi <j+nbsd@2007.salmi.ch>
List: current-users
Date: 06/30/2007 15:02:38
--6c2NcOVqGQ03X4Wi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Jukka Salmi --> current-users (2005-04-22 19:14:21 +0200):
> Hi,
> 
> on a NetBSD system using RAIDframe and default rc.d scripts, parity
> checking the RAID set is done late in the boot process:
> 
> 	$ rcorder -s nostart * | tail
> 	postfix
> 	smmsp
> 	raidframeparity
> 	poffd
> 	ndbootd
> 	moused
> 	mixerctl
> 	inetd
> 	identd
> 	cron
> 
> Furthermore it's done in the background: until it succeeds the data is
> not protected against a component failure.
> 
> Parity checking was split out of /etc/rc.d/raidframe (which is executed
> early) about 2.5 years ago to not "end up with fsck and raidframe parity
> rebuild taking forever after a crash/reboot".
> 
> I'd appreciate the possibility to choose whether parity checking should
> be done as early as possible (i.e. before fsck runs and filesystems are
> mounted) and in the foreground, or as it is right now (late and in the
> background).
> 
> The attached patch adds this functionality: setting raidframeparity_early
> to NO (the default) doesn't change any behaviour; setting it to YES runs
> the parity check right after non-auto-configured RAID sets are configured
> and in the foreground. (Hmm, should it terminate the boot process if it
> fails, like /etc/rc.d/fsck?)

After having to wait several hours for a huge RAID set's parity to be
rebuilt on a system with the proposed changes applied, I revised the
patch:

In addition to `raidframeparity_early' (`YES' or `NO', the default),
there are now two user-settable variables, `raidframeparity_early_disks'
and `raidframeparity_late_disks', to specify which RAID sets should be
checked early and in the foreground and which ones should be checked late
and in the background. The eight possible combinations of these variables
(each list being either empty or non-empty) result in the following actions:

rfp_early  rfp_early_disks  rfp_late_disks  checked / rewritten when?
-----------------------------------------------------------------------
NO         ""               ""              all disks late (default)
NO         ""               $disks2         $disks2 late
NO         $disks1          ""              $disks1 early
NO         $disks1          $disks2         $disks1 early, $disks2 late
YES        ""               ""              all disks early
YES        ""               $disks2         all disks early
YES        $disks1          ""              all disks early
YES        $disks1          $disks2         all disks early

s/rfp_/raidframeparity_/g
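
For illustration only, the table above can be condensed into one small
sh function. This is a sketch and not part of the patch: the function name
rfp_plan and its fourth argument (the list of all RAID sets, which the
real scripts take from `sysctl -n hw.disknames') are made up here, and
the YES test merely approximates what checkyesno accepts.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: print which RAID sets
# would be checked early and which late for a given combination.
# usage: rfp_plan EARLY EARLY_DISKS LATE_DISKS ALL_DISKS
rfp_plan()
(
	# The function body is a subshell, so these assignments do not
	# leak into the caller's environment.
	early=$1 early_disks=$2 late_disks=$3 all_disks=$4

	case $early in
	[Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1)
		# YES overrides both lists: everything is checked early.
		early_disks=$all_disks
		late_disks=
		;;
	*)
		# Both lists empty: fall back to the old behaviour and
		# check everything late.
		if [ -z "$early_disks" ] && [ -z "$late_disks" ]; then
			late_disks=$all_disks
		fi
		;;
	esac

	printf 'early=%s\nlate=%s\n' "$early_disks" "$late_disks"
)

# Fourth row of the table; raid0, raid1 and raid2 are made-up names.
rfp_plan NO "raid0" "raid1" "raid0 raid1 raid2"
```

Note that with `NO' and at least one non-empty list, a RAID set that
appears in neither list (raid2 in the call above) is not checked at all.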

In addition, if raidctl(8) fails to rewrite parity for one of the
RAID sets to be checked early, the boot is now aborted.
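
As an example of how this might be used, an /etc/rc.conf for a machine
with a small root mirror and a large data set could look roughly like the
fragment below. The device names raid0 and raid1 are made up, and the
split between them is only one possible choice.

```shell
# /etc/rc.conf (fragment); raid0 and raid1 are hypothetical device names.
raidframeparity_early=NO		# keep NO so the two lists below are honoured
raidframeparity_early_disks="raid0"	# small root mirror: before fsck, in the foreground
raidframeparity_late_disks="raid1"	# large data set: late, in the background
```

With these settings a failed parity rewrite on raid0 would abort the boot,
while raid1 would be handled in the background as before.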


Comments are welcome!

Cheers, Jukka

-- 
bashian roulette:
$ ((RANDOM%6)) || rm -rf ~

--6c2NcOVqGQ03X4Wi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="raidframeparity_early.patch"

NetBSD PR misc/30074

http://mail-index.netbsd.org/current-users/2005/04/22/0017.html

Index: share/man/man5/rc.conf.5
===================================================================
RCS file: /cvsroot/src/share/man/man5/rc.conf.5,v
retrieving revision 1.114
diff -u -p -r1.114 rc.conf.5
--- share/man/man5/rc.conf.5	15 May 2007 19:47:43 -0000	1.114
+++ share/man/man5/rc.conf.5	30 Jun 2007 12:02:31 -0000
@@ -253,6 +253,42 @@ If not set to
 .Sq YES ,
 and no swap devices
 are configured, the system will warn you.
+.It Sy raidframeparity_early
+.Sq YES
+or
+.Sq NO .
+Check and possibly rewrite parity on all RAIDframe devices early in the
+boot process (before
+.Xr fsck 8
+is run) and abort the boot if rewriting fails.
+.Bl -hang
+.It Em Note :
+Data on a RAID set is not protected against a component failure until
+parity is up-to-date!
+.El
+.It Sy raidframeparity_early_disks
+A space separated list of RAIDframe devices.
+Parity on these devices will be checked and possibly be rewritten early
+in the boot process as described under
+.Sy raidframeparity_early .
+Only applicable if
+.Sy raidframeparity_early
+is
+.Sq NO
+(default); otherwise all available RAIDframe devices are checked early,
+not only those listed here.
+.It Sy raidframeparity_late_disks
+A space separated list of RAIDframe devices.
+Parity on these devices will be checked and possibly be rewritten late
+in the boot process, continuing the boot after the checks have been
+initiated.
+This is the default behaviour.
+Only applicable if
+.Sy raidframeparity_early
+is
+.Sq NO
+(default); otherwise all available RAIDframe devices are checked early,
+even those listed here.
 .It Sy swapoff
 .Sq YES
 or
Index: etc/defaults/rc.conf
===================================================================
RCS file: /cvsroot/src/etc/defaults/rc.conf,v
retrieving revision 1.86
diff -u -p -r1.86 rc.conf
--- etc/defaults/rc.conf	15 May 2007 19:47:48 -0000	1.86
+++ etc/defaults/rc.conf	30 Jun 2007 12:02:31 -0000
@@ -92,6 +92,13 @@ ccd=YES
 #
 raidframe=YES
 
+# When and how to check and possibly rewrite RAIDframe parity.
+#
+raidframeparity_early=NO	# Set to YES to check parity on all RAIDframe
+				# devices early in the boot process and not to
+				# continue until it completes successfully.
+				# For more fine-grained options see rc.conf(5).
+
 # Crypto file system.
 #
 cgd=YES
Index: etc/rc.d/raidframe
===================================================================
RCS file: /cvsroot/src/etc/rc.d/raidframe,v
retrieving revision 1.9
diff -u -p -r1.9 raidframe
--- etc/rc.d/raidframe	13 Aug 2004 18:08:03 -0000	1.9
+++ etc/rc.d/raidframe	30 Jun 2007 12:02:31 -0000
@@ -4,6 +4,7 @@
 #
 
 # PROVIDE: disks
+# BEFORE: raidframeparity_early
 
 $_rc_subr_loaded . /etc/rc.subr
 
Index: etc/rc.d/raidframeparity
===================================================================
RCS file: /cvsroot/src/etc/rc.d/raidframeparity,v
retrieving revision 1.3
diff -u -p -r1.3 raidframeparity
--- etc/rc.d/raidframeparity	11 Oct 2004 15:00:51 -0000	1.3
+++ etc/rc.d/raidframeparity	30 Jun 2007 12:02:31 -0000
@@ -9,14 +9,28 @@ $_rc_subr_loaded . /etc/rc.subr
 
 name="raidframeparity"
 start_cmd="raidframeparity_start"
+start_precmd="raidframeparity_prestart"
 stop_cmd=":"
 
+raidframeparity_prestart()
+{
+	checkyesno raidframeparity_early && return 1
+
+	if [ -z "$raidframeparity_late_disks" ]; then
+		if [ -z "$raidframeparity_early_disks" ]; then
+			raidframeparity_late_disks=$(sysctl -n hw.disknames)
+		else
+			return 1
+		fi
+	fi
+}
+
 raidframeparity_start()
 {
 	# Initiate parity/mirror reconstruction as needed, in the background.
 	#
 	(
-		for dev in $(sysctl -n hw.disknames); do
+		for dev in $raidframeparity_late_disks; do
 			case $dev in
 			raid[0-9]*)
 				raidctl -P $dev
--- /dev/null	2007-06-30 13:49:58.000000000 +0200
+++ etc/rc.d/raidframeparity_early	2007-06-30 13:34:53.000000000 +0200
@@ -0,0 +1,41 @@
+#!/bin/sh
+#
+# $NetBSD$
+#
+
+# PROVIDE: disks raidframeparity_early
+
+$_rc_subr_loaded . /etc/rc.subr
+
+name="raidframeparity_early"
+start_cmd="raidframeparity_early_start"
+start_precmd="raidframeparity_early_prestart"
+stop_cmd=":"
+
+raidframeparity_early_prestart()
+{
+	if checkyesno raidframeparity_early; then
+		raidframeparity_early_disks=$(sysctl -n hw.disknames)
+	fi
+
+	if [ -z "$raidframeparity_early_disks" ]; then
+		return 1
+	fi
+}
+
+raidframeparity_early_start()
+{
+	# Initiate parity/mirror reconstruction as needed,
+	# aborting boot if this fails.
+	#
+	for dev in $raidframeparity_early_disks; do
+		case $dev in
+		raid[0-9]*)
+			raidctl -v -P $dev || stop_boot
+			;;
+		esac
+	done
+}
+
+load_rc_config $name
+run_rc_command "$1"
Index: etc/rc.d/Makefile
===================================================================
RCS file: /cvsroot/src/etc/rc.d/Makefile,v
retrieving revision 1.64
diff -u -p -r1.64 Makefile
--- etc/rc.d/Makefile	20 Feb 2007 21:29:08 -0000	1.64
+++ etc/rc.d/Makefile	30 Jun 2007 12:02:31 -0000
@@ -24,7 +24,7 @@ CONFIGFILES=\
 		named ndbootd network newsyslog nfsd nfslocking ntpd ntpdate \
 		perusertmp pf pf_boot pflogd poffd postfix powerd ppp pwcheck \
 		quota \
-		racoon rpcbind raidframe raidframeparity rarpd rbootd root \
+		racoon rpcbind raidframe raidframeparity raidframeparity_early rarpd rbootd root \
 		route6d routed rtadvd rtclocaltime rtsold rwho \
 		savecore screenblank sdpd securelevel sshd \
 		staticroute swap1 swap2 sysctl sysdb syslogd \
Index: etc/mtree/special
===================================================================
RCS file: /cvsroot/src/etc/mtree/special,v
retrieving revision 1.111
diff -u -p -r1.111 special
--- etc/mtree/special	10 May 2007 17:45:50 -0000	1.111
+++ etc/mtree/special	30 Jun 2007 12:02:31 -0000
@@ -235,6 +235,7 @@
 ./etc/rc.d/racoon		type=file mode=0555
 ./etc/rc.d/raidframe		type=file mode=0555
 ./etc/rc.d/raidframeparity	type=file mode=0555
+./etc/rc.d/raidframeparity_early type=file mode=0555
 ./etc/rc.d/rarpd		type=file mode=0555
 ./etc/rc.d/rbootd		type=file mode=0555
 ./etc/rc.d/root			type=file mode=0555
Index: distrib/sets/lists/etc/mi
===================================================================
RCS file: /cvsroot/src/distrib/sets/lists/etc/mi,v
retrieving revision 1.190
diff -u -p -r1.190 mi
--- distrib/sets/lists/etc/mi	8 Jun 2007 22:24:07 -0000	1.190
+++ distrib/sets/lists/etc/mi	30 Jun 2007 12:02:31 -0000
@@ -223,6 +223,7 @@
 ./etc/rc.d/racoon				etc-net-rc
 ./etc/rc.d/raidframe				etc-sys-rc
 ./etc/rc.d/raidframeparity			etc-sys-rc
+./etc/rc.d/raidframeparity_early		etc-sys-rc
 ./etc/rc.d/rarpd				etc-bootserver-rc
 ./etc/rc.d/rbootd				etc-bootserver-rc
 ./etc/rc.d/root					etc-sys-rc
Index: usr.sbin/postinstall/postinstall
===================================================================
RCS file: /cvsroot/src/usr.sbin/postinstall/postinstall,v
retrieving revision 1.42
diff -u -p -r1.42 postinstall
--- usr.sbin/postinstall/postinstall	8 Jun 2007 22:24:08 -0000	1.42
+++ usr.sbin/postinstall/postinstall	30 Jun 2007 12:02:32 -0000
@@ -858,7 +858,7 @@ do_rc()
 		named ndbootd network newsyslog nfsd nfslocking ntpd ntpdate \
 		perusertmp pf pf_boot pflogd poffd postfix powerd ppp pwcheck \
 		quota \
-		racoon rpcbind raidframe raidframeparity rarpd rbootd root \
+		racoon rpcbind raidframe raidframeparity raidframeparity_early rarpd rbootd root \
 		route6d routed rtadvd rtclocaltime rtsold rwho \
 		savecore screenblank sdpd securelevel sshd \
 		staticroute swap1 swap2 sysctl sysdb syslogd \

--6c2NcOVqGQ03X4Wi--