Subject: Re: misc/14094: RAID array status isn't reported in daily jobs
To: None <dave@dtsp.co.nz>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: netbsd-bugs
Date: 10/01/2001 13:23:58
On Sat, Sep 29, 2001 at 04:40:03AM -0000, dave@dtsp.co.nz wrote:
> 
> >Description:
> 	If a RAID array component has failed, but no-one is watching the
> 	console, it doesn't make a noise.  Unless you count the sound of the
> 	drive head scraping against the platter :)
> 
> 	Included patch adds support to /etc/daily to do a brief health report
> 	on all arrays defined in /etc/raid*.conf, if any.
> 
> >How-To-Repeat:
> 	Notice how quietly a RAID component can fail, wonder how long it would
> 	take to notice on a redundant array...
> 
> >Fix:
> Patch defaults to enabled checks, because with no configuration files the
> report is silent.  The report could likely be improved on, but it's a good
> start - the important thing is to see the component list and whether any have
> failed.
> 
> --- src/etc/daily.orig	Sat Sep  1 19:05:05 2001
> +++ src/etc/daily	Sat Sep 29 15:05:03 2001
> @@ -193,6 +193,19 @@
>  	fi
>  fi
>  
> +if checkyesno check_raid; then
> +	for cfg in /etc/raid[0-9].conf /etc/raid[0-9][0-9].conf; do
> +		[ ! -f "$cfg" ] && continue
> +		dev=${cfg##*/}
> +		dev=${dev%%.conf}
> +
> +		echo ""
> +		echo "RAID: Array status for ${dev}:"
> +		# Strip out the component label dump...
> +		raidctl -s "$dev" | awk -- '/^[A-Za-z]/ {show=1} /^Component label for/ {show=0} {if (show) print "  " $0; }'
> +	done
> +fi
> +

You can't rely on the /etc/raid*.conf files to get the list of RAID devices,
because the system may have autoconfigured RAID sets with no corresponding
raid??.conf file. I use this to get the list of RAID devices:
iostat -x | awk '/^raid/ { print $1 }'
(sorry, I don't remember who posted this in the first place).
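
If the same list is needed in more than one script, the pipeline could be wrapped in a small function. This is only a sketch: the name list_raid_devices is made up for illustration, and it reads the iostat output on standard input so that it can also be tried against saved output.

```shell
# Hypothetical helper (not part of /etc/daily): print the names of all
# raid devices found in `iostat -x` output supplied on standard input.
list_raid_devices() {
	awk '/^raid/ { print $1 }'
}

# Usage sketch on a live system:
#   iostat -x | list_raid_devices
```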
In /etc/daily I added this, which seems to do the job:
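        # $TMP and $TMP2 are assumed to be scratch files set up earlier in /etc/daily
        for dev in `iostat -x | awk '/^raid/ { print $1 }'`; do
                # keep only the components reported as failed
                raidctl -s "$dev" | awk '/: failed$/ { print }' > "$TMP"
                if [ -s "$TMP" ]; then
                        echo "$dev:" >> "$TMP2"
                        cat "$TMP" >> "$TMP2"
                fi
                rm -f "$TMP"
        done
        if [ -s "$TMP2" ]; then
                echo "failed RAID component(s):"
                cat "$TMP2"
        fi
        rm -f "$TMP2"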
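
The same report could probably be built in a shell variable instead of scratch files. The following is a sketch only, assuming raidctl -s marks a dead component with a line ending in ": failed" (as the awk pattern above does). The function names and the RAIDCTL override are made up for illustration and are not part of /etc/daily.

```shell
# Hypothetical helpers, not part of /etc/daily.

# Print only the "...: failed" lines from `raidctl -s` output on stdin.
failed_components() {
	awk '/: failed$/ { print }'
}

# Print a report for the devices named as arguments; print nothing if
# no component has failed. RAIDCTL can be overridden for a dry run.
report_failed_raid() {
	report=""
	for dev in "$@"; do
		failed=$(${RAIDCTL:-raidctl} -s "$dev" | failed_components)
		if [ -n "$failed" ]; then
			report="${report}${dev}:
${failed}
"
		fi
	done
	if [ -n "$report" ]; then
		echo "failed RAID component(s):"
		printf '%s' "$report"
	fi
}

# Usage sketch:
#   report_failed_raid `iostat -x | awk '/^raid/ { print $1 }'`
```

Staying silent when nothing has failed matches the behaviour of the loop above, so the daily mail only grows when there is something to look at.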


I also changed /etc/rc.d/raidframe to get the list of RAID devices from iostat
instead of /etc/raid*.conf when starting raidctl -P.
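
The raidframe change itself is not shown here. A rough sketch of the idea, assuming the script simply loops over the detected devices and runs raidctl -P on each, might look like the following. The function name and the RAIDCTL override are made up for illustration and are not taken from the real rc.d script.

```shell
# Hypothetical sketch, not the actual /etc/rc.d/raidframe code.
# Read raid device names from stdin (one per line) and run
# `raidctl -P` on each. RAIDCTL can be overridden for a dry run.
check_parity_all() {
	while read -r dev; do
		# </dev/null keeps the command from eating the device list
		${RAIDCTL:-raidctl} -P "$dev" </dev/null
	done
}

# Usage sketch:
#   iostat -x | awk '/^raid/ { print $1 }' | check_parity_all
```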

--
Manuel Bouyer, LIP6, Universite Paris VI.           Manuel.Bouyer@lip6.fr