netbsd-users: Re: RAIDframe: notify me when a drive fails?

Subject: Re: RAIDframe: notify me when a drive fails?
To: Geert Hendrickx <ghen@netbsd.org>
From: Greg Oster <oster@cs.usask.ca>
List: netbsd-users
Date: 02/28/2006 13:46:52

Geert Hendrickx writes:
> 
> On Tue, Feb 28, 2006 at 06:21:02PM +0100, Geert Hendrickx wrote:
> > It doesn't have to be a separate daemon per se.  Something I can (easily)
> > check for via a cron-job is ok.  The grep-for-failed thing works, but it
> > would be more elegant if e.g. raidctl -s would return an exit status >0 if
> > something is wrong and needs human intervention.  Maybe combined with a -q
> > flag (for no output), it would get as easy as 
> > 
> >   raidctl -s -q raid0 || mail -s "RAID problem" ...
> > 
> > in an hourly cronjob.  

So how about just:

 raidctl -s raid0 | grep failed && mail -s "RAID problem" ...

?  (Works fine in the testing I just did :) )
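
For an hourly cron job, that one-liner could be wrapped roughly as follows.
This is only a sketch: the function names, the script name raidcheck.sh and
the default recipient "root" are made up for illustration, and it assumes
that the word "failed" appears in the "raidctl -s" output only when a
component or spare has actually failed.

```shell
#!/bin/sh
# Sketch of an hourly cron check built on the grep-for-failed idea.
# Function names and the default recipient "root" are illustrative only.

# Succeeds (exit 0) if the status text on stdin mentions a failed component.
has_failed() {
    grep -q 'failed'
}

# Query one RAID device and mail its full status if anything has failed.
check_raid() {
    dev=$1
    rcpt=$2
    status=$(raidctl -s "$dev" 2>&1)
    if printf '%s\n' "$status" | has_failed; then
        printf '%s\n' "$status" | mail -s "RAID problem on $dev" "$rcpt"
    fi
}

# Only run when a device name is given, e.g.: sh raidcheck.sh raid0 root
if [ $# -gt 0 ]; then
    check_raid "$1" "${2:-root}"
fi
```

Mailing the whole status output, rather than just the matching line, means
the message also shows which components are still healthy.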

> This simple patch (against 3.0) makes "raidctl -s" return the number of
> failed components and/or spares in the array (i.e., normally 0).  

I really don't care for mixing error codes with "number of disks that 
have failed" like that... :(  (Never mind that it would keep 
raidctl from returning real error codes for other legitimate reasons at some
later date.)
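
One way to keep the two apart without patching raidctl is a small wrapper
that prints the count on stdout and keeps the exit status to a few fixed
meanings. This is a rough sketch, not a proposal for raidctl itself: the
function names and the 0/1/2 convention are invented here, and it assumes
raidctl exits non-zero when it cannot query the device.

```shell
#!/bin/sh
# Sketch: report the failure count as output, not as the exit status.
# The 0/1/2 exit convention below is an illustrative choice.

# Print how many lines of the status text on stdin mention "failed".
count_failed() {
    grep -c 'failed' || true    # grep exits 1 on zero matches; still prints 0
}

# Print the failed-component count for a device.
# Exit status: 0 = nothing failed, 1 = something failed, 2 = query error.
raid_health() {
    status=$(raidctl -s "$1" 2>&1) || return 2
    n=$(printf '%s\n' "$status" | count_failed)
    echo "$n"
    [ "$n" -eq 0 ] && return 0
    return 1
}

# Only run when a device name is given, e.g.: sh raidhealth.sh raid0
if [ $# -gt 0 ]; then
    raid_health "$1"
    exit $?
fi
```

A caller can then test the exit status for "needs attention" and still read
the number of failed components from the output when it wants it.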

Later...

Greg Oster