Subject: Re: RAIDframe crash again
To: Kazushi Marukawa (Jam) <jam@pobox.com>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 07/12/2001 20:01:44
Kazushi Marukawa writes:
> Hi,
>
> My system crashed, and the situation is similar to Chris
> Jones's. FYI, the message-id of his mail is
> <20010508165041.C6074@mt.sri.com>.
>
> The real cause is the failure of two hard drives in a 4-drive
> RAID5 system. Then the system crashed. Is there any way
> to prevent this crash?
No. (I had a look the other day at trying to make it not panic on a
2-component failure, but didn't get very far :( )
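To illustrate why a second failure is fatal for the data: RAID 5 keeps one parity block per stripe, so any single missing block can be rebuilt by XORing the survivors, but two missing blocks leave nothing to rebuild from. The following is a toy Python model of that idea only, not RAIDframe's code; the function names are made up for illustration, and parity rotation across components is ignored.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (toy RAID 5 stripe)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe):
    """Rebuild a stripe in which failed components are marked as None.

    One missing block is recoverable from the XOR of the survivors.
    Two or more missing blocks cannot be recovered with single parity.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError("more than one component failed: data is unrecoverable")
    repaired = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired
```

With a 4-component stripe (3 data blocks plus parity), setting one entry to None and calling reconstruct() gives back the original stripe; setting two entries to None raises ValueError, which is the situation a 2-drive failure puts the set in.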
> A copy of the messages is below. This is
> not everything; I just grepped for the "raid" keyword.
[snip]
The more interesting bits will be from *before* the first "raid0: IO Error."
In particular, you need to find out *why* it said wd3e and wd1e failed.
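One way to pull out those earlier lines, rather than grepping for "raid" only, is something like the sketch below. It assumes the kernel messages are available as plain text lines; the function name and the default of 20 lines of context are arbitrary choices for illustration.

```python
def lines_before_first_match(lines, needle="raid0: IO Error", context=20):
    """Return up to `context` lines that precede the first line containing `needle`.

    Returns an empty list if `needle` never appears.
    """
    lines = list(lines)
    for index, line in enumerate(lines):
        if needle in line:
            # Slice ends at `index`, so the matching line itself is excluded.
            return lines[max(0, index - context):index]
    return []
```

To use it, read the log into a list of lines (for example from /var/log/messages, if that is where your syslog puts kernel messages) and pass the list in. Any disk driver errors for wd1 or wd3 in the result would be the interesting part.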
> Here is a trace after the crash. I hope this helps some
> developer fix this.
Thanks for the trace, but the panic on a 2-drive failure is intentional
(or at least it is/was to the original RAIDframe writers).
>
> Both hard drives that RAIDframe marked as failed pass the
> manufacturer's test program. Maybe they are going bad,
> but they work for now.
Could be cabling/heat/power issues too. How long have you been running this
RAID set?
> So, I connected only 3 of the 4
> drives and started using them to make a backup. I configured
> raid5 with -C and ran fsck. fsck asked me to remove some
> files to fix the file system. I copied those files in the hope
> that only the inodes were corrupted and the data was correct. After
> fsck, I copied those files back to their original place. The system
> crashed again. Sigh. However, after that, I mean after
> restarting the system and running fsck -p,
Just use 'fsck' (without the -p). And do it a few times until you get
no more changes/errors.
> I could copy those files
> into the original place. Here is a trace after this crash.
Hmmm... Did you get a copy of the panic message? Hard to tell exactly
why it died here...
Could you ship me (privately) a copy of your raid config files and of
/var/run/dmesg.boot? Thanks.
Later...
Greg Oster