tech-kern archive


Re: Problems with raidframe under NetBSD-5.1/i386



On Fri, 18 Feb 2011 13:09:18 -0800
buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:

>       Hello.  It's been a while since I had an opportunity to work
> on this problem.  However, I have figured out the trouble.  While the
> error is mine, I do have a couple of questions as to why I didn't
> discover it sooner.
>       It turns out that I had fat-fingered the disktab entry I used
> to disklabel the component disks such that the start of the raid
> partition was at offset 0 relative to the entire disk, rather than
> offset 63, which is what I normally use to work around PC BIOS
> routines and the like.  Once I figured that out, the error I was
> getting made sense. With this in mind, my question and suggestion are
> as follows:
> 
> 1.  It makes sense to me that I would get an EROFS error if I try to
> reconstruct to a protected portion of the component disk.  What
> doesn't make sense to me is why I could create the working raid set
> in the first place?  Why didn't I run into this error when writing
> the initial component labels?  Another symptom of this issue,
> although I didn't know about it at the time, is that components of
> my newly created raid sets would fail with an i/o failure, without
> any apparent whining from the component disk itself. I think now that
> this was because the raid driver was trying to update some portion of
> the component label and failing in the same way. Ok, my bad for
> getting my offsets wrong in the disklabel for the component disks,
> but can't we make it so this fails immediately upon raid creation
> rather than having the trouble exhibit itself as apparently
> unexplained component disk failures?

I really don't get why the creation of the RAID set would have
succeeded before, but not afterwards... Was the RAID set created in
single-user mode or from sysinst or something?  Is there some
'securelevel' thing coming into play?  I'm just guessing here, as this
makes no sense to me :(  (The thing is: RAIDframe shouldn't be touching
any of those 'protected' areas of the disk anyway... the first 64
blocks are reserved, with the component label and such being at the
half-way point.  So even with an offset of 0, it would only have been
looking to touch blocks 32 and 33 (for parity logging)... so unless
something is protecting all of the first 63 blocks, it shouldn't be
complaining :( )
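
To make the arithmetic above concrete, here is a rough sketch of which
absolute disk sectors the metadata writes would land on for a given
partition offset.  The constants come straight from the numbers quoted
in this message (64 reserved blocks, blocks 32 and 33 touched); the
function name and the idea of expressing this in Python are purely
illustrative and are not taken from the RAIDframe sources.

```python
# Numbers as stated in the message; not read from RAIDframe headers.
RESERVED_BLOCKS = 64                 # blocks reserved at the start of a component
LABEL_BLOCK = RESERVED_BLOCKS // 2   # "half-way point" -> block 32
TOUCHED_BLOCKS = (LABEL_BLOCK, LABEL_BLOCK + 1)  # blocks 32 and 33


def metadata_sectors(partition_offset):
    """Return the absolute disk sectors hit by the metadata writes.

    partition_offset is the start of the RAID partition relative to
    the whole disk, in sectors (hypothetical helper for illustration).
    """
    return [partition_offset + b for b in TOUCHED_BLOCKS]


if __name__ == "__main__":
    for offset in (0, 63):
        print(offset, metadata_sectors(offset))
```

With an offset of 0 the writes fall on sectors 32 and 33 of the disk,
and with the usual offset of 63 they fall on sectors 95 and 96.  So the
offset-0 case would only fail if whatever is protecting the start of
the disk covers sectors 32 and 33 as well.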

> 2. I'd like to suggest the following quick patch to the raid driver
> to help make the diagnosis of component failures easier.    Thoughts?

The patch looks fine, and quite useful.  Please commit.

Later...

Greg Oster

