tech-kern archive
Re: Problems with raidframe under NetBSD-5.1/i386
Hello. It's been a while since I had an opportunity to work on this
problem. However, I have figured out the trouble. While the error is
mine, I do have a couple of questions as to why I didn't discover it
sooner.
It turns out that I had fat-fingered the disktab entry I used to
disklabel the component disks such that the start of the raid partition was
at offset 0 relative to the entire disk, rather than offset 63, which is
what I normally use to work around PC BIOS routines and the like. Once I
figured that out, the error I was getting made sense.
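For the record, what I meant the entry to look like is something along these
lines, with the raid partition starting at sector 63 rather than 0. (The
geometry and size numbers below are made up for illustration; they aren't my
actual disks.)

raidcomp|RAID component (illustrative geometry):\
	:ty=winchester:se#512:ns#63:nt#255:sc#16065:su#976773168:\
	:pa#976773105:oa#63:ta=RAID:\
	:pd#976773168:od#0: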
With this in mind, my question and suggestion are as follows:
1. It makes sense to me that I would get an EROFS error if I try to
reconstruct to a protected portion of the component disk. What doesn't
make sense to me is why I could create the working raid set in the first
place. Why didn't I run into this error when writing the initial component
labels? Another symptom of this issue, although I didn't know about it at
the time, is that components of my newly created raid sets would fail with
an i/o failure, without any apparent whining from the component disk
itself. I think now that this was because the raid driver was trying to
update some portion of the component label and failing in the same way.
Ok, my bad for getting my offsets wrong in the disklabel for the component
disks, but can't we make it so this fails immediately upon raid creation
rather than having the trouble exhibit itself as apparently unexplained
component disk failures? (A rough sketch of the kind of check I have in
mind is at the end of this message, after the patch.)
2. I'd like to suggest the following quick patch to the raid driver to help
make the diagnosis of component failures easier. Thoughts?
-thanks
-Brian
--- rf_reconstruct.c 2011-01-04 15:32:20.000000000 -0800
+++ /var/tmp/rf_reconstruct.c 2011-01-20 16:36:14.000000000 -0800
@@ -1500,7 +1504,7 @@
Dprintf2("Reconstruction completed on psid %ld ru %d\n",
rbuf->parityStripeID, rbuf->which_ru);
if (status) {
- printf("raid%d: Recon write failed!\n", rbuf->raidPtr->raidid);
+ printf("raid%d: Recon write failed (status %d(0x%x)!\n",
rbuf->raidPtr->raidid,status,status);
rf_CauseReconEvent(rbuf->raidPtr, rbuf->col, arg,
RF_REVENT_WRITE_FAILED);
return(0);
}
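As an aside, since the status printed by that message is just an errno
value, a throwaway userland helper like the following (purely illustrative)
turns it into readable text, e.g. 30 comes back as "Read-only file system":

/* errstr.c -- print the message for each numeric errno given (illustrative). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
	int i;

	for (i = 1; i < argc; i++) {
		int e = atoi(argv[i]);
		printf("%d: %s\n", e, strerror(e));
	}
	return 0;
}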
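And here is the rough sketch I mentioned above of the kind of sanity check
raidctl or the kernel could apply at configuration time. It just reads each
component's disklabel and refuses a partition that starts at absolute
sector 0, on top of the protected label area. This is purely illustrative
-- it is not actual raidctl or RAIDframe code, and the partition-letter
parsing is deliberately naive:

/*
 * checkraidcomp.c -- illustrative only, not part of raidctl(8).
 * For each component device given, read the disk's label and complain
 * if the named partition starts at absolute sector 0, i.e. right on
 * top of the write-protected disklabel area.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/dkio.h>
#include <sys/disklabel.h>

#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int
check_component(const char *dev)
{
	struct disklabel dl;
	int fd, part;

	/* naive: take the partition letter from the end of the name,
	   e.g. /dev/rwd0e -> partition 'e' */
	part = dev[strlen(dev) - 1] - 'a';

	if ((fd = open(dev, O_RDONLY)) == -1)
		err(1, "open %s", dev);
	if (ioctl(fd, DIOCGDINFO, &dl) == -1)
		err(1, "DIOCGDINFO %s", dev);
	(void)close(fd);

	if (part < 0 || part >= dl.d_npartitions)
		errx(1, "%s: no such partition", dev);

	if (dl.d_partitions[part].p_offset == 0) {
		warnx("%s: partition starts at sector 0 and overlaps the "
		    "protected disklabel area", dev);
		return 1;
	}
	return 0;
}

int
main(int argc, char **argv)
{
	int i, bad = 0;

	for (i = 1; i < argc; i++)
		bad |= check_component(argv[i]);
	return bad;
}

Something along those lines, run against the components before the set is
configured, would have flagged my offset-0 mistake immediately instead of
letting it show up later as mysterious component failures.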
On Jan 21, 8:03am, Greg Oster wrote:
} Subject: Re: Problems with raidframe under NetBSD-5.1/i386
} On Thu, 20 Jan 2011 17:28:21 -0800
} buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:
}
} > hello. I got sidetracked from this problem for a while, but
} > I'm back to looking at it as I have time.
} > I think I may have been barking up the wrong tree with
} > respect to the problem I'm having reconstructing to raidframe disks
} > with wedges on the raid sets. Putting in a little extra info in the
} > error messages yields:
} > raid2: initiating in-place reconstruction on column 4
} > raid2: Recon write failed (status 30(0x1e)!
} > raid2: reconstruction failed.
} >
} > If that status number, taken from the second argument of
} > rf_ReconWriteDoneProc(), is an error from /usr/include/sys/errno.h,
} > then I'm getting EROFS when I try to reconstruct the disk.
}
} Hmmm... strange...
}
} > Wouldn't
} > that seem to imply that raidframe is trying to write over some
} > protected portion of one of the components, probably the one I can't
} > reconstruct to? Each of the components has a BSD disklabel on it, and
} > I know that the raid set actually begins 64 sectors from the start of
} > the partition in which the raid set resides. However, is a similar
} > "back set" done for the end of the raid? That is, does the raid set
} > extend all the way to the end of its partition or does it leave some
} > space at the end for data as well?
}
} No, it doesn't. The RAID set will use the remainder of the component,
} but up to a multiple of whatever the stripe width is... (that is, the
} RAID set will always end on a complete stripe.)
}
} > Here's the thought. I notice when
} > I was reading through the wedge code, that there's a reference to
} > searching for backup gpt tables and that one of the backups is stored
} > at the end of the media passed to the wedge discovery code. Since
} > the broken component is the last component in the raid set, I wonder
} > if the wedge discovery code is marking the sectors containing the gpt
} > table at the end of the raid set as protected, but for the disk
} > itself, rather than the raid set? I want to say that this is only a
} > theory at the moment, based on a quick diagnostic enhancement to the
} > error messages, but I can't think of another reason why I'd be
} > getting that error. I'm going to be in and out of the office over the
} > next week, but I'll try to see if I can capture the block numbers
} > that are attempting to be written when the error occurs. I think I
} > can do that with a debug kernel I have built for the purpose. Again,
} > this problem exists under 5.0, not just 5.1, so it predates Jed's
} > changes. If anyone has any other thoughts as to why I'd be getting
} > EROFS on a raid component when trying to reconstruct to it, but not
} > when I create the raid, I'm all ears.
}
} So when one builds a regular filesystem on a wedge, do they end up with
} the same problem with 'data' at the end of the wedge? If one does a dd
} to the wedge, does it report write errors before the end of the wedge?
}
} I really need to get my test box up-to-speed again, but that's going to
} have to wait a few more weeks...
}
} Later...
}
} Greg Oster
}
}
} > On Jan 7, 3:22pm, Brian Buhrow wrote:
} > } Subject: Re: Problems with raidframe under NetBSD-5.1/i386
} > } hello Greg. Regarding problem 1, the inability to reconstruct
} > } disks in raid sets with wedges in them, I confess I don't understand
} > } the vnode stuff entirely, but rf_getdisksize() in rf_netbsdkintf.c
} > } looks suspicious to me. I'm a little unclear, but it looks like it
} > } tries to get the disk size a number of ways, including by checking
} > } for a possible wedge on the component. I wonder if that's what's
} > } sending the reference count too high?
} > } -thanks
} > } -Brian
} > }
} > } On Jan 7, 2:17pm, Greg Oster wrote:
} > } } Subject: Re: Problems with raidframe under NetBSD-5.1/i386
} > } } On Fri, 7 Jan 2011 05:34:11 -0800
} > } } buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:
} > } }
} > } } > hello. OK. Still more info. There seem to be two bugs here:
} > } } >
} > } } > 1. Raid sets with gpt partition tables in the raid set are not
} > } } > able to reconstruct failed components because, for some reason,
} > } } > the failed component is still marked open by the system even
} > } } > after the raidframe code has marked it dead. Still looking
} > } } > into the fix for that one.
} > } }
} > } } Is this just with autoconfig sets, or with non-autoconfig sets
} > } } too? When RF marks a disk as 'dead', it only does so internally,
} > } } and doesn't write anything to the 'dead' disk. It also doesn't
} > } } even try to close the disk (maybe it should?). Where it does try
} > } } to close the disk is when you do a reconstruct-in-place -- there,
} > } } it will close the disk before re-opening it...
} > } }
} > } } rf_netbsdkintf.c:rf_close_component() should take care of closing
} > } } a component, but does something Special need to be done for
} > } } wedges there?
} > } }
} > } } > 2. Raid sets with gpt partition tables on them cannot be
} > } } > unconfigured and reconfigured without rebooting. This is
} > } } > because dkwedge_delall() is not called during the raid shutdown
} > } } > process. I have a patch for this issue which seems to work
} > } } > fine. See the following output:
} > } } [snip]
} > } } >
} > } } > Here's the patch. Note that this is against NetBSD-5.0
} > } } > sources, but it should be clean for 5.1, and, I'm guessing,
} > } } > -current as well.
} > } }
} > } } Ah, good! Thanks for your help with this. I see Christos has
} > } } already committed your changes too. (Thanks, Christos!)
} > } }
} > } } Later...
} > } }
} > } } Greg Oster
} > } >-- End of excerpt from Greg Oster
} > }
} > }
} > >-- End of excerpt from Brian Buhrow
} >
}
}
} Later...
}
} Greg Oster
>-- End of excerpt from Greg Oster