Current-Users archive


Re: raidframe fun



On Fri, Aug 16, 2013 at 03:57:23PM -0600, Greg Oster wrote:
> On Fri, 16 Aug 2013 22:30:58 +0100
> Patrick Welche <prlw1%cam.ac.uk@localhost> wrote:
> 
> > On Fri, Aug 16, 2013 at 03:22:28PM -0600, Greg Oster wrote:
> > > > 
> > > > They didn't quite find each other again...
> > > 
> > > Right.. so the question is why?  This is not supposed to happen (and
> > > the only way I've ever seen it happen before is if a user gets
> > > playing tricks with the raid devices and attempts to re-combine
> > > components after they've been configured to different RAID sets...)
> > 
> > So strictly those 2 disks were the only 2 disks in a functioning
> > NetBSD box, they were then added to another box which also had a
> > raid array, at which point I had kern/38241 _kernel_lock: spinout
> > fun, and then loads of reboots failing to get kgdb to work.  It is
> > only just now after a boot -1 that I actually got past the kernel
> > panic, and this is what I saw. I haven't been in a position to play
> > tricks yet... There is an outside chance that they may have met
> > seatools, but I think that those were other disks...
> > 
> > BTW I don't think the data on those disks is worth worrying about,
> > but I'm happy to do a bit of debugging with you if you think there
> > is something to be fixed...
> 
> I don't think seatools would have done anything... what likely happened
> was that the 'other box' with a RAID array likely had a raid0... that
> would mean that the 'newly introduced disks', which also likely had
> been raid0 at last configuration, would have auto-configured to raid9
> (or whatever the highest raid available might have been).  If only one
> of them got configured, and somehow the label got written back out with
> the 'new raid id', then the raid ids would get out-of-sync and that
> would make raidframe think that they were from different raid sets.
> You probably would have been ok had both of them gotten re-configured,
> but it's my guess that only one got done before the panic, at which
> point things are badly messed up.
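
To make the scenario described above concrete, here is a minimal sketch
in C. The struct, its field names and the matching rule are hypothetical
simplifications for illustration only; they are not RAIDframe's actual
component label layout or its real matching code. The sketch assumes two
components are grouped into one set only when every label field agrees.

```c
#include <stdbool.h>

/* Hypothetical, simplified component label (NOT the real RAIDframe one). */
struct label {
    int serial;    /* identity given to the set when it was created */
    int last_unit; /* raid unit this component was last configured as */
};

/* Assumed rule: components belong together only if all fields match. */
static bool same_set(const struct label *a, const struct label *b)
{
    return a->serial == b->serial && a->last_unit == b->last_unit;
}

/* Simulate a label being written back out with a new raid id. */
static void relabel(struct label *l, int new_unit)
{
    l->last_unit = new_unit;
}
```

Under these assumptions, two labels that both say unit 0 match, they stop
matching once only one of them has been rewritten as unit 9, and they
match again if the second one is rewritten as well.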

It just occurred to me that both autoconfigured root raid sets must
have been correctly identified for raidframe to set RB_ASKNAME and
ask which one should be booted from, which is the "boot -a" condition
for the panic. So yes, first time around all must have been fine,
then kern/38241 caused trouble...

Thanks!

Patrick

