tech-kern archive


Re: Problems with raidframe under NetBSD-5.1/i386



        Hello Greg.  Both problems apply to auto-configured raid sets as well
as ones configured manually.  Now that you've told me where to look, I have
an idea of the problem.  There's a note in dev/dkwedge/dk.c which reminds
the reader that you can open a wedge as many times as you want, but only
the last close will cause the wedge to actually be closed.  That is true of
kernel devices in general, but I think it's what's biting us here.  If you
unconfigure a raid set, the component device becomes available for reading
and writing again.  That's good, because it tells us that the problem is in
raidframe itself, which reduces the scope of the search.  Beyond that, I'm
not sure what's going wrong.  My only theory is that one of the components'
"wedge label locations" coincides with the wedge label in the raid set
itself, causing the component to have a wedge silently opened during the
configuring process.  I'm pretty sure it's not an auto-configuring problem
because if you unconfigure the raid set and reconfigure it manually, the
problem recurs.
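
        To illustrate the last-close behaviour and the theory above, here is
a toy model in C.  It is only a sketch: struct toy_wedge, toy_open(),
toy_close() and toy_leaked_open_leaves_busy() are hypothetical names made
up for this example, and none of it is taken from dk.c or the raidframe
sources.  It shows that one extra, unmatched open is enough to leave the
component marked open after the raid layer has closed it once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a wedge; NOT the real kernel structure. */
struct toy_wedge {
    int  open_count;  /* number of outstanding opens */
    bool really_open; /* is the underlying device actually open? */
};

/* The first open does the real work; later opens only bump the count. */
void toy_open(struct toy_wedge *w)
{
    if (w->open_count++ == 0)
        w->really_open = true;
}

/* Only the last close actually closes the underlying device. */
void toy_close(struct toy_wedge *w)
{
    assert(w->open_count > 0);
    if (--w->open_count == 0)
        w->really_open = false;
}

/*
 * Models the theorized failure: the component is opened twice (once by
 * the raid layer, once silently during configuration) but closed only
 * once.  Returns true if the component is still open afterwards.
 */
bool toy_leaked_open_leaves_busy(void)
{
    struct toy_wedge w = { 0, false };

    toy_open(&w);   /* raid layer opens the component */
    toy_open(&w);   /* assumed silent second open while configuring */
    toy_close(&w);  /* raid layer closes the component once */

    return w.really_open;
}
```

If the theory is right, the fix would amount to finding the unmatched open
(or adding the missing close) so that the count returns to zero.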
        I'll keep reading and maybe it will come to me.  Hopefully it's as
easy a fix as the fix for kern/44340 was.

-Brian
On Jan 7,  2:17pm, Greg Oster wrote:
} Subject: Re: Problems with raidframe under NetBSD-5.1/i386
} On Fri, 7 Jan 2011 05:34:11 -0800
} buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:
} 
} >     Hello.  OK.  Still more info.  There seem to be two bugs here:
} > 
} > 1.  Raid sets with gpt partition tables in the raid set are not able
} > to reconstruct failed components because, for some reason, the failed
} > component is still marked open by the system even after the raidframe
} > code has marked it dead.  Still looking into the fix for that one.
} 
} Is this just with autoconfig sets, or with non-autoconfig sets too?
} When RF marks a disk as 'dead', it only does so internally, and doesn't
} write anything to the 'dead' disk.  It also doesn't even try to close
} the disk (maybe it should?).  Where it does try to close the disk is
} when you do a reconstruct-in-place -- there, it will close the disk
} before re-opening it... 
} 
} rf_netbsdkintf.c:rf_close_component() should take care of closing a
} component, but does something Special need to be done for wedges there?
} 
} > 2.  Raid sets with gpt partition tables on them cannot be
} > unconfigured and reconfigured without rebooting.  This is because
} > dkwedge_delall() is not called during the raid shutdown process.  I
} > have a patch for this issue which seems to work fine.  See the
} > following output:
} [snip]
} > 
} > Here's the patch.  Note that this is against NetBSD-5.0 sources, but
} > it should be clean for 5.1, and, I'm guessing, -current as well.
} 
} Ah, good!  Thanks for your help with this.   I see Christos has already
} committed your changes too. (Thanks, Christos!)
} 
} Later...
} 
} Greg Oster
>-- End of excerpt from Greg Oster



