tech-kern archive

Re: Problems with raidframe under NetBSD-5.1/i386



On Fri, 7 Jan 2011 15:22:03 -0800
buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:

>       hello Greg.  Regarding problem 1, the inability to
> reconstruct disks in raid sets with wedges in them, I confess I don't
> understand the vnode stuff entirely, but rf_getdisksize() in
> rf_netbsdkintf.c looks suspicious to me.  I'm a little unclear, but
> it looks like it tries to get the disk size a number of ways,
> including by checking for a possible wedge on the component.  I
> wonder if that's what's sending the reference count too high? -thanks

In rf_reconstruct.c:rf_ReconstructInPlace() we have this:

        retcode = VOP_IOCTL(vp, DIOCGPART, &dpart, FREAD,
                        curlwp->l_cred);

I think that will fail for wedges... it should be doing:

        retcode = VOP_IOCTL(vp, DIOCGWEDGEINFO, &dkw, FREAD, l->l_cred);

for the wedge case (see rf_getdisksize()).  Now: since the kernel
prints:

 raid2: initiating in-place reconstruction on column 4
 raid2: Recon write failed!
 raid2: reconstruction failed.

it's somehow making it past that point... but maybe with the wrong
values?? (is there an old label on the disk or something??? )
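
To make the fallback idea concrete, here is a minimal userland sketch of "ask for partition info first, and use the wedge query if that fails". It is not the RAIDframe code: `stub_ioctl()`, `component_size()`, `struct component` and the `REQ_*` constants are all hypothetical stand-ins for the real `VOP_IOCTL()` calls with `DIOCGPART` and `DIOCGWEDGEINFO`, and the `ENOTTY` return for a mismatched request is an assumption made only so the control flow can be exercised outside the kernel.

```c
/*
 * Userland sketch only.  Every name below is hypothetical; the stub
 * stands in for VOP_IOCTL(vp, DIOCGPART, ...) and
 * VOP_IOCTL(vp, DIOCGWEDGEINFO, ...).
 */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum comp_kind { COMP_PARTITION, COMP_WEDGE };

/* Stand-ins for the two ioctl requests discussed above. */
enum size_req { REQ_PARTINFO, REQ_WEDGEINFO };

struct component {
	enum comp_kind kind;	/* what the component really is */
	uint64_t nsectors;	/* size the "driver" would report */
};

/*
 * Stub ioctl: answers only when the request matches the component
 * kind.  Returning ENOTTY for a mismatch is an assumption made for
 * this sketch, not a statement about the real drivers.
 */
int
stub_ioctl(const struct component *c, enum size_req req, uint64_t *out)
{
	if ((req == REQ_PARTINFO && c->kind == COMP_PARTITION) ||
	    (req == REQ_WEDGEINFO && c->kind == COMP_WEDGE)) {
		*out = c->nsectors;
		return 0;
	}
	return ENOTTY;
}

/*
 * Try the partition query first; if it fails, fall back to the
 * wedge query.  Returns 0 on success or the error from the last
 * query attempted.
 */
int
component_size(const struct component *c, uint64_t *out)
{
	int error;

	assert(c != NULL && out != NULL);

	error = stub_ioctl(c, REQ_PARTINFO, out);
	if (error == 0)
		return 0;
	return stub_ioctl(c, REQ_WEDGEINFO, out);
}
```

With a wedge component, the partition query fails in this sketch and the size comes from the wedge query instead; a reconstruct path that only ever issues the first request has no such second chance.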

Later...

Greg Oster

> On Jan 7,  2:17pm, Greg Oster wrote:
> } Subject: Re: Problems with raidframe under NetBSD-5.1/i386
> } On Fri, 7 Jan 2011 05:34:11 -0800
> } buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:
> } 
> } >   hello.  OK.  Still more info.  There seem to be two bugs here:
> } > 
> } > 1.  Raid sets with gpt partition tables in the raid set are not able
> } > to reconstruct failed components because, for some reason, the failed
> } > component is still marked open by the system even after the raidframe
> } > code has marked it dead.  Still looking into the fix for that one.
> } 
> } Is this just with autoconfig sets, or with non-autoconfig sets too?
> } When RF marks a disk as 'dead', it only does so internally, and doesn't
> } write anything to the 'dead' disk.  It also doesn't even try to close
> } the disk (maybe it should?).  Where it does try to close the disk is
> } when you do a reconstruct-in-place -- there, it will close the disk
> } before re-opening it...
> } 
> } rf_netbsdkintf.c:rf_close_component() should take care of closing a
> } component, but does something special need to be done for wedges there?
> } 
> } > 2.  Raid sets with gpt partition tables on them cannot be
> } > unconfigured and reconfigured without rebooting.  This is because
> } > dkwedge_delall() is not called during the raid shutdown process.  I
> } > have a patch for this issue which seems to work fine.  See the
> } > following output:
> } [snip]
> } > 
> } > Here's the patch.  Note that this is against NetBSD-5.0 sources, but
> } > it should be clean for 5.1, and, I'm guessing, -current as well.
> } 
> } Ah, good!  Thanks for your help with this.   I see Christos has already
> } committed your changes too. (Thanks, Christos!)
> } 
> } Later...
> } 
> } Greg Oster
> >-- End of excerpt from Greg Oster
> 