NetBSD-Users archive


Re: RAID reconstruction hangs the whole system



On Fri, 6 Dec 2013 01:54:50 +0200
Jarmo Jaakkola <netbsd-users%roskakori.fi@localhost> wrote:

> On Thu, Dec 05, 2013 at 04:32:55PM -0600, Greg Oster wrote:
> > On Thu, 5 Dec 2013 23:30:06 +0200
> > Jarmo Jaakkola <netbsd-users%roskakori.fi@localhost> wrote:
> > > On Thu, Dec 05, 2013 at 02:04:09PM -0600, Greg Oster wrote:
> > > > On Thu, 5 Dec 2013 21:37:53 +0200
> > > > Jarmo Jaakkola <netbsd-users%roskakori.fi@localhost> wrote:
> > If you check your disklabels and such, I think what you'll see is
> > the size of your RAID 1 set is actually about 2x what it should
> > be.... 
> 
> Actually it is the size I was expecting: the size of a single component
> minus 2*64 sectors.  I hope I would have noticed if the size had been
> larger than I expected ;D
> 
>     # for i in 1 3 4; do disklabel wd${i} | grep ' a:'; done
>      a: 131072 2048 RAID
>      a: 131072 2048 RAID
>      a: 131072 2048 RAID
>     # disklabel raid0 | grep 'total sectors'
>     total sectors: 130944

Hmm..... that's not what I'd have expected....
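
For reference, the arithmetic behind the quoted numbers can be checked
with a short shell sketch.  The 2*64 sector figure is taken from the
quoted message and is an assumption here, not a statement about
RAIDframe internals; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: reproduces the size arithmetic from the quoted message.
# Assumption: the RAID set is 2*64 sectors smaller than one component.
expected_raid_size() {
    component_sectors=$1
    reserved_sectors=$((2 * 64))
    echo $((component_sectors - reserved_sectors))
}

# Component size as reported by 'disklabel wd${i}' above.
expected_raid_size 131072
```

With the 131072-sector components shown above this gives 130944, which
matches the 'total sectors' line from 'disklabel raid0'.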

> > > The RAID set configured and seems to work just fine except for
> > > the  reconstruction problems.
> > 
> > I'm surprised it did, as it's technically missing the '4th
> > component' that it would need to work properly....   Essentially
> > your wd4a is not mirrored anywhere, and doesn't have anywhere to
> > rebuild to -- and it might be this latter fact that is causing
> > things to hiccup when you try to rebuild.....
> 
> I wonder if this also caused the other problem I remember having: I
> was not able to remove a hot spare once it was added.  I.e. "raidctl
> -r" did nothing.  I'll try to confirm that one way or another when I
> start fixing this muck up.

'raidctl -r' doing nothing is documented in the BUGS section of 
'man raidctl'. :-/  It's not related to this.

> > Haven't looked at all the relevant code, but at least
> > the RAID 1 config bits in the kernel don't seem to check to make
> > sure there are an even number of components provided, and I'm
> > betting raidctl doesn't do so either :(  So yes, it 'worked', but
> > wasn't really providing the RAID 1 you were expecting :(  At a
> > minimum I should have added more error checking to enforce the even
> > number of components requirement...
> 
> I'll change to using only two components then.  I was going to submit
> a PR for this as a TODO for you (I've gotten the idea that you work on
> RAIDframe).  I found that this issue has been reported before as
> #45162:
> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=45162
> I'd say the severity could be raised from "non-critical". :)

Ah!  I apparently neglected to take ownership of that PR, and have
since forgotten about it!
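
Until such a check exists, the even-number requirement discussed above
could be guarded against by hand before configuring a RAID 1 set.  The
following is only a sketch: check_raid1_components is a hypothetical
helper, not part of raidctl or RAIDframe, and it looks at nothing but
the number of component names it is given.

```shell
#!/bin/sh
# Hypothetical pre-flight check; NOT part of raidctl.
# Succeeds only if an even, non-zero number of components is passed.
check_raid1_components() {
    count=$#
    if [ "$count" -lt 2 ]; then
        echo "RAID 1 needs at least two components, got $count" >&2
        return 1
    fi
    if [ $((count % 2)) -ne 0 ]; then
        echo "RAID 1 needs an even number of components, got $count" >&2
        return 1
    fi
    return 0
}

# Example: the three-component layout from this thread is rejected.
check_raid1_components /dev/wd1a /dev/wd3a /dev/wd4a ||
    echo "refusing to configure this set"
```

Called with two components, for example /dev/wd1a and /dev/wd3a, the
function returns success and prints nothing.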

> Thank you very much for your help!

No problem.

Later...

Greg Oster

