tech-kern archive
Re: Strange problem with raidframe under NetBSD-5.1
Hello Greg. I just updated to the latest 5.1 tree, but I don't see the
change you note in that update. I see the commit in the cvs logs, but it
doesn't look like it made it into the NetBSD-5 branch. The latest version
I see, even after combing through the source-changes archives on the
www.netbsd.org site, is ...2.44.8, which was a fix for a bug I reported with
wedges and raidframe some time ago. I could be missing something, and I
probably am, but it's not obvious to me. Could you look to see if you
see it on the NetBSD-5 branch?
-thanks
-Brian
On Jun 12, 3:30pm, Brian Buhrow wrote:
} Subject: Re: Strange problem with raidframe under NetBSD-5.1
} Hello. That appears to be the problem. I thought I updated my 5.1
} sources, but I've been doing so much patching, testing and patching with
} respect to the ffs fixes, that I guess I didn't actually get the latest
} sources. Doing that now. I think/hope that will fix me up.
}
} -thanks
} -Brian
} On Jun 12, 4:14pm, Greg Oster wrote:
} } Subject: Re: Strange problem with raidframe under NetBSD-5.1
} } On Tue, 12 Jun 2012 14:44:55 -0700
} } buhrow%lothlorien.nfbcal.org@localhost (Brian Buhrow) wrote:
} }
} } > Hello. I've just encountered a strange problem with
} } > raidframe under NetBSD-5.1 that I can't immediately explain.
} } >
} } > This machine has been running a raid set since 2007. The raid
} } > set was originally constructed under NetBSD-3. For the past year,
} } > it's been running 5.0_stable with sources from
} } > July 2009 or so without a problem. Last night, I installed
} } > NetBSD-5.1 with sources from May 23 2012 or so. Now, the raid0 set
} } > fails the first component with an i/o error with no corresponding
} } > disk errors underneath. Trying to reconstruct to the failed
} } > component also fails with an error of 22, invalid argument. Looking
} } > at the dmesg output compared with the output of raidctl -s reveals
} } > the problem. The size of the raid in the dmesg output is bogus, and,
} } > if the raid driver tries to write as many blocks as is reported by
} } > the configuration output, it will surely fail as it does. However,
} } > raidctl -g /dev/wd0a looks ok and the underlying disk label
} } > on /dev/wd0a looks ok as well. Where does the raid driver get the
} } > numbers it reports on bootup? Also, there is a second raid set on
} } > this machine, the second half of the same two drives, which was
} } > constructed at the same time. It works fine with the new code.
} } >
} } > Below is the output of the boot sequence before the upgrade,
} } > and then the boot sequence after the upgrade. Below that are the
} } > output of raidctl -s raid0 and raidctl -g /dev/wd0a raid0.
} } > It looks to me like something is not zero'd out in the
} } > component label that should be, but some change in the raid code is
} } > no longer ignoring the noise in the component label.
} }
} } Correct.
} }
} } > Any ideas?
} }
} } There was some code added a while back to handle components whose sizes
} } were larger than 32-bit. But 5.1_stable should have the code to handle
} } those 'bogus' values in the component label and do the appropriate
} } thing (see rf_fix_old_label_size in rf_netbsdkintf.c version
} } 1.250.4.11, for example).
} }
} } What is your code rev for src/sys/dev/raidframe/rf_netbsdkintf.c ?
} }
} } Later...
} }
} } Greg Oster
} >-- End of excerpt from Greg Oster
}
}
>-- End of excerpt from Brian Buhrow
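
For readers following the thread, the idea Greg describes (support for
sizes larger than 32 bits, plus a fix-up for 'bogus' values in old
component labels) can be sketched in a few lines of C. This is not the
code from rf_netbsdkintf.c. The struct, its field names and both
functions below are hypothetical, and the rule used here (clear the high
word when the combined size is larger than the component can hold) is
only an assumption about what "the appropriate thing" might look like.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a component label.
 * Assumption: the size is stored as two 32-bit words, and labels
 * written by older code never initialised the high word.
 */
struct demo_label {
    uint32_t num_blocks_lo;  /* low 32 bits of the size in blocks */
    uint32_t num_blocks_hi;  /* high 32 bits; may be noise in old labels */
};

/* Combine the two words into one 64-bit block count. */
static uint64_t
demo_label_blocks(const struct demo_label *l)
{
    return ((uint64_t)l->num_blocks_hi << 32) | l->num_blocks_lo;
}

/*
 * If the high word is set and the combined size exceeds what the
 * underlying component can hold, treat the high word as noise and
 * clear it. Returns 1 if the label was changed, 0 otherwise.
 */
static int
demo_fix_old_label_size(struct demo_label *l, uint64_t component_sectors)
{
    if (l->num_blocks_hi != 0 && demo_label_blocks(l) > component_sectors) {
        l->num_blocks_hi = 0;
        return 1;
    }
    return 0;
}
```

Under these assumptions, a label with a sane low word and garbage in the
high word is repaired in memory, while a label that really does describe
a component larger than 32 bits' worth of blocks is left alone, because
its size still fits within the component.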