Subject: Re: Data corruption issues possibly involving cgd(4)
To: Daniel Carosone <dan@geek.com.au>
From: Nino Dehne <ndehne@gmail.com>
List: current-users
Date: 01/16/2007 08:00:14

On Tue, Jan 16, 2007 at 05:24:12PM +1100, Daniel Carosone wrote:
> any chance you could test with a RAID5 - ideally from the same RAID5 -
> without cgd?  It could be a controller or drive problem, or even a
> power supply problem when all drives are active.  RAID1 won't
> necessarily hit those conditions, especially for read.

I get your drift. I can't make a filesystem directly on the RAID5, but
see below.


> You could probably achieve the same result dd'ing a constant chunk of
> encrypted data off the raid(4) device to checksum, avoiding the need
> to destroy or remake filesystems.  If you reproduce the problem like
> this, you have also eliminated filesystem bugs.  

Excellent advice, thanks. Unfortunately, I can't reproduce the issue this
way.

I aborted the test after 50 runs of
dd if=/dev/rcgd0d bs=65536 count=4096 | md5 without a single error.
Replacing rcgd0d with cgd0a made no difference. Although I don't think it
was necessary, I ran the same test against rraid1d: no errors after 50
runs either. For comparison, a loop on the filesystem on the cgd has now
aborted after the 14th run.
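
For anyone who wants to repeat the raw-device test described above, here is
a minimal sketch of such a loop. It is not the exact loop I ran. The
function name check_reads, its argument order and the fallback from md5 to
md5sum or cksum are assumptions made for illustration only.

```sh
#!/bin/sh
# Hypothetical helper: read the same chunk repeatedly and compare checksums.
# Usage: check_reads SOURCE RUNS [BS] [COUNT]

# Pick a checksum tool. md5 is what the test above used; md5sum and cksum
# are fallbacks assumed here for systems that lack md5.
if command -v md5 >/dev/null 2>&1; then
    SUM=md5
elif command -v md5sum >/dev/null 2>&1; then
    SUM=md5sum
else
    SUM=cksum
fi

check_reads() {
    src=$1
    runs=$2
    bs=${3:-65536}
    count=${4:-4096}

    # The first read supplies the reference checksum.
    ref=$(dd if="$src" bs="$bs" count="$count" 2>/dev/null | $SUM)

    i=1
    while [ "$i" -le "$runs" ]; do
        cur=$(dd if="$src" bs="$bs" count="$count" 2>/dev/null | $SUM)
        if [ "$cur" != "$ref" ]; then
            # Stop at the first read that differs from the reference.
            echo "mismatch on run $i: $cur != $ref"
            return 1
        fi
        i=$((i + 1))
    done
    echo "ok: $runs runs, checksum $ref"
}
```

Called as check_reads /dev/rcgd0d 50, this reads the same 256 MB from the
device 50 times after the reference read and stops at the first checksum
that differs.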

So the issue doesn't seem to be related to the power supply either and
frankly, it's starting to freak me out.


> > Please help, I'm at a loss.
> 
> It's a tricky one, but the above would be my next guess, and the next
> useful thing to try to eliminate.

So there, I'm even more at a loss now. :)

Thanks for the help. Best regards,

ND