Subject: Re: Data Corruption
To: None <bmcewen@comcast.net>
From: Daniel Cox <dc@microbits.com.au>
List: netbsd-users
Date: 09/27/2005 15:31:03
Thank you for your response.

I have had a good look through all of the archives, searching on
corrupt/crc error.

Most refer to NIC issues or a corrupt filesystem.
fsck always returns without error (and I have been recreating the
filesystem before each test).

I found a number of threads under netbsd-cobalt from late 2004 but
couldn't find the solution. Those problems seem to predate the 2.0
release, so any fix would presumably be in 2.0.2, which I am running now.

I will try -current on the original hardware and see if that fixes the
problem.
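
For anyone wanting to reproduce the cp test from my earlier message, here is a minimal sketch of how it could be repeated to show where the copies differ. It assumes a POSIX sh with cp, cmp and head available; the function name copycheck and the copy.N file names are hypothetical, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC into DESTDIR several times and report
# every copy whose bytes differ from the source.
# Usage: copycheck SRC DESTDIR [COUNT]   (COUNT defaults to 5)
copycheck() {
    src=$1
    dir=$2
    count=${3:-5}
    bad=0
    i=1
    while [ "$i" -le "$count" ]; do
        dst="$dir/copy.$i"
        cp "$src" "$dst" || return 2
        # cmp -s is silent and only sets the exit status
        if ! cmp -s "$src" "$dst"; then
            echo "copy $i differs from $src"
            # cmp -l prints the offset and octal values of each differing
            # byte; show only the first 20 to keep the output short
            cmp -l "$src" "$dst" | head -n 20
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    echo "$bad of $count copies differ"
    # Succeed only if every copy matched the source
    [ "$bad" -eq 0 ]
}
```

Something like `copycheck b.tgz /mnt/test 10` (the mount point is only an example) would then list the offsets of any differing bytes, which might show whether the damage falls on a regular boundary.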


>>> Brian McEwen <bmcewen@comcast.net> 09/27/05 12:22 PM >>>


You may wish to check netbsd-cobalt archives for data corruption
topics.  There was an extended period of people with this exact
problem; it did get resolved; I'm running -current on my MIPS machine
and it's looking OK at this time.

I don't recall seeing a similar issue in the intel- flavor but that
doesn't mean it wasn't brought up here as well.

hope that helps; a little anyway.

-Brian


On Sep 26, 2005, at 8:53 PM, Daniel Cox wrote:

> I have recently replaced a machine with completely new hardware
> (including box/disk/cpu/ram/mb) and upgraded from NetBSD 1.6.2 to
> NetBSD 2.0.2 to try and fix some data corruption issues.
>
> It appeared to be working fine but I am noticing similar problems
> again with nightly backup files. Specifically: invalid compressed
> data--crc error - when testing dump and tar files before burning to
> CD.
>
> The system otherwise works perfectly, never crashes and user files
> do not appear to be corrupt (yet).
>
> I have just completed some further testing:
> - set up a new 15G partition, wd0k, and ran: newfs /dev/wd0k
> - mounted it with AND without softdep (newfs each time)
> - created a tar file (t.tar) with about 170MB of data from /home
> - ran a gzip/gunzip test
>
> gzip -c t.tar | tee a.tgz | gunzip -t
> gzip -c t.tar | tee b.tgz | gunzip -t
>
> This command returns no error either time, which suggests that
> compression and decompression are fine, and so is system memory.
>
> BUT: md5 a.tgz b.tgz
> Returns a different checksum for each (note: size is the same)!
> MD5 (a.tgz) = cff3aa516877ec55112ec671c2934afd
> MD5 (b.tgz) = 274d02fa6fe1db51b6b92779fbf2f8ce
>
> Neither of the saved files can be uncompressed - CRC error.
>
> AND: cp b.tgz c.tgz
> Returns a different checksum again!
> MD5 (c.tgz) = da7972fea5cbef5d78a3410b32b8d436
>
> A much smaller tar file containing just /etc/fstab works fine with
> these tests.
>
> It seems that reading the data is not the problem. md5 always
> returns the same result for the same filename.
>
> What can I do/use to try and trace this problem?
>
> The fact that a similar problem has occurred on 2 different sets of
> hardware suggests there may be a bug here that could be affecting
> others.
> I think I need a bit more detail before a bug report can be filed
> though.
>
> Am I posting to the correct list?
>
> Thank you for any help you may be able to provide
> Daniel Cox.
>