Subject: Re: data corruption using gzip
To: None <jsarkes@tiac.net>
From: Todd Kover <kovert@omniscient.com>
List: current-users
Date: 03/09/2001 10:51:51
 > > It's miscfs/genfs/genfs_vnops.c
 > > 
 > > Try updating it to the latest version.
 > 
 > I have done so. Unfortunately, I am still having
 > problems, using bzip2 also. Apparently I have some
 > type of hardware problem that manifests itself 
 > under a tar -czf whatever.gz src doc pkgsrc xsrc

It appears that I'm not nuts...

Here's my story:

I have a ~1mo old system (asus a7v, athlon 1k, 1 DIMM of 256m pc133
ECC memory, 2 IBM 45g 7200 IDE disks on the primary ATA/100 IDE bus,
one 18g seagate LVD drive on an adaptec LVD controller, and a DLT4k on
another adaptec scsi-2 controller.  There's also a pci soundblaster live
that netbsd doesn't use, an intel etherexpress pro10/100+ card, and a
matrox g450 video card, so I'm not accused of being incomplete).

I've seen this under 1.5-release and under a kernel from -current as of
the morning of 3/5 (which appears to have the latest genfs_vnops.c).

I've specifically found gunzip's to be unreliable, rather than gzip's (or
at least, I know I'm seeing problems with gunzipping), as well as some other
anomalies working with largish (from several hundred k to 3gb) filesets.
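
One way to make a failure like this repeatable is a round-trip check: write
known data out as a .gz, read it back, and compare digests.  The following
is only a minimal sketch in Python, under stated assumptions; the function
name roundtrip_failures and its parameters are hypothetical, and it assumes
the test data fits in memory.  Note that the read-back may be served from
the buffer cache rather than the disk, so a clean run does not prove the
disk path is good.

```python
import gzip
import hashlib
import os
import tempfile
import zlib


def roundtrip_failures(data, passes=10, workdir=None):
    """Compress `data` to a temporary .gz file `passes` times, read each
    file back, and return a list of (pass_number, reason) for every pass
    whose decompressed contents do not match the original.

    `workdir` (hypothetical parameter) selects the directory, and so the
    disk, that the temporary files are written to.
    """
    expected = hashlib.sha256(data).hexdigest()
    failures = []
    for i in range(passes):
        fd, path = tempfile.mkstemp(suffix=".gz", dir=workdir)
        os.close(fd)
        try:
            with gzip.open(path, "wb") as f:
                f.write(data)
            try:
                with gzip.open(path, "rb") as f:
                    got = hashlib.sha256(f.read()).hexdigest()
            except (OSError, EOFError, zlib.error) as exc:
                # Bad header, bad CRC, truncated or damaged stream.
                failures.append((i, repr(exc)))
                continue
            if got != expected:
                failures.append((i, "digest mismatch"))
        finally:
            os.remove(path)
    return failures
```

Running it with workdir pointed at a directory on each disk in turn (the
LVD drive, then each IDE disk) would show whether the failures follow a
particular disk or controller.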

I've got an older system that I rdump'd to this one, and a restored .gz
file from that dump fails to gunzip (and also fails to gunzip when that
file system is nfs mounted to the older system it was created on).

I've found gzcat's of gzip'd smbtar's generated locally to have problems
uncompressing.  I've found gzcat's of tar.gz's generated over nfs to
have problems, too.  I've also seen gzip's of samba dumps have trouble
uncompressing. (this box is also my amanda host).

I've also had one example of doing a '(tar cf - . ) | (cd ~/foo ; tar xpf -)'
corrupt a source file contained in the copy.
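
A copy like this can be checked after the fact by comparing per-file
digests of the two trees.  What follows is a minimal sketch in Python, not
the method used above; the names file_digest, tree_digests and
compare_trees are hypothetical, and it only looks at regular files
(symlinks, permissions and ownership are ignored).

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file under `root` (by relative path) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(src, dst):
    """Return the sorted relative paths that are missing from one tree
    or whose contents differ between the two trees."""
    a, b = tree_digests(src), tree_digests(dst)
    return sorted(rel for rel in a.keys() | b.keys() if a.get(rel) != b.get(rel))
```

An empty list from compare_trees(".", os.path.expanduser("~/foo")) would
mean every regular file matched; running it more than once is worthwhile
here, since a bad read can also corrupt the comparison itself.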

I've found that I was able to get the problem to not happen once on the
LVD drive, but only once (/, swap, /usr are all on the LVD drive, as
well as home directories).  Copying this same dump to either of the IDE
disks and restoring the .gz file yielded something that gzcat found to
be corrupt (this is an ~3gb dump image).

I had some large gzip'd pnm files on an indigo^2 running irix that I
copied over to the LVD drive (tar cf - | ssh tar xpf ), and one came over
corrupted.  I was able to copy it again, and that copy did not end up
corrupted.

And, I downloaded a tarball of oracle 8.1.7 for linux 2 days ago on a win
box, copied it over to this netbsd box, extracted it, ran mkisofs on it to
generate a CD image, copied that BACK over to the win box to burn a CD of
it, and ended up with an image that's different from what I get by taking
the same original tarball and extracting it on the linux box destined to
run oracle (so says diff -dru, anyway).

I also had a 3com 3c905C-TX in it that would get a bogus ethernet
address on bootup (it didn't have the bogus address for the 2 hrs I had
freebsd loaded on it before I put 1.5 on).  I swapped in an intel
etherexpress for it, since many of my problems were manifesting
themselves on things that came to the machine over the network, but that
does not appear to have made a difference.

FWIW, I've got a similarly configured a7v system (no scsi, 3com netcard)
that runs win2k server that's as stable as you expect windows systems to
be.  My next step was going to be swapping the system board and memory
over.

It sounds like mine may not be the only a7v system showing anomalies,
however, so now I'm not sure what the right thing to do is.

-Todd