Subject: kern/37590: Writing data to a filesystem on an external USB drive fails
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <rillig@NetBSD.org>
List: netbsd-bugs
Date: 12/21/2007 19:40:00
>Number:         37590
>Category:       kern
>Synopsis:       Writing data to a filesystem on an external USB drive fails
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Dec 21 19:40:00 +0000 2007
>Originator:     Roland Illig
>Release:        NetBSD 4.99.30
>Organization:
>Environment:
NetBSD bacc.roland-illig.de 4.99.30 NetBSD 4.99.30 (GENERIC) #2: Fri Aug 31 20:40:16 CEST 2007  build@bacc.roland-illig.de:/home/scratch/build/NetBSD/2007-08/work/sys/arch/i386/compile/GENERIC i386

>Description:
I have an external USB disk:

pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
NVIDIA nForce MCP55 Memory Controller (RAM memory, revision 0xa1) at pci0 dev 0 function 0 not configured
...
ehci0 at pci0 dev 2 function 1: NVIDIA nForce MCP55 EHCI USB Controller (rev. 0xa2)
APCL: Picked IRQ 21 with weight 0
ehci0: interrupting at ioapic0 pin 21 (irq 10)
ehci0: BIOS has given up ownership
ehci0: EHCI version 1.0
ehci0: companion controller, 10 ports each: ohci0
usb1 at ehci0: USB revision 2.0
uhub1 at usb1
uhub1: NVIDIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
...
umass0 at uhub1 port 9 configuration 1 interface 0
umass0: Western Digital External HDD, rev 2.00/1.06, addr 2
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: <WD, 5000AAK External, 1.06> disk fixed
sd0: 465 GB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 976773168 sectors

When I write files to a filesystem on this disk and read them back, the contents of some sectors have changed. Writing directly to the disk device (without a filesystem) works.

>How-To-Repeat:

# disklabel sd0 | tail -n 4
#        size    offset     fstype [fsize bsize cpg/sgs]
 c: 976773105        63     unused      0     0        # (Cyl.      0*- 969020)
 d: 976773168         0     unused      0     0        # (Cyl.      0 - 969020)
 e: 976773105        63     4.2BSD      0     0     0  # (Cyl.      0*- 969020)

newfs /dev/rsd0e
mount_ffs /dev/sd0e /mnt/backup

bacc:~ # dd if=/home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar of=/mnt/backup/home2.tar bs=1048576 count=128
128+0 records in
128+0 records out
134217728 bytes transferred in 7.360 secs (18236104 bytes/sec)
bacc:~ # cmp -l /home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar /mnt/backup/home2.tar | sed 10q
cmp: EOF on /mnt/backup/home2.tar: char 134217729, line 2098791

bacc:~ # dd if=/home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar of=/mnt/backup/home2.tar bs=1048576 count=256
256+0 records in
256+0 records out
268435456 bytes transferred in 15.730 secs (17065191 bytes/sec)
bacc:~ # cmp -l /home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar /mnt/backup/home2.tar | sed 10q
cmp: EOF on /mnt/backup/home2.tar: char 268435457, line 3274554

bacc:~ # dd if=/home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar of=/mnt/backup/home2.tar bs=1048576 count=512
512+0 records in
512+0 records out
536870912 bytes transferred in 32.433 secs (16553230 bytes/sec)
bacc:~ # cmp -l /home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar /mnt/backup/home2.tar | sed 10q
909313  65   0
909314   0  57
909315   0 150
909316   0 157
909317  66 155
909318   0 145
909319   0  57
909320   0 162
909321  64 157
909322   0 154

bacc:~ # dd if=/home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar of=/mnt/backup/home2.tar bs=1048576 count=511
511+0 records in
511+0 records out
535822336 bytes transferred in 32.208 secs (16636311 bytes/sec)
bacc:~ # cmp -l /home/scratch/roland/backup/2006-09-02/home-2006-09-02.tar /mnt/backup/home2.tar | sed 10q
3399681 225 164
3399682   2 141
3399683   0 154
3399684   0 157
3399685 226 147
3399686   2   0
3399687   0 153
3399688   0 144
3399689  37 145
3399690   4 154
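
A note on reading the cmp output: cmp -l prints 1-based byte offsets, so it can help to convert them into 512-byte block numbers. The following is only a small sketch; offset_to_sector is a hypothetical helper, the 512-byte sector size is taken from the sd0 line in the dmesg output, and the resulting block numbers are relative to the start of the file, not to the disk.

```shell
# Convert a 1-based cmp -l byte offset into a 0-based 512-byte block
# index within the file, plus the byte position inside that block.
# Assumption: 512 bytes per sector, as reported for sd0.
offset_to_sector() {
    off=$(( $1 - 1 ))
    echo "$(( off / 512 )) $(( off % 512 ))"
}

offset_to_sector 909313     # prints "1776 0"
offset_to_sector 3399681    # prints "6640 0"
```

In both failing runs the first differing byte sits exactly at the start of a 512-byte block of the file.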

>Fix:
I have no idea.

In 2008, I plan to write a hard disk checker, analogous to memtest, to determine whether the disk or the filesystem is at fault. I strongly suspect the filesystem, since writing directly to /dev/rsd0d works fine.
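
A checker of that kind could start out as a simple write-and-verify pass. The following is a rough sketch under stated assumptions, not a finished tool: make_pattern and check_target are hypothetical names, the pattern format (each 512-byte block carries its own index) is just one possible choice, and only standard awk, dd, cmp, sed and mktemp are used. Because every block is self-identifying, a mismatch would also show which block's data ended up in the wrong place.

```shell
#!/bin/sh
# make_pattern N: emit N blocks of 512 bytes each. Block i consists of
# the decimal index i, zero-padded to 511 digits, followed by a newline.
make_pattern() {
    awk -v n="$1" 'BEGIN { for (i = 0; i < n; i++) printf "%0511d\n", i }'
}

# check_target TARGET N: write N pattern blocks to TARGET, read the same
# number of blocks back, and report the first differing bytes, if any.
# TARGET may be a file on the mounted filesystem or a raw device.
check_target() {
    target=$1 blocks=$2
    tmp=$(mktemp /tmp/pattern.XXXXXX) || return 2
    make_pattern "$blocks" > "$tmp"
    dd if="$tmp" of="$target" bs=512 2>/dev/null || { rm -f "$tmp"; return 2; }
    sync
    if dd if="$target" bs=512 count="$blocks" 2>/dev/null | cmp -s "$tmp" -; then
        echo "ok: $blocks blocks verified"
        status=0
    else
        echo "MISMATCH, first differing bytes (offset, expected, found):"
        dd if="$target" bs=512 count="$blocks" 2>/dev/null | cmp -l "$tmp" - | sed 10q
        status=1
    fi
    rm -f "$tmp"
    return $status
}
```

For example, `check_target /mnt/backup/pattern.bin 1048576` would write and verify 512 MB, the size at which the corruption showed up above. Note that the read-back may be served from the buffer cache, so the test size should exceed available memory, or the filesystem should be unmounted and remounted between writing and verifying.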