NetBSD-Bugs archive
kern/41024: wapbl causes file system corruption
>Number: 41024
>Category: kern
>Synopsis: wapbl causes file system corruption
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Mar 16 07:10:00 +0000 2009
>Originator: Alan Barrett
>Release: NetBSD 5.99.8
>Organization:
Not much
>Environment:
System: NetBSD 5.99.8 i386
>Description:
I have an external USB disk that I use for backups. The system very
frequently panics while I am making a backup, usually with a message
like this:
/mnt: bad dir ino 16170501 at offset 0: mangled entry
panic: bad dir
The file system is ffs+wapbl on cgd. The kernel includes the recent
change to make cgd pass the DIOCCACHESYNC ioctl through to the
underlying disk (see PR 41016).
Backups are made using rsync, but I can replicate the panics simply
using find(1) to read the file system.
The disk and its parents are attached as follows:
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x27cc (rev. 0x01)
ehci0: interrupting at ioapic0 pin 20
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2 uhci3
usb4 at ehci0: USB revision 2.0
uhub2 at usb4: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 8 ports with 8 removable, self powered
umass0 at uhub2 port 6 configuration 1 interface 0
umass0: Western Digital External HDD, rev 2.00/2.06, addr 5
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: <WD, 3200JB External, 0107> disk fixed
sd0: fabricating a geometry
sd0: 298 GB, 305245 cyl, 64 head, 32 sec, 512 bytes/sect x 625142448 sectors
sd0: fabricating a geometry
Most of the disk space is allocated to the sd0e partition, which is
configured as a cgd device (cgd2 in the following description).
The whole of the cgd2 device is allocated to the cgd2a file system,
which is formatted as ffs+wapbl.
>How-To-Repeat:
Ensure that the file system is clean:
$ fsck -f -y -P /dev/rcgd2a
[fsck fixes several problems from a previous crash]
$ fsck -f -y -P /dev/rcgd2a
[no problems]
Verify that mounting without wapbl does not cause problems:
$ mount -o nolog /dev/cgd2a /mnt
$ find /mnt -type d -print | tail
[no problems]
$ umount /mnt
$ fsck -f -y -P /dev/rcgd2a
[no problems]
Verify that mounting with wapbl + noatime does not cause problems:
$ mount -o log,noatime /dev/cgd2a /mnt
$ find /mnt -type d -print | tail
[no problems]
$ umount /mnt
$ fsck -f -y -P /dev/rcgd2a
[no problems]
Verify that mounting with wapbl (and atime updates enabled) causes a crash:
$ mount -o log /dev/cgd2a /mnt
$ find /mnt -type d -print | tail
/mnt: bad dir ino 16170501 at offset 0: mangled entry
panic: bad dir
[...]
Stopped in pid 14915.1 (find) [...]
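The three test cases above differ only in the mount options, so the sequence can be sketched as a loop. This is a dry run that only prints the commands; the device and mount point names are taken from the report, and `emit_test` is a hypothetical helper, not an existing tool. In the final case (`log` alone) the report shows the panic occurring during find(1), so the umount and fsck steps would not be reached.

```sh
#!/bin/sh
# Dry run: print the command sequence for each mount-option set.
# DEV, RDEV and MNT come from the report; emit_test is a hypothetical helper.
DEV=/dev/cgd2a
RDEV=/dev/rcgd2a
MNT=/mnt

emit_test() {
    opts=$1
    echo "fsck -f -y -P $RDEV"               # start from a clean file system
    echo "mount -o $opts $DEV $MNT"
    echo "find $MNT -type d -print | tail"   # read-only walk of all directories
    echo "umount $MNT"
    echo "fsck -f -y -P $RDEV"               # check for damage afterwards
}

for opts in nolog log,noatime log; do
    echo "# options: $opts"
    emit_test "$opts"
done
```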
Reboot and examine the crash dump:
$ crash -M netbsd.203.core -N netbsd.203
crash> bt
[...]
panic(c0ad6b07,ce09d0f8,f6be05,0,0,c0a7f762,200,d771967c,0,dbcaf000) at 0xc06f25fa
ufs_dirbad(dadc3e84,0,c0a7f762,0,cca07aa8,0,0,0,0,c4335218) at 0xc076dc1a
ufs_lookup(cca07ad8,1,cca07adc,c07e372d,d771967c,c0a51140,d771967c,cca07c14,cca07c28,d771967c) at 0xc076e60b
VOP_LOOKUP(d771967c,cca07c14,cca07c28,c07f4890,cca07b1c,1000,1,0,20,0) at 0xc07f56ec
lookup(cca07c00,20002,400,cca07c1c,1,c3a88000,cca07c1c,c0771adf,c0bab0a0,c0b3f5c0) at 0xc07d850c
namei(cca07c00,e00,cca07bfc,c07e372d,d771967c,ce09d000,cca07c1c,bb916090,0,0) at 0xc07d8c0d
do_sys_stat(bb916090,0,cca07c68,cca07c98,0,0,0,ceb57760,cca07c90,1) at 0xc07dff47
sys___lstat50(cf027000,cca07d00,cca07d28,bb916090,bb9160ac,bfbfeb98,bbb2a66d,1,bb268c80,804ee6c) at 0xc07dffac
syscall(cca07d48,bb9200b3,bb9000ab,bb90001f,bfbf001f,bb9160ac,bb916040,bfbfebf8,bbbc21dc,bb916040) at 0xc0711e7d
crash> quit
Examine the file system:
$ fsdb -f /dev/rcgd2a
fsdb (inum: 2)> inode 16170501
current inode: directory
I=16170501 MODE=40755 SIZE=512
MTIME=Mar 30 23:23:50 2005 [0 nsec]
CTIME=Feb 27 15:21:12 2009 [658332227 nsec]
ATIME=Mar 15 21:09:23 2009 [866735678 nsec]
OWNER=apb GRP=apb LINKCNT=2 FLAGS=0x0 BLKCNT=0x4 GEN=0x72b4af9f
fsdb (inum: 16170501)> ls
fsdb (inum: 16170501)> blks
I=16170501 4 blocks
Direct blocks:
0: 65573431
fsdb (inum: 16170501)> quit
*** FILE SYSTEM MARKED DIRTY
*** BE SURE TO RUN FSCK TO CLEAN UP ANY DAMAGE
*** IF IT WAS MOUNTED, RE-MOUNT WITH -u -o reload
$ dd if=/dev/rcgd2a bs=512 skip=65573431 count=1 | hexdump -C
1+0 records in
1+0 records out
512 bytes transferred in 0.010 secs (51200 bytes/sec)
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000200
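A note on units for the dd(1) read above: FFS block pointers are normally expressed in fragments rather than 512-byte sectors. If the address printed by fsdb's "blks" is a fragment address (an assumption, not confirmed in this report), then the sector to read is the address shifted left by the file system's fsbtodb value, which dumpfs reports as 2 for this file system. A minimal sketch of that conversion follows; `frag_to_sector` is a hypothetical helper.

```sh
#!/bin/sh
# Assumption: fsdb "blks" prints fragment addresses, and fsbtodb (2 here,
# per dumpfs) converts fragments to 512-byte sectors by a left shift.
# frag_to_sector is a hypothetical helper, not an existing tool.
frag_to_sector() {
    echo $(( $1 << $2 ))
}

frag=65573431     # direct block 0 of inode 16170501, from fsdb
fsbtodb=2         # from the dumpfs output for this file system

sector=$(frag_to_sector "$frag" "$fsbtodb")

# Print (do not run) the dd command that would read that sector.
echo "dd if=/dev/rcgd2a bs=512 skip=$sector count=1 | hexdump -C"
```

Under that assumption the directory data would be at sector 262293724, not 65573431, so the all-zero sector shown above may not be the directory block itself.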
$ tunefs -N /dev/rcgd2a
tunefs: tuning /dev/rcgd2a
tunefs: current settings of /dev/rcgd2a
maximum contiguous block count 4
maximum blocks per file in a cylinder group 4096
minimum percentage of free space 5%
optimization preference: time
average file size: 16384
expected number of files per directory: 64
journal log file location: in filesystem
journal log file size: 64MB (67108864 bytes)
journal log flags:
tunefs: no changes made
$ dumpfs -s /dev/rcgd2a
file system: /dev/rcgd2a
endian little-endian
magic 11954 (UFS1) time Sun Mar 15 22:40:18 2009
superblock location 8192 id [ 4960ee56 5d043c18 ]
cylgrp dynamic inodes 4.4BSD sblock FFSv2 fslevel 4
nbfree 11412090 ndir 642379 nifree 30237094 nffree 947460
ncg 1391 size 131285595 blocks 129238024
bsize 16384 shift 14 mask 0xffffc000
fsize 2048 shift 11 mask 0xfffff800
frag 8 shift 3 fsbtodb 2
bpg 11798 fpg 94384 ipg 23296
minfree 5% optim time maxcontig 4 maxbpg 4096
symlinklen 60 contigsumsize 4
maxfilesize 0x000400400402ffff
nindir 4096 inopb 128
avgfilesize 16384 avgfpdir 64
sblkno 8 cblkno 16 iblkno 24 dblkno 1480
sbsize 2048 cgsize 16384
csaddr 1480 cssize 22528
cgrotor 0 fmod 0 ronly 0 clean 0x00
wapbl version 0x1 location 2 flags 0x0
wapbl loc0 262179520 loc1 131072 loc2 512 loc3 3
flags none
fsmnt /mnt
volname swuid 0
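As a sanity check, the tunefs and dumpfs figures above can be compared with simple arithmetic. The sketch below assumes that wapbl loc1 is the journal length in blocks and loc2 is the journal block size in bytes; that reading is an assumption, although it is consistent with the 64MB journal size printed by tunefs.

```sh
#!/bin/sh
# Values copied from the dumpfs and tunefs output in this report.
fsize=2048; frag=8; bsize=16384; fsbtodb=2
loc1=131072     # assumed: journal length in blocks
loc2=512        # assumed: journal block size in bytes

journal_bytes=$(( loc1 * loc2 ))
block_bytes=$(( fsize * frag ))
sectors_per_frag=$(( 1 << fsbtodb ))

echo "journal bytes: $journal_bytes"
echo "block bytes: $block_bytes"
echo "sectors per fragment: $sectors_per_frag"

[ "$journal_bytes" -eq 67108864 ] && echo "journal size matches tunefs (64MB)"
[ "$block_bytes" -eq "$bsize" ] && echo "bsize = fsize * frag"
[ $(( sectors_per_frag * 512 )) -eq "$fsize" ] && echo "fsbtodb is consistent with fsize"
```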
>Fix:
Unknown. I will keep the crash dump for some weeks or months, and I
will keep the file system unmodified for a few days in case anybody
needs information from them.