Subject: File system corruption
To: None <current-users@netbsd.org>
From: Tim Preston <tim@flibble.org>
List: current-users
Date: 03/28/2001 16:13:11
I've been seeing file system corruption on a regular basis since I built
my -current playbox using 1.5S a couple of months ago.

This is on a RAID1 file system running off IDE disks, with softdep on.

Interestingly, I only see this problem on one of the file systems, and
it's the one I ran fsirand on, as it is likely to be NFS exported at some
point. However, it contains all my source and obj trees, so it's by far
the most heavily used filesystem on the box, and this is likely
coincidence...

I'm not seeing any disk errors reported by the kernel. Userland is from
the same date as the kernel.

Anyone got any idea how I can start working out what is causing this?

Some (hopefully relevant) details:

This is the sort of thing I'm seeing. Note that the filesystem is
currently marked as clean, so this isn't being caused by crashes. It's
unmounted because my CVS update just failed because of the state of the
disk, and I'm about to fix the damn thing (again).


=============================================================

root@katrina:~# fsck -fn /usr/export
** /dev/rraid0h (NO WRITE)
** File system is already clean
** Last Mounted on /usr/export
** Phase 1 - Check Blocks and Sizes
PARTIALLY ALLOCATED INODE I=220
CLEAR? no

PARTIALLY ALLOCATED INODE I=119396
CLEAR? no

UNKNOWN FILE TYPE I=197668
CLEAR? no

PARTIALLY ALLOCATED INODE I=205359
CLEAR? no

PARTIALLY ALLOCATED INODE I=205360
CLEAR? no

PARTIALLY ALLOCATED INODE I=396996
CLEAR? no

INCORRECT BLOCK COUNT I=453639 (0 should be 8)
CORRECT? no

PARTIALLY ALLOCATED INODE I=703224
CLEAR? no

PARTIALLY ALLOCATED INODE I=706884
CLEAR? no

INCORRECT BLOCK COUNT I=735029 (48 should be 72)
CORRECT? no

INCORRECT BLOCK COUNT I=735031 (0 should be 48)
CORRECT? no

PARTIALLY ALLOCATED INODE I=852108
CLEAR? no

PARTIALLY ALLOCATED INODE I=1074852
CLEAR? no

PARTIALLY ALLOCATED INODE I=1411156
CLEAR? no

PARTIALLY ALLOCATED INODE I=1411244
CLEAR? no

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
BAD/DUP FILE I=197668  OWNER=root MODE=11371
SIZE=0 MTIME=Jan  1 01:00 1970 
CLEAR? no

LINK COUNT DIR I=617230  OWNER=root MODE=40755
SIZE=512 MTIME=Mar 28 15:28 2001  COUNT 4 SHOULD BE 2
ADJUST? no

UNREF FILE  I=617489  OWNER=root MODE=100644
SIZE=44 MTIME=Feb 16 02:23 2001 
RECONNECT? no


CLEAR? no

UNREF FILE  I=617490  OWNER=root MODE=100644
SIZE=41 MTIME=Feb 16 02:23 2001 
RECONNECT? no


CLEAR? no

UNREF FILE  I=617491  OWNER=root MODE=100644
SIZE=221 MTIME=Feb 16 02:23 2001 
RECONNECT? no


CLEAR? no

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no

SUMMARY INFORMATION BAD
SALVAGE? no

BLK(S) MISSING IN BIT MAPS
SALVAGE? no

285912 files, 794473 used, 2235846 free (80254 frags, 269449 blocks, 2.6% fragmentation)
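
To compare one fsck run against the next, it may help to tally the
complaints by type rather than eyeball them. What follows is only a rough
sketch in Python, not part of fsck: the regular expression and the
tally() helper are my own assumptions about the message layout, and
SAMPLE holds just a handful of lines copied from the run above.

```python
import re
from collections import Counter

# A few lines copied from the fsck output above (a subset, for illustration).
SAMPLE = """\
PARTIALLY ALLOCATED INODE I=220
CLEAR? no

UNKNOWN FILE TYPE I=197668
CLEAR? no

INCORRECT BLOCK COUNT I=453639 (0 should be 8)
CORRECT? no

INCORRECT BLOCK COUNT I=735029 (48 should be 72)
CORRECT? no

UNREF FILE  I=617489  OWNER=root MODE=100644
SIZE=44 MTIME=Feb 16 02:23 2001
RECONNECT? no
"""

# Assumed layout: an upper-case message, whitespace, then "I=<inode number>".
PROBLEM = re.compile(r"^([A-Z][A-Z/ ]*?)\s+I=(\d+)")


def tally(text):
    """Return (count per message type, inode numbers per message type)."""
    counts = Counter()
    inodes = {}
    for line in text.splitlines():
        m = PROBLEM.match(line)
        if m:
            kind, ino = m.group(1), int(m.group(2))
            counts[kind] += 1
            inodes.setdefault(kind, []).append(ino)
    return counts, inodes


if __name__ == "__main__":
    counts, inodes = tally(SAMPLE)
    for kind, n in counts.most_common():
        print(n, kind, inodes[kind])
```

Feeding it the saved output of successive runs would show whether the
same inodes keep turning up or the damage moves around.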

-------------------------------------------------------------

root@katrina:~# dmesg
NetBSD 1.5T (KATRINA.4MB) #11: Mon Mar 26 14:58:50 BST 2001 tim@katrina.flibble.org:/usr/export/src/sys/arch/i386/compile/KATRINA.4MB
cpu0: AMD Athlon Model 4 (Thunderbird) (686-class), 900.10 MHz
cpu0: features 183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
cpu0: features 183f9ff<PGE,MCA,CMOV,FGPAT,PSE36,MMX,FXSR>
total memory = 255 MB
avail memory = 228 MB
using 3296 buffers containing 13184 KB of memory
BIOS32 rev. 0 found at 0xfb400
PCI BIOS rev. 2.1 found at 0xfb430
PCI IRQ Routing Table rev. 1.0 found at 0xfddd0, size 176 bytes (9 entries)
PCI Interrupt Router at 000:07:0 (VIA Technologies VT82C596A (Apollo Pro) PCI-ISA Bridge)
PCI Exclusive IRQs: 5 9 10 11
mainbus0 (root)
pnpbios0 at mainbus0: nodes 15, max len 98
pckbc0 at pnpbios0 index 4 (PNP0303): kbd port
pckbc1 at pnpbios0 index 10 (PNP0F13): aux port
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
fdc0 at pnpbios0 index 12 (PNP0700)
fdc0: io 3f0-3f5 3f7, irq 6, dma 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
com0 at pnpbios0 index 13 (PNP0501)
com0: io 3f8-3ff, irq 4
com0: ns16550a, working fifo
com1 at pnpbios0 index 14 (PNP0501)
com1: io 2f8-2ff, irq 3
com1: ns16550a, working fifo
lpt0 at pnpbios0 index 16 (PNP0401)
lpt0: io 378-37f 778-77f, irq 7, dma 3
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: VIA Technologies VT8363 KT133 System Controller (rev. 0x02)
ppb0 at pci0 dev 1 function 0: VIA Technologies VT8363 KT133 PCI to AGP Bridge (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga1 at pci1 dev 0 function 0: Nvidia Corporation GeForce2 GTS (rev.  0xa4)
uhci0: interrupting at irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 7 function 3: VIA Technologies VT83C572 USB Controller (rev. 0x10)
uhci1: interrupting at irq 10
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA Technologie UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
viapm0 at pci0 dev 7 function 4
fxp0 at pci0 dev 8 function 0: Intel i82557 Ethernet, rev 2
fxp0: interrupting at irq 10
fxp0: Ethernet address 00:a0:c9:6f:e1:f3, 10/100 Mb/s
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
Creative Labs SBLive! EMU 10000 (audio multimedia, revision 0x08) at pci0 dev 11 function 0 not configured
Creative Labs PCI Gameport Joystick (miscellaneous input, revision 0x08) at pci0 dev 11 function 1 not configured
bktr0 at pci0 dev 15 function 0
bktr0: interrupting at irq 9
bktr0: Hauppauge Model 44354 A242
bktr0: Detected a MSP3415D-B3 at 0x80
bktr0: Hauppauge WinCast/TV, Philips FR1216 PAL FM tuner, msp3400c stereo, remote control.
Brooktree product 0x0878 (miscellaneous multimedia, revision 0x02) at pci0 dev 15 function 1 not configured
pciide1 at pci0 dev 19 function 0: Triones/Highpoint HPT366/370 IDE Controller (rev. 0x03)
pciide1: bus-master DMA support present
pciide1: primary channel wired to native-PCI mode
pciide1: using irq 11 for native-PCI interrupt
wd2 at pciide1 channel 0 drive 0: <IBM-DTLA-305020>
wd2: drive supports 16-sector PIO transfers, LBA addressing
wd2: 19623 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 40188960 sectors
wd2: 32-bit data port
wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd3 at pciide1 channel 0 drive 1: <IBM-DTLA-305020>
wd3: drive supports 16-sector PIO transfers, LBA addressing
wd3: 19623 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 40188960 sectors
wd3: 32-bit data port
wd3: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd2(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd3(pciide1:0:1): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
pciide1: secondary channel wired to native-PCI mode
wd4 at pciide1 channel 1 drive 0: <IBM-DTLA-307030>
wd4: drive supports 16-sector PIO transfers, LBA addressing
wd4: 29314 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 60036480 sectors
wd4: 32-bit data port
wd4: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd5 at pciide1 channel 1 drive 1: <IBM-DTLA-307030>
wd5: drive supports 16-sector PIO transfers, LBA addressing
wd5: 29314 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 60036480 sectors
wd5: 32-bit data port
wd5: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd4(pciide1:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd5(pciide1:1:1): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
isa0 at pcib0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0-0xff: using exception 16
viaenv0 at viapm0
apm0 at mainbus0: Power Management spec V1.2 (slowidle)
apm0: A/C state: on
apm0: battery charge state: no battery
biomask ef65 netmask ef65 ttymask ffe7
Kernelized RAIDframe activated
wd0: no disk label
wd2: no disk label
wd3: no disk label
IPsec: Initialized Security Association Processing.
RAID autoconfigure
Configuring raid0:
RAIDFRAME: protectedSectors is 64
RAIDFRAME: Configure (RAID Level 1): total number of sectors is 59532416 (29068 MB)
RAIDFRAME(RAID Level 1): Using 6 floating recon bufs with no head sep limit
uhid0 at uhub0 port 1 configuration 1 interface 0
uhid0: Microsoft SideWinder Game Voice, rev 1.10/1.01, addr 2, iclass 3/0
ugen0 at uhub0 port 2
ugen0: OmniVision OV511+ Camera, rev 1.00/1.00, addr 3
wd0: no disk label
wd2: no disk label
wd3: no disk label
boot device: raid0
root on raid0a dumps on raid0b
root file system type: ffs
raid0: Device already configured!

-------------------------------------------------------------

root@katrina:~# disklabel wd4
# /dev/rwd4d:
type: ESDI
disk: IBM-DTLA-307030 
label: 
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 60036480
rpm: 7200
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0 

8 partitions:
#        size   offset     fstype   [fsize bsize cpg/sgs]
  a:   503937       63     4.2BSD     1024  8192    16   # (Cyl.    0*- 499)
  c: 60036417       63     unused        0     0         # (Cyl.    0*- 59559)
  d: 60036480        0     unused        0     0         # (Cyl.    0 - 59559)
  e: 59532480   504000       RAID                        # (Cyl.  500 - 59559)

-------------------------------------------------------------

root@katrina:~# raidctl -s raid0
Components:
           /dev/wd4e: optimal
           /dev/wd5e: optimal
No spares.
Component label for /dev/wd4e:
   Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 2001022201, Mod Counter: 1573
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 59532416
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
Component label for /dev/wd5e:
   Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
   Version: 2, Serial Number: 2001022201, Mod Counter: 1573
   Clean: No, Status: 0
   sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
   Queue size: 100, blocksize: 512, numBlocks: 59532416
   RAID Level: 1
   Autoconfig: Yes
   Root partition: Yes
   Last configured as: raid0
Parity status: clean
Reconstruction is 100% complete.
Parity Re-write is 100% complete.
Copyback is 100% complete.

-------------------------------------------------------------

root@katrina:~# disklabel raid0
# /dev/rraid0d:
type: RAID
disk: raid
label: 
flags:
bytes/sector: 512
sectors/track: 128
tracks/cylinder: 1
sectors/cylinder: 128
cylinders: 465097
total sectors: 59532416
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0 

8 partitions:
#        size   offset     fstype   [fsize bsize cpg/sgs]
  a:   528640        0     4.2BSD     4096 32768    32   # (Cyl.    0 - 4129)
  b:  4198912   528640       swap                        # (Cyl. 4130 - 36933)
  d: 59532416        0     unused        0     0         # (Cyl.    0 - 465096)
  e:  4198912  4727552     4.2BSD     4096 32768    32   # (Cyl. 36934 - 69737)
  f:  4198912  8926464     4.2BSD     4096 32768    32   # (Cyl. 69738 - 102541)
  g: 20971904 13125376     4.2BSD     4096 32768    32   # (Cyl. 102542 - 266384)
  h: 25435136 34097280     4.2BSD     4096 32768    32   # (Cyl. 266385 - 465096)
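
One thing that can be ruled out from the label itself is overlapping
partitions. The sketch below is only an illustration in Python:
find_problems() is a hypothetical helper, not a system tool, and the
sizes and offsets are copied by hand from the label above ('d', the
whole-disk partition, is deliberately left out).

```python
# (name, size, offset) in sectors, copied from the raid0 disklabel above.
# 'd' covers the whole device by convention, so it is left out on purpose.
PARTS = [
    ("a", 528640, 0),
    ("b", 4198912, 528640),
    ("e", 4198912, 4727552),
    ("f", 4198912, 8926464),
    ("g", 20971904, 13125376),
    ("h", 25435136, 34097280),
]
TOTAL_SECTORS = 59532416


def find_problems(parts, total):
    """Return a list of overlap / out-of-range complaints (empty if none)."""
    problems = []
    prev_name, prev_end = None, 0
    # Walk the partitions in order of starting offset.
    for name, size, offset in sorted(parts, key=lambda p: p[2]):
        if offset < prev_end:
            problems.append(f"{name} overlaps {prev_name}")
        end = offset + size
        if end > total:
            problems.append(f"{name} runs past the end of the device")
        # Remember whichever partition reaches furthest so far.
        if end > prev_end:
            prev_name, prev_end = name, end
    return problems


if __name__ == "__main__":
    issues = find_problems(PARTS, TOTAL_SECTORS)
    print(issues if issues else "no overlaps found")
```

For this label the partitions are contiguous and h ends exactly at
sector 59532416, so an overlapping label does not look like the cause.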

-------------------------------------------------------------

root@katrina:~# mount
/dev/raid0a on / type ffs (local, soft dependencies)
/dev/raid0f on /var type ffs (local, soft dependencies)
/dev/raid0e on /usr type ffs (local, soft dependencies)
/dev/raid0g on /home type ffs (local, soft dependencies)
mfs:133 on /tmp type mfs (asynchronous, local)
kernfs on /kern type kernfs (local)
procfs on /proc type procfs (local)
/dev/raid0h on /usr/export type ffs (local, soft dependencies)

=============================================================



-- 
I am become Reginald, purveyor of fitted kitchens...