Subject: bin/37432: fsck does not appear to fix the block bit maps
To: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
From: he@NetBSD.org
List: netbsd-bugs
Date: 11/25/2007 15:05:01
>Number:         37432
>Category:       bin
>Synopsis:       fsck does not appear to fix the block bit maps
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Nov 25 15:05:00 +0000 2007
>Originator:     Havard Eidnes
>Release:        NetBSD 4.99.37 of Nov 24 2007
>Organization:
	I Try
>Environment:
System: NetBSD  4.99.37 NetBSD 4.99.37 (GENERIC.MP) #2: Sat Nov 24 17:39:11 CET 2007  he@ctd.urc.uninett.no:/usr/obj/sys/arch/i386/compile/GENERIC.MP i386
Architecture: i386
Machine: i386
>Description:
	Recently this system has crashed a few times with

		panic: ffs_alloccg: map corrupted

	I typically reboot with "reboot 4" from DDB, and let fsck
	sort out the inconsistencies on reboot.  This last time it
	said:

/dev/rsd0a: 891426 files, 5989990 used, 1335633 free (26297 frags, 163667 blocks, 0.4% fragmentation)
/dev/rsd0a: MARKING FILE SYSTEM CLEAN
/dev/rsd1a: INCORRECT BLOCK COUNT I=11252450 (20 should be 0) (CORRECTED)
/dev/rsd1a: INCORRECT BLOCK COUNT I=12139581 (544 should be 416) (CORRECTED)
/dev/rsd1a: UNREF FILE I=11252450  OWNER=he MODE=100644
/dev/rsd1a: SIZE=0 MTIME=Nov 25 03:10 2007  (CLEARED)
/dev/rsd1a: LINK COUNT FILE I=11255593  OWNER=he MODE=100644
/dev/rsd1a: SIZE=9400 MTIME=Nov 25 03:10 2007  COUNT 2 SHOULD BE 1 (ADJUSTED)
/dev/rsd1a: BLK(S) MISSING IN BIT MAPS (SALVAGED)
/dev/rsd1a: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED)
/dev/rsd1a: SUMMARY INFORMATION BAD (SALVAGED)
/dev/rsd1a: 1613673 files, 14139985 used, 56572121 free (714433 frags, 6982211 blocks, 1.0% fragmentation)
/dev/rsd1a: MARKING FILE SYSTEM CLEAN

	The panic message was

start = 2289, len = 9510, fs = /u
offset=3086 3086
cg 521
panic: ffs_alloccg: map corrupted
Stopped in pid 4951.1 (as) at   netbsd:breakpoint+0x1:  ret
db{0}: reboot 4

	so this concerns sd1a, which is the file system mounted on /u.

	However, since this panic has happened a few times now, I decided
	to run "fsck -f" by hand to see if it found any more problems,
	and indeed the "BLK(S) MISSING IN BIT MAPS" error turned up again:

# fsck -f /u
** /dev/rsd1a
** File system is already clean
** Last Mounted on /u
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

1613673 files, 14139985 used, 56572121 free (714433 frags, 6982211 blocks, 1.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****
#

	However, rerunning "fsck -f /u" after this reveals the same error:

# fsck -f /u
** /dev/rsd1a
** File system is already clean
** Last Mounted on /u
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

1613673 files, 14139985 used, 56572121 free (714433 frags, 6982211 blocks, 1.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****

	Re-doing this a third and fourth time just gives the same
	result again.

	So even though it's been instructed to fix the problem, it
	appears unable to do so.
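
	As a side check, the summary line that fsck prints is the same
	before and after each salvage, and its figures are internally
	consistent: the free count equals the free fragments plus the free
	blocks times the number of fragments per block.  The sketch below
	assumes 8 fragments per block; that ratio is not stated anywhere
	in this report, it is only inferred from the figures above.  The
	function name free_frags is made up for illustration.

```shell
#!/bin/sh
# Sketch only: recompute the "free" figure of an fsck summary line
# from its "frags" and "blocks" figures.
#
# free_frags FRAGS BLOCKS [FRAGS_PER_BLOCK]
#   FRAGS_PER_BLOCK defaults to 8 (an assumption, see the text above).
free_frags() {
	echo $(( $1 + $2 * ${3:-8} ))
}
```

	With the sd1a figures, "free_frags 714433 6982211" prints 56572121,
	and with the sd0a figures, "free_frags 26297 163667" prints 1335633;
	both match the free counts reported by fsck.  So the superblock
	summary looks self-consistent, and the repeated complaint seems to
	be confined to the cylinder group bit maps.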

	It should be said that the user-land on this machine is of
	4.99.31 vintage, but fsck does not appear to have changed
	since then.

	I've tried several more runs of "fsck -f -P -y /u" on this
	system.  One of them didn't complain about "BLK(S) MISSING IN
	BIT MAPS", but on a subsequent run the error message and the
	promised fixup returned.
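
	The repeated runs could be scripted roughly as follows.  This is
	a minimal sketch, not what was actually run: the function names
	count_bitmap_errors and repeat_fsck are made up for illustration,
	and it assumes the file system is unmounted and the script is run
	as root.  Only the -f and -y flags used above are passed to fsck.

```shell
#!/bin/sh
# Sketch only: run a forced fsck several times and report, per run,
# how many "BLK(S) MISSING IN BIT MAPS" lines it printed.

# count_bitmap_errors: read fsck output on stdin and print the number
# of lines containing the bitmap complaint (0 if there are none).
count_bitmap_errors() {
	grep -c -F 'BLK(S) MISSING IN BIT MAPS' || true
}

# repeat_fsck FS RUNS: run "fsck -f -y FS" RUNS times in a row.
repeat_fsck() {
	fs=$1
	runs=$2
	i=1
	while [ "$i" -le "$runs" ]; do
		n=$(fsck -f -y "$fs" 2>&1 | count_bitmap_errors)
		echo "run $i: $n bitmap error line(s)"
		i=$((i + 1))
	done
}
```

	For example, "repeat_fsck /u 4" would repeat the check four
	times.  If the salvage worked, every run after the first should
	report zero such lines.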

	The boot messages from this system follow:

booting hd0a:netbsd (howto 0x2)
9286908+424452+366444 [471520+457564]=0xa81228
BIOS CFG: Model-SubM-Rev: fc-01-00, 0x74<EBDA,KBDINT,RTC,IC2>
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 4.99.37 (GENERIC.MP) #2: Sat Nov 24 17:39:11 CET 2007
        he@ctd.urc.uninett.no:/usr/obj/sys/arch/i386/compile/GENERIC.MP
total memory = 2047 MB
rbus: rbus_min_start set to 0x80000000
avail memory = 1998 MB
mainbus0 (root)
ACPI Error (tbxfroot-0775): No valid RSDP was found [20060217]
ACPI Exception (tbxfroot-0531): AE_NOT_FOUND, RSDP structure not found - Flags=8 [20060217]
ACPI Exception (tbxface-0162): AE_NO_ACPI_TABLES, Could not get the RSDP [20060217]
ACPI Exception (tbxface-0211): AE_NO_ACPI_TABLES, Could not load tables [20060217]
ACPI: unable to load tables: AE_NO_ACPI_TABLES
mainbus0: Intel MP Specification (Version 1.4) (         Kings Canyon)
cpu0 at mainbus0 apid 0: (boot processor)
cpu0: Intel Xeon (686-class), 2196.33 MHz, id 0xf24
cpu0: "Intel(R) XEON(TM) CPU 2.20GHz"
cpu1 at mainbus0 apid 6: (application processor)
cpu1: Intel Xeon (686-class), 2196.39 MHz, id 0xf24
cpu1: "Intel(R) XEON(TM) CPU 2.20GHz"
cpu2 at mainbus0 apid 1: (application processor)
cpu2: Intel Xeon (686-class), 2196.43 MHz, id 0xf24
cpu2: "Intel(R) XEON(TM) CPU 2.20GHz"
cpu3 at mainbus0 apid 7: (application processor)
cpu3: Intel Xeon (686-class), 2196.43 MHz, id 0xf24
cpu3: "Intel(R) XEON(TM) CPU 2.20GHz"
mpbios: bus 0 is type PCI   
mpbios: bus 1 is type PCI   
mpbios: bus 2 is type PCI   
mpbios: bus 3 is type PCI   
mpbios: bus 4 is type PCI   
mpbios: bus 5 is type PCI   
mpbios: bus 6 is type PCI   
mpbios: bus 7 is type PCI   
mpbios: bus 8 is type ISA   
ioapic0 at mainbus0 apid 2
ioapic1 at mainbus0 apid 3
ioapic2 at mainbus0 apid 4
ioapic3 at mainbus0 apid 5
ioapic4 at mainbus0 apid 8
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: Intel E7500 MCH Host (rev. 0x02)
Intel E7500 MCH DRAM Controller (undefined subclass 0x00, revision 0x02) at pci0 dev 0 function 1 not configured
ppb0 at pci0 dev 2 function 0: Intel E7500 MCH HI_B vppb 1 (rev. 0x02)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
Intel 82870P2 P64H2 IOxAPIC (interrupt system, interface 0x20, revision 0x03) at pci1 dev 28 function 0 not configured
ppb1 at pci1 dev 29 function 0: Intel 82870P2 P64H2 PCI-PCI Bridge (rev. 0x03)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled
ciss0 at pci2 dev 2 function 0: Compaq Smart Array 5300 V1
ciss0: interrupting at ioapic2 pin 4 (irq 11)
ciss0: 2 LDs, HW rev 0, FW 1.76/1.76
scsibus0 at ciss0: 2 targets, 8 luns per target
Intel 82870P2 P64H2 IOxAPIC (interrupt system, interface 0x20, revision 0x03) at pci1 dev 30 function 0 not configured
ppb2 at pci1 dev 31 function 0: Intel 82870P2 P64H2 PCI-PCI Bridge (rev. 0x03)
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled
ppb3 at pci0 dev 3 function 0: Intel E7500 MCH HI_C vppb 1 (rev. 0x02)
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled
Intel 82870P2 P64H2 IOxAPIC (interrupt system, interface 0x20, revision 0x03) at pci4 dev 28 function 0 not configured
ppb4 at pci4 dev 29 function 0: Intel 82870P2 P64H2 PCI-PCI Bridge (rev. 0x03)
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled
Intel 82870P2 P64H2 IOxAPIC (interrupt system, interface 0x20, revision 0x03) at pci4 dev 30 function 0 not configured
ppb5 at pci4 dev 31 function 0: Intel 82870P2 P64H2 PCI-PCI Bridge (rev. 0x03)
pci6 at ppb5 bus 6
pci6: i/o space, memory space enabled
ahc1 at pci6 dev 2 function 0: Adaptec aic7899 Ultra160 SCSI adapter
ahc1: interrupting at ioapic3 pin 4 (irq 5)
ahc1: aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsibus1 at ahc1: 16 targets, 8 luns per target
ahc2 at pci6 dev 2 function 1: Adaptec aic7899 Ultra160 SCSI adapter
ahc2: interrupting at ioapic3 pin 5 (irq 5)
ahc2: aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
scsibus2 at ahc2: 16 targets, 8 luns per target
uhci0 at pci0 dev 29 function 0: Intel 82801CA USB Controller (rev. 0x02)
uhci0: interrupting at ioapic0 pin 16 (irq 11)
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 29 function 1: Intel 82801CA USB Controller (rev. 0x02)
uhci1: interrupting at ioapic0 pin 19 (irq 10)
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2 at pci0 dev 29 function 2: Intel 82801CA USB Controller (rev. 0x02)
uhci2: interrupting at ioapic0 pin 18 (irq 7)
usb2 at uhci2: USB revision 1.0
uhub2 at usb2
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ppb6 at pci0 dev 30 function 0: Intel 82801BA Hub-PCI Bridge (rev. 0x42)
pci7 at ppb6 bus 7
pci7: i/o space, memory space enabled
vga1 at pci7 dev 1 function 0: ATI Technologies Rage XL (rev. 0x27)
wsdisplay0 at vga1 kbdmux 1
direct rendering for vga1 unsupported
fxp0 at pci7 dev 2 function 0: i82550 Ethernet, rev 13
fxp0: interrupting at ioapic0 pin 17 (irq 3)
fxp0: Ethernet address 00:30:48:12:22:15
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1 at pci7 dev 3 function 0: i82550 Ethernet, rev 13
fxp1: interrupting at ioapic0 pin 18 (irq 7)
fxp1: Ethernet address 00:30:48:12:25:68
inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcib0 at pci0 dev 31 function 0
pcib0: Intel 82801CA LPC Interface (rev. 0x02)
piixide0 at pci0 dev 31 function 1
piixide0: Intel 82801CA IDE Controller (ICH3) (rev. 0x02)
piixide0: primary channel interrupting at ioapic0 pin 14 (irq 14)
atabus0 at piixide0 channel 0
piixide0: secondary channel interrupting at ioapic0 pin 15 (irq 15)
atabus1 at piixide0 channel 1
ichsmb0 at pci0 dev 31 function 3: Intel 82801CA SMBus Controller (rev. 0x02)
pci_intr_map: no mapping for pin B (line=00)
ichsmb0: polling
iic0 at ichsmb0: I2C bus
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
pckbc0 at isa0 port 0x60-0x64
attimer0 at isa0 port 0x40-0x43: AT Timer
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker (CPU-intensive output)
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
pcppi0: attached to attimer0
isapnp0: no ISA Plug 'n Play devices found
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
scsibus2: waiting 2 seconds for devices to settle...
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 0: <MATSHITA CR-177, , 7T0D> cdrom removable
sd0 at scsibus0 target 0 lun 0: <COMPAQ, LOGICAL VOLUME, 1.76> disk fixed
sd0: fabricating a geometry
sd0: 17535 MB, 17535 cyl, 64 head, 32 sec, 512 bytes/sect x 35912160 sectors
sd1 at scsibus0 target 1 lun 0: <COMPAQ, LOGICAL VOLUME, 1.76> disk fixed
sd1: fabricating a geometry
sd1: 137 GB, 140297 cyl, 64 head, 32 sec, 512 bytes/sect x 287329920 sectors
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
Kernelized RAIDframe activated
pad0: Pseudo Audio Device
pad0: outputs: 44100Hz, 16-bit, stereo
audio0 at pad0: half duplex
sd0: fabricating a geometry
sd1: fabricating a geometry
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs


>How-To-Repeat:
	This appears to be somewhat hard to reproduce, as I guess it
	depends on the contents of the file system on my machine.
>Fix:
	Sorry, I have no idea, but will cooperate in debugging.