Subject: Re: kernel panic on 1.6_STABLE - solved in FreeBSD?
To: Frank van der Linden <fvdl@wasabisystems.com>
From: Pavel Cahyna <pcah8322@artax.karlin.mff.cuni.cz>
List: tech-kern
Date: 03/22/2003 15:14:15
> On Sat, Mar 01, 2003 at 12:37:00PM +0100, Pavel Cahyna wrote:
> > I can provide a traceback and additional information if necessary, but I
> > see this bug is probably described in FreeBSD's PR 15063 and fixed a
> > long time ago. Could someone knowing ffs import this fix, please?
> > (details here: http://www.freebsd.org/cgi/query-pr.cgi?pr=15063 )
> 
> The problem might be in this area, but that change can't be applied.
> In fact, that change was applied a while ago, but then changed
> (see rev 1.29 of ffs_balloc.c).
> 
> Since you are seeing this with a full filesystem, the problem is
> somewhere in there, probably, but this change is not applicable
> as is (anymore).

I have found an easy way to reproduce this panic:

- configure a small vnd and mount it with softdeps.

- fill it almost completely, so that only 13 blocks are left free.

  (verify with dumpfs vnd0a; it should display

  nbfree  13 )

- then run

  dd if=/dev/zero of=mountpoint/foo bs=4096 count=13

  (replace 4096 with the fs block size if it differs)

- it should panic.
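
The last two steps could be scripted roughly as follows. This is only a
sketch: the helper names (nbfree_from_dumpfs, dd_command) are made up for
illustration, and it assumes dumpfs prints the count right after the word
"nbfree", as in the output quoted above. It only prints the dd command
instead of running it, since running it is expected to panic the machine.

```shell
#!/bin/sh
# Hypothetical helpers; not part of any NetBSD tool.

# Read dumpfs output on stdin and print the number that follows the
# first "nbfree" field (assumes the "nbfree  13" layout shown above).
nbfree_from_dumpfs() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "nbfree") { print $(i + 1); exit }
    }'
}

# Print (do not run) the dd command that writes COUNT blocks of BSIZE
# bytes into a file named foo under MOUNTPOINT.
dd_command() {
    # $1 = mount point, $2 = fs block size in bytes, $3 = block count
    echo "dd if=/dev/zero of=$1/foo bs=$2 count=$3"
}

# Example use on the test machine:
#   n=$(dumpfs vnd0a | nbfree_from_dumpfs)
#   dd_command /mnt/pokus 4096 "$n"
```

Printing the command rather than executing it keeps the helper safe to
try on a machine that should stay up.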

I have a dump of this panic, with debugging symbols, so I can provide
more information on request.

Here is the traceback:

Script started on Sat Mar 22 17:51:24 2003
root@k2:/root/crash# gdb netbsd.gdb
GNU gdb 5.0nb1
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386--netbsdelf"...
(gdb) bt
No stack.
(gdb) target kcore netbsd.3.core
panic: %s: indirect pointer #%d mismatch %d != %d
#0  0x1 in ?? ()
(gdb) bt
#0  0x1 in ?? ()
#1  0xc030dc33 in cpu_reboot (howto=260, bootstr=0x0)
    at /usr/src/sys/arch/i386/i386/machdep.c:2236
#2  0xc0220a6d in db_reboot_cmd () at /usr/src/sys/ddb/db_command.c:669
#3  0xc0220748 in db_command (last_cmdp=0xc0444014, cmd_table=0xc03df9ec)
    at /usr/src/sys/ddb/db_command.c:456
#4  0xc0220347 in db_command_loop () at /usr/src/sys/ddb/db_command.c:246
#5  0xc0223e20 in db_trap (type=1, code=0) at /usr/src/sys/ddb/db_trap.c:92
#6  0xc030aa5c in kdb_trap (type=1, code=0, regs=0xc5e71960)
    at /usr/src/sys/arch/i386/i386/db_interface.c:129
#7  0xc031682b in trap (frame={tf_gs = 16, tf_fs = 16, tf_es = -974716912,
      tf_ds = -1067319280, tf_edi = 256, tf_esi = -1069732512,
      tf_ebp = -974710368, tf_ebx = -974710324, tf_edx = -1069650306,
      tf_ecx = 23680, tf_eax = 3382, tf_trapno = 1, tf_err = 0,
      tf_eip = -1070552276, tf_cs = -1069678584, tf_eflags = 514,
      tf_esp = -974710336, tf_ss = -1071368267, tf_vm86_es = -974871616,
      tf_vm86_ds = 2040, tf_vm86_fs = -1019915560, tf_vm86_gs = -1071513584})
    at /usr/src/sys/arch/i386/i386/trap.c:220
#8  0xc0100e81 in calltrap ()
#9  0xc02437b5 in panic (
    fmt=0xc03d2d60 "%s: indirect pointer #%d mismatch %d != %d")
    at /usr/src/sys/kern/subr_prf.c:237
#10 0xc01fcbd7 in initiate_write_inodeblock (inodedep=0xc5d7b2f4,
    bp=0xc22ba668) at /usr/src/sys/ufs/ffs/ffs_softdep.c:3444
#11 0xc01fc7c2 in softdep_disk_io_initiation (bp=0xc22ba668)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:3270
#12 0xc026bf80 in spec_strategy (v=0xc5e71aa8)
    at /usr/src/sys/miscfs/specfs/spec_vnops.c:519
#13 0xc0266b8f in VOP_STRATEGY (bp=0xc22ba668)
    at /usr/src/sys/kern/vnode_if.c:102
#14 0xc025b8a3 in bwrite (bp=0xc22ba668) at /usr/src/sys/kern/vfs_bio.c:357
#15 0xc01f7527 in ffs_update (v=0xc5e71b5c)
    at /usr/src/sys/ufs/ffs/ffs_inode.c:148
#16 0xc0267667 in VOP_UPDATE (vp=0xc5e0320c, access=0x0, modify=0x0, flags=1)
    at /usr/src/sys/kern/vnode_if.c:1498
#17 0xc01f6c3b in ffs_balloc (v=0xc5e71c90)
    at /usr/src/sys/ufs/ffs/ffs_balloc.c:470
#18 0xc0267563 in VOP_BALLOC (vp=0xc5e0320c, startoffset=49152, size=4096,
    cred=0xc06c4500, flags=0, bpp=0x0) at /usr/src/sys/kern/vnode_if.c:1370
#19 0xc01f6e0b in ffs_gop_alloc (vp=0xc5e0320c, off=49152, len=4096, flags=0,
    cred=0xc06c4500) at /usr/src/sys/ufs/ffs/ffs_balloc.c:530
#20 0xc0202635 in ffs_write (v=0xc5e71e4c)
    at /usr/src/sys/ufs/ufs/ufs_readwrite.c:338
#21 0xc0266e3b in VOP_WRITE (vp=0xc5e0320c, uio=0xc5e71ee0, ioflag=1,
    cred=0xc06c4500) at /usr/src/sys/kern/vnode_if.c:458
#22 0xc0266793 in vn_write (fp=0xc5cda680, offset=0xc5cda6a8, uio=0xc5e71ee0,
    cred=0xc06c4500, flags=1) at /usr/src/sys/kern/vfs_vnops.c:434
#23 0xc02475bb in dofilewrite (p=0xc5e661d4, fd=4, fp=0xc5cda680,
    buf=0x8062000, nbyte=<error type>, offset=0xc5cda6a8, flags=1,
    retval=0xc5e71f78) at /usr/src/sys/kern/sys_generic.c:346
#24 0xc0247517 in sys_write (p=0xc5e661d4, v=0xc5e71f80, retval=0xc5e71f78)
    at /usr/src/sys/kern/sys_generic.c:303
#25 0xc0316383 in syscall_plain (frame={tf_gs = 31, tf_fs = 31, tf_es = 31,
      tf_ds = 31, tf_edi = -1077946608, tf_esi = 134619136,
      tf_ebp = -1077946592, tf_ebx = 4096, tf_edx = 0, tf_ecx = 0, tf_eax = 4,
      tf_trapno = 3, tf_err = 2, tf_eip = 134580291, tf_cs = 23,
      tf_eflags = 663, tf_esp = -1077946668, tf_ss = 31, tf_vm86_es = 0,
      tf_vm86_ds = 0, tf_vm86_fs = 0, tf_vm86_gs = 0})
    at /usr/src/sys/arch/i386/i386/syscall.c:140
#26 0xc0100f4e in syscall1 ()
can not access 0xbfbfd720, invalid translation (invalid PDE)
can not access 0xbfbfd720, invalid translation (invalid PDE)
Cannot access memory at address 0xbfbfd720
(gdb) msgbuf
msgbufp 0xc22a5000: bufx 3556 bufr 3310 bufs 8176
Dumping 0xc22a5df4 length 4620

Dumping 0xc22a5010 length 3556
NetBSD 1.6_STABLE (EISA-DEBUG: ep bez resetu, DEBUG, DIAGNOSTIC, odesilani multicastu) #4: Mon Feb 10 22:06:29 CET 2003
    build@omega:/obj/kernobjdir/i386/EISA-DEBUG
cpu0: Intel 486DX (486-class)
total memory = 50044 KB
avail memory = 42148 KB
using 651 buffers containing 2604 KB of memory
mainbus0 (root)
eisa0 at mainbus0
ahb0 at eisa0 slot 1: Adaptec AHA-1740A SCSI
ahb0: interrupting at irq 10
scsibus0 at ahb0: 8 targets, 8 luns per target
unknown 3Com device TCM6790 at eisa0 slot 3 not configured
ahb1 at eisa0 slot 8: Adaptec AHA-1740A SCSI
ahb1: interrupting at irq 10
scsibus1 at ahb1: 8 targets, 8 luns per target
ep0 at eisa0 slot 10: 3Com 3C579 Ethernet
ep0: interrupting at irq 15
ep0: address 00:20:af:2c:2f:68, 8KB byte-wide FIFO, 5:3 Rx:Tx split
ep0: 10base5, 10base2 (default 10base2)
isa0 at mainbus0
tr0 at isa0 port 0xa20-0xa23 iomem 0xd8000-0xdbfff irq 7
tr0: address 00:60:8c:23:a7:50 ring speed 16 Mbps
com0 at isa0 port 0x3f8-0x3ff irq 4: ns8250 or ns16450, no fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns8250 or ns16450, no fifo
pckbc0 at isa0 port 0x60-0x64
pckbd: error setting scanset 2
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pmsprobe: reset error 5
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 channel 0 drive 0: <QUANTUM FIREBALL540A>
wd0: drive supports 8-sector PIO transfers, LBA addressing
wd0: 519 MB, 1056 cyl, 16 head, 63 sec, 512 bytes/sect x 1064448 sectors
wd0: drive supports PIO mode 4, DMA mode 2
vga0 at isa0 port 0x3b0-0x3df iomem 0xa0000-0xbffff
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
lptprobe: mask ff data 55 failed
lpt1 at isa0 port 0x278-0x27b irq 5
lptprobe: mask ff data 55 failed
seaprobe: board type unknown at address 0xc0523000
pcppi0 at isa0 port 0x61
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
isapnp0: no ISA Plug 'n Play devices found
biomask 7f45 netmask ffc5 ttymask ffe7
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <CONNER, CFP1060S 1.05GB, 203C> SCSI2 0/direct fixed
sd0: 1013 MB, 2756 cyl, 8 head, 94 sec, 512 bytes/sect x 2074880 sectors
scsibus1: waiting 2 seconds for devices to settle...
sd1 at scsibus1 target 6 lun 0: <CONNER, CFP1060S 1.05GB, 203C> SCSI2 0/direct fixed
sd1: 1013 MB, 2756 cyl, 8 head, 94 sec, 512 bytes/sect x 2074880 sectors
raidattach: Asked for 8 units
Kernelized RAIDframe activated
Searching for raid components...
wd0: no disk label
IPsec: Initialized Security Association Processing.
wd0: no disk label
findroot: can't open dev wd0a (6)
boot device: sd0
root on sd0a dumps on sd0b
mountroot: trying coda...
mountroot: trying msdos...
mountroot: trying cd9660...
mountroot: trying ntfs...
mountroot: trying nfs...
mountroot: trying lfs...
mountroot: trying ext2fs...
mountroot: trying ffs...
root file system type: ffs
init: copying out path `/sbin/init' 11
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
vnd0: no disk label
vnd0: no disk label
<3>uid 0 comm dd on /mnt/pokus: file system full
panic: softdep_write_inodeblock: indirect pointer #0 mismatch 0 != 2040

dumping to dev 4,1 offset 169991
dump 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1

(gdb) root@k2:/root/crash#
Script done on Sat Mar 22 17:53:57 2003
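
For reference, frame #18 of the traceback shows the failing allocation at
startoffset=49152. The arithmetic below is a quick sketch, assuming the
fs block size is 4096 (as in the dd example) and that the inode holds 12
direct block pointers (NDADDR in FFS); the function name classify_offset
is hypothetical. Under those assumptions the write is to logical block 12,
the first block that needs the single indirect block, which seems
consistent with the "indirect pointer #0" in the panic message.

```shell
#!/bin/sh
NDADDR=12   # number of direct block pointers in an FFS inode

# Print the logical block number for byte OFFSET with block size BSIZE,
# followed by "direct" or "indirect".
classify_offset() {
    lbn=$(( $1 / $2 ))
    if [ "$lbn" -lt "$NDADDR" ]; then
        echo "$lbn direct"
    else
        echo "$lbn indirect"
    fi
}

classify_offset 49152 4096   # prints "12 indirect"
```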

Bye	Pavel