Subject: port-i386/7512: Data modified on freelist and uvm_fault crash
To: None <gnats-bugs@gnats.netbsd.org>
From: Dave Huang <khym@bga.com>
List: netbsd-bugs
Date: 05/02/1999 22:50:59
>Number:         7512
>Category:       port-i386
>Synopsis:       i386 crashes with data modified on freelist msgs then uvm_fault
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-i386-maintainer (NetBSD/i386 Portmaster)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun May  2 22:50:01 1999
>Last-Modified:
>Originator:     Dave Huang
>Organization:
Name: Dave Huang     |   Mammal, mammal / their names are called /
INet: khym@bga.com   |   they raise a paw / the bat, the cat /
FurryMUCK: Dahan     |   dolphin and dog / koala bear and hog -- TMBG
Dahan: Hani G Y+C 23 Y++ L+++ W- C++ T++ A+ E+ S++ V++ F- Q+++ P+ B+ PA+ PL++
>Release:        NetBSD-1.4_BETA as of April 24
>Environment:
System: NetBSD sloth.metonymy.com 1.4_BETA NetBSD 1.4_BETA (SLOTH) #193: Sun Apr 25 05:13:21 CDT 1999 khym@dahan.metonymy.com:/usr/src.local/sys/arch/i386/compile/SLOTH i386


>Description:
	My 386/33 with 8MB of RAM has crashed twice so far with what seems
to be the same error... I use it as my internet router and NAT machine,
and both times before it crashed, I noticed my internet connection wasn't
coming up. I'd then look at /var/log/messages and see stuff like this:

Apr 29 15:08:12 sloth pppd[80]: error waiting for (dis)connection process: No child processes
May  3 00:05:58 sloth pppd[80]: error waiting for (dis)connection process: No child processes

The machine crashed a few seconds later. The first time it dumped core and
rebooted; the second time it just locked up. Here's the dmesg from the first
crash's dump:

NetBSD 1.4_BETA (SLOTH) #193: Sun Apr 25 05:13:21 CDT 1999
    khym@dahan.metonymy.com:/usr/src.local/sys/arch/i386/compile/SLOTH
cpu0: Intel 386DX (386-class)
real mem  = 7995392
avail mem = 6098944
using 123 buffers containing 503808 bytes of memory
mainbus0 (root)
isa0 at mainbus0
ne0 at isa0 port 0x300-0x31f irq 10
ne0: NE2000 Ethernet
ne0: Ethernet address 00:40:05:62:d5:28
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 channel 0 drive 0: <QUANTUM LP120A GM120A01X>
wd0: drive supports 8-sector pio transfers, chs addressing
wd0: 116MB, 901 cyl, 5 head, 53 sec, 512 bytes/sect x 238765 sectors
tcom0 at isa0 port 0x100-0x13f irq 11
com2 at tcom0 slave 0: st16650a, working fifo
com3 at tcom0 slave 1: st16650a, working fifo
com4 at tcom0 slave 2: st16650a, working fifo
com5 at tcom0 slave 3: st16650a, working fifo
com at tcom0 slave 4 not configured
com at tcom0 slave 5 not configured
com at tcom0 slave 6 not configured
com at tcom0 slave 7 not configured
lpt0 at isa0 port 0x378-0x37b irq 7
pc0 at isa0 port 0x60-0x6f irq 1: color
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
biomask 4040 netmask 4440 ttymask 44c2
boot device: wd0
root on wd0a dumps on wd0b
IP Filter: initialized.  Default = pass all, Logging = enabled
uid 500 on /usr: file system full
uid 500 on /usr: file system full
uid 500 on /usr: file system full
uid 500 on /usr: file system full
Data modified on freelist: word 0 of object 0xf0324b80 size 68 previous type temp (0xdeadbe88 != 0xdeadbeef)
Data modified on freelist: word 0 of object 0xf0324900 size 68 previous type temp (0xdeadbe88 != 0xdeadbeef)
Data modified on freelist: word 0 of object 0xf0322c00 size 128 previous type temp (0xdeadbe86 != 0xdeadbeef)
Data modified on freelist: word 0 of object 0xf0324000 size 128 previous type temp (0xdeadbe86 != 0xdeadbeef)
Data modified on freelist: word 0 of object 0xf0324700 size 128 previous type temp (0xdeadbe86 != 0xdeadbeef)
Data modified on freelist: word 0 of object 0xf0322c80 size 120 previous type temp (0xdeadbe86 != 0xdeadbeef)
uvm_fault(0xf1b4e424, 0x0, 0, 1) -> 1
fatal page fault in supervisor mode
trap type 6 code 0 eip f01b4678 cs f01b0008 eflags 10206 cr2 1d cpl 0
panic: trap
syncing disks... 9 9 8 6 done
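
The six "Data modified on freelist" lines in the dmesg above each report a
first word that is close to, but not equal to, the 0xdeadbeef fill pattern.
As a rough aid for reading them, here is a small Python sketch that parses
those messages and prints how far each observed value is from the expected
one. The regular expression and the name parse_freelist_line are assumptions
made up for illustration; they are not taken from the kernel sources, and
the arithmetic alone says nothing about what modified the memory.

```python
import re

# Pattern for the "Data modified on freelist" message as it appears in the
# dmesg above. This regex is an illustrative assumption, not kernel code.
FREELIST_RE = re.compile(
    r"Data modified on freelist: word (\d+) of object (0x[0-9a-f]+) "
    r"size (\d+) previous type (\S+) \((0x[0-9a-f]+) != (0x[0-9a-f]+)\)"
)

# The six messages, copied verbatim from the dmesg.
LINES = [
    "Data modified on freelist: word 0 of object 0xf0324b80 size 68 previous type temp (0xdeadbe88 != 0xdeadbeef)",
    "Data modified on freelist: word 0 of object 0xf0324900 size 68 previous type temp (0xdeadbe88 != 0xdeadbeef)",
    "Data modified on freelist: word 0 of object 0xf0322c00 size 128 previous type temp (0xdeadbe86 != 0xdeadbeef)",
    "Data modified on freelist: word 0 of object 0xf0324000 size 128 previous type temp (0xdeadbe86 != 0xdeadbeef)",
    "Data modified on freelist: word 0 of object 0xf0324700 size 128 previous type temp (0xdeadbe86 != 0xdeadbeef)",
    "Data modified on freelist: word 0 of object 0xf0322c80 size 120 previous type temp (0xdeadbe86 != 0xdeadbeef)",
]


def parse_freelist_line(line):
    """Return the fields of one message as a dict, or None if it does not match."""
    m = FREELIST_RE.search(line)
    if m is None:
        return None
    word, obj, size, prev_type, found, expected = m.groups()
    found, expected = int(found, 16), int(expected, 16)
    return {
        "word": int(word),
        "object": int(obj, 16),
        "size": int(size),
        "previous_type": prev_type,
        "found": found,
        "expected": expected,
        # How much smaller the observed word is than the fill pattern.
        "delta": expected - found,
    }


if __name__ == "__main__":
    for line in LINES:
        r = parse_freelist_line(line)
        print(f"object {r['object']:#x} size {r['size']:3d} delta {r['delta']:#x}")
```

For the messages above, the delta is 0x67 for the two size-68 objects and
0x69 for the other four.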

gdb didn't give me a useful stack trace:
#0  0xf01cf13e in sys_sysarch () at ../../../../arch/i386/i386/trap.c:250
#1  0xf01c609f in cpu_reboot (howto=0x100, bootstr=0x0)
    at ../../../../arch/i386/i386/machdep.c:1350
#2  0xf011c0a8 in log (level=0xf01cf13e, fmt=0x0)
    at ../../../../kern/subr_prf.c:212
#3  0xf01cf39d in trap (frame={tf_es = 0x10, tf_ds = 0xf0290010, tf_edi = 0x0,
      tf_esi = 0xf1b912d4, tf_ebp = 0xf1b83d5c, tf_ebx = 0x0,
      tf_edx = 0xfffffffa, tf_ecx = 0x4c, tf_eax = 0x1d, tf_trapno = 0x6,
      tf_err = 0x0, tf_eip = 0xf01b4678, tf_cs = 0xf01b0008,
      tf_eflags = 0x10206, tf_esp = 0xf1b912d4, tf_ss = 0xf1b96818,
      tf_vm86_es = 0xf1b83d74, tf_vm86_ds = 0xf01b3fa6,
      tf_vm86_fs = 0xf1b912d4, tf_vm86_gs = 0xf1b96818})
    at ../../../../arch/i386/i386/trap.c:310
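
To make the trap line and the frame printed by gdb a little easier to read,
here is a hedged Python sketch that decodes a few of the fields. The bit
positions are the standard i386 EFLAGS and page-fault error-code bits; the
helper names (decode_eflags, decode_page_fault_code, in_first_page) are
hypothetical and exist only for this example.

```python
# Values copied from the "trap type 6 ..." line and the gdb frame above.
TRAP = {
    "trapno": 0x6,
    "err": 0x0,
    "eip": 0xF01B4678,
    "eflags": 0x10206,
    "cr2": 0x1D,
    "eax": 0x1D,
}

PAGE_SIZE = 4096  # i386 page size in bytes

# Selected i386 EFLAGS bits (bit number -> flag name).
EFLAGS_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF",
    9: "IF", 10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM",
}


def decode_eflags(value):
    """Return the names of the flags set in value, lowest bit first."""
    return [name for bit, name in sorted(EFLAGS_BITS.items()) if value & (1 << bit)]


def decode_page_fault_code(code):
    """Split an i386 page-fault error code into its three low bits."""
    return {
        "present": bool(code & 1),  # 0: page not present, 1: protection violation
        "write": bool(code & 2),    # 0: read access, 1: write access
        "user": bool(code & 4),     # 0: supervisor mode, 1: user mode
    }


def in_first_page(fault_addr, page_size=PAGE_SIZE):
    """True if the faulting address lies in the first page of the address space."""
    return 0 <= fault_addr < page_size


if __name__ == "__main__":
    print("eflags:", decode_eflags(TRAP["eflags"]))
    print("error code:", decode_page_fault_code(TRAP["err"]))
    print("fault in first page:", in_first_page(TRAP["cr2"]))
    print("cr2 == eax:", TRAP["cr2"] == TRAP["eax"])
```

Read this way, error code 0 is a supervisor-mode read of a page that is not
present, and the faulting address 0x1d is both inside the first page and
equal to tf_eax. The VM flag is clear, so the trap did not come from vm86
mode; note that the value shown as tf_vm86_ds, 0xf01b3fa6, is the same
address that resolves to amap_unref in the hand-followed stack, which
suggests the trailing frame fields are simply stack contents.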

Following the stack by hand, I get:
0xf01b4678 is in amap_wipeout (../../../../uvm/uvm_amap.c:519).
0xf01b3fa6 is in amap_unref (../../../../uvm/uvm_amap_i.h:260).
0xf01ba061 is in uvm_unmap_detach (../../../../uvm/uvm_map.c:1114).
0xf01b93dd is in uvm_unmap (../../../../uvm/uvm_map_i.h:169).
0xf01bb3d3 is in uvmspace_exec (../../../../uvm/uvm_map.c:2362).
0xf010e1c8 is in sys_execve (../../../../kern/kern_exec.c:370).
0xf01cf94c is in syscall (../../../../arch/i386/i386/trap.c:782).

The kernel was compiled with optimization, so those line numbers may be
slightly off.
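
For readers unfamiliar with following a stack by hand: one common approach
on i386, assuming the code keeps frame pointers, is to read the saved %ebp
and the return address stored just above it, then repeat from the saved
%ebp. The Python sketch below shows that walk over a synthetic memory image.
The return addresses are the ones listed above, but the frame addresses
(0x1000, 0x1020, ...) and the helpers build_fake_stack and walk_frames are
invented for illustration; this is not necessarily the exact procedure used
for this report.

```python
# Return addresses taken from the hand-followed stack above (the first
# entry of that list, 0xf01b4678, is the faulting eip rather than a return
# address, so it is not included here).
RETURN_ADDRS = [
    0xF01B3FA6,  # amap_unref
    0xF01BA061,  # uvm_unmap_detach
    0xF01B93DD,  # uvm_unmap
    0xF01BB3D3,  # uvmspace_exec
    0xF010E1C8,  # sys_execve
    0xF01CF94C,  # syscall
]


def build_fake_stack(return_addrs, base=0x1000, step=0x20):
    """Build a synthetic memory image; frame addresses are made up."""
    memory = {}
    for i, ret in enumerate(return_addrs):
        ebp = base + i * step
        is_last = i + 1 == len(return_addrs)
        memory[ebp] = 0 if is_last else base + (i + 1) * step  # saved %ebp
        memory[ebp + 4] = ret                                  # return address
    return memory, base


def walk_frames(read_word, ebp, max_frames=16):
    """Follow saved-%ebp links and collect the return address of each frame."""
    found = []
    for _ in range(max_frames):
        if ebp == 0:
            break
        saved_ebp = read_word(ebp)
        found.append(read_word(ebp + 4))
        # The stack grows down, so a caller's frame must be at a higher
        # address; stop if the chain goes backwards or loops.
        if saved_ebp != 0 and saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return found


if __name__ == "__main__":
    memory, ebp = build_fake_stack(RETURN_ADDRS)
    for addr in walk_frames(memory.__getitem__, ebp):
        print(f"{addr:#x}")
```

Each address found this way can then be resolved to a function and source
line in gdb, for example with something like "list *0xf01b4678", which
prints lines in the "is in" form shown above.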

>How-To-Repeat:
	Don't know... seems to happen every once in a while.
>Fix:
	Don't know that either.
>Audit-Trail:
>Unformatted: