Subject: port-sparc/3575: panic: pv_unlink0 on sun4m (SPARC LX)
To: None <gnats-bugs@gnats.netbsd.org>
From: None <fair@cesium.clock.org>
List: netbsd-bugs
Date: 05/05/1997 02:47:46
>Number: 3575
>Category: port-sparc
>Synopsis: panic: pv_unlink0 should not happen
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: gnats-admin (GNATS administrator)
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon May 5 02:50:02 1997
>Last-Modified:
>Originator: Erik E. Fair
>Organization:
International Organization of Internet Clock Watchers (CLOCK-DOM)
>Release: NetBSD-current Apr 29, 1997 (and later)
>Environment:
System: NetBSD cesium.clock.org 1.2D NetBSD 1.2D (CESIUM) #3: Tue Apr 29 16:30:25 PDT 1997 root@:/usr/src/sys/arch/sparc/compile/CESIUM sparc
Sun SPARC LX (sun4m), 96MB RAM
headless (no keyboard, mouse, or monitor - DEC VT220 9600b serial console on ttya)
one Sun FSBE/S (X1053A, 501-2015) Fast SCSI-II, Buffered Ethernet I/F in Sbus Slot 0
bootpath: /iommu@0,10000000/sbus@0,10001000/dma@0,81000/esp@0,80000/sd@0,0
sd0 at scsibus0 targ 3 lun 0: <QUANTUM, FIREBALL1080S, 1Q09> SCSI2 0/direct fixed
sd0: 1042MB, 3835 cyl, 4 head, 139 sec, 512 bytes/sec
sd1 at scsibus1 targ 0 lun 0: <DEC, DSP5350S, 427B> SCSI2 0/direct fixed
sd1: 3406MB, 3055 cyl, 25 head, 91 sec, 512 bytes/sec
sd3 at scsibus1 targ 1 lun 0: <DEC, DSP5350S, 427B> SCSI2 0/direct fixed
sd3: 3406MB, 3055 cyl, 25 head, 91 sec, 512 bytes/sec
swap on sd1b and sd3b (96MB, each)
network connection on UTP on onboard le0 [140.174.97.8]
(full kernel config, and autoconfig output available)
>Description:
NetBSD-current panics with "pv_unlink0" under some load.
Both GENERIC and specific (CESIUM) kernels panic, typically
near the end of the /etc/rc script, before the console
login prompt is printed.
Not all kernels have produced viable crash dumps; the best one
to date resulted in the following output:
fair@cesium 3} gdb -k /sys/arch/sparc/compile/CESIUM/netbsd.gdb netbsd.7.core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.11 (sparc-netbsd), Copyright 1993 Free Software Foundation, Inc...
panic: pv_unlink0
#0 0xf80f66c4 in dumpsys () at ../../../../arch/sparc/sparc/machdep.c:766
766 snapshot(cpcb);
(kgdb) where
#0 0xf80f66c4 in dumpsys () at ../../../../arch/sparc/sparc/machdep.c:766
#1 0xf80f6478 in cpu_reboot (howto=256, user_boot_string=0x0) at ../../../../arch/sparc/sparc/machdep.c:676
#2 0xf802b3c4 in panic (fmt=0x0) at ../../../../kern/subr_prf.c:149
#3 0xf80f89a4 in pv_unlink4m (pv=0xf81cb080, pm=0xf81306b8, va=4170711040) at ../../../../arch/sparc/sparc/pmap.c:2356
#4 0xf80fb974 in pmap_enk4m (pm=0xf81306b8, va=4170711040, prot=7, wired=-132657152, pv=0xf81ba680, pteproto=4106398) at ../../../../arch/sparc/sparc/pmap.c:5375
#5 0xf80fb7e4 in pmap_enter4m (pm=0xf81306b8, va=4170711040, pa=65699840, prot=7, wired=1) at ../../../../arch/sparc/sparc/pmap.c:5315
#6 0xf80cd57c in vm_fault (map=0xf81dc108, vaddr=4170711040, fault_type=7, change_wiring=1) at ../../../../vm/vm_fault.c:826
#7 0xf80cd72c in vm_fault_wire (map=0xf81dc108, start=4170682368, end=4170715136) at ../../../../vm/vm_fault.c:884
#8 0xf80cfb40 in vm_map_pageable (map=0xf81dc108, start=4170682368, end=4170715136, new_pageable=0) at ../../../../vm/vm_map.c:1337
#9 0xf80ce680 in kmem_malloc (map=0xf81dc108, size=32768, canwait=0) at ../../../../vm/vm_kern.c:321
#10 0xf802075c in malloc (size=32768, type=84, flags=0) at ../../../../kern/kern_malloc.c:145
#11 0xf810e7c0 in sunos_sys_getdents (p=0xf8960000, v=0xfc71af28, retval=0xfc71af20) at ../../../../compat/sunos/sunos_misc.c:456
#12 0xf80fe9f0 in syscall (code=174, tf=0xfc71afb0, pc=268487316) at ../../../../arch/sparc/sparc/trap.c:1100
(kgdb)
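For reading the trace: gdb prints the address arguments in decimal,
which hides their page alignment. The sketch below is only a reading
aid, not part of the original session. It converts the values copied
from frames #4 to #7 to hex and locates the faulting page within the
range being wired. It assumes 4 KB pages, and it assumes the SPARC
Reference MMU PTE layout with the physical page number in bits 31:8;
the name page_index is a hypothetical helper.

```python
# Values copied from the backtrace above (gdb printed them in decimal).
PAGE_SIZE = 4096  # assumption: 4 KB pages on the sun4m SRMMU

va = 4170711040      # faulting virtual address (frames #3 to #6)
start = 4170682368   # start of the range being wired (frame #7)
end = 4170715136     # end of the range being wired (frame #7)
pa = 65699840        # physical address passed to pmap_enter4m (frame #5)
pteproto = 4106398   # PTE prototype passed to pmap_enk4m (frame #4)


def page_index(addr, base, page_size=PAGE_SIZE):
    """Zero-based page number of addr within a range starting at base."""
    return (addr - base) // page_size


n_pages = (end - start) // PAGE_SIZE
idx = page_index(va, start)

for name, value in (("va", va), ("start", start), ("end", end),
                    ("pa", pa), ("pteproto", pteproto)):
    print(f"{name:8s} = {value:#010x}")

print(f"range covers {n_pages} pages; fault is on page index {idx}")

# Assumed SRMMU PTE layout: physical page number in bits 31:8.
# If that holds, the PTE's page number should equal pa >> 12.
print("pteproto page number matches pa:", (pteproto >> 8) == (pa >> 12))
```

Under those assumptions, the 32768-byte range from kmem_malloc spans
eight pages, and the faulting va is the last of them (index 7).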
Proximate causes read from good crash dumps have also
included "sys_read" (and then down the same stack as above).
Varying the amount of system RAM (attempts were made to
run multiuser at 64M and 32M) changed only the amount of
time before the panic, generally increasing it by an hour.
A DEBUG kernel with the PDB_SANITYCHK bit set in pmapdebug
resulted in the same panic, with no crash dump.
By sheer dumb luck, I seem to have a DEBUG kernel that is
stable (for some value of "stable"; it too crashes under
load, but after hours or days). Alas, stupidity on my part
caused the debug symbols for this "stable" kernel to be
lost in a subsequent kernel build.
Attempts to use specific kernels with various device drivers
removed (principally the extraneous graphics devices)
generally result in kernels that panic immediately, rather
than in kernels that are more stable; this suggests a link
to kernel size: small kernels panic immediately.
>How-To-Repeat:
Boot my system with any kernel other than the one listed above in "System:".
>Fix:
Got me - this looks like an honest-to-god kernel bug; not
clear whether it's in the pmap code, or in the MI VM code.
For now, submitting this with a port-sparc category, pending
evidence that other ports have this problem.
>Audit-Trail:
>Unformatted: