Subject: port-sparc/3575: panic: pv_unlink0 on sun4m (SPARC LX)
To: None <gnats-bugs@gnats.netbsd.org>
From: None <fair@cesium.clock.org>
List: netbsd-bugs
Date: 05/05/1997 02:47:46
>Number:         3575
>Category:       port-sparc
>Synopsis:       panic: pv_unlink0 should not happen
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May  5 02:50:02 1997
>Last-Modified:
>Originator:     Erik E. Fair
>Organization:
International Organization of Internet Clock Watchers (CLOCK-DOM)

>Release:        NetBSD-current Apr 29, 1997 (and later)
>Environment:

System: NetBSD cesium.clock.org 1.2D NetBSD 1.2D (CESIUM) #3: Tue Apr 29 16:30:25 PDT 1997 root@:/usr/src/sys/arch/sparc/compile/CESIUM sparc

	Sun SPARC LX (sun4m), 96MB RAM
	headless (no keyboard, mouse, or monitor - DEC VT220 9600b serial console on ttya)
	one Sun FSBE/S (X1053A, 501-2015) Fast SCSI-II, Buffered Ethernet I/F in Sbus Slot 0

	bootpath: /iommu@0,10000000/sbus@0,10001000/dma@0,81000/esp@0,80000/sd@0,0
	sd0 at scsibus0 targ 3 lun 0: <QUANTUM, FIREBALL1080S, 1Q09> SCSI2 0/direct fixed
	sd0: 1042MB, 3835 cyl, 4 head, 139 sec, 512 bytes/sec
	sd1 at scsibus1 targ 0 lun 0: <DEC, DSP5350S, 427B> SCSI2 0/direct fixed
	sd1: 3406MB, 3055 cyl, 25 head, 91 sec, 512 bytes/sec
	sd3 at scsibus1 targ 1 lun 0: <DEC, DSP5350S, 427B> SCSI2 0/direct fixed
	sd3: 3406MB, 3055 cyl, 25 head, 91 sec, 512 bytes/sec

	swap on sd1b and sd3b (96MB, each)
	network connection on UTP on onboard le0 [140.174.97.8]

	(full kernel config, and autoconfig output available)

>Description:
	NetBSD-current panics with "pv_unlink0" under some load.

	Both GENERIC and specific (CESIUM) kernels panic, typically
	near the end of the /etc/rc script, before the console
	login prompt is printed.

	Not all kernels have produced viable crash dumps; the best one
	to date resulted in the following output:

fair@cesium 3} gdb -k /sys/arch/sparc/compile/CESIUM/netbsd.gdb netbsd.7.core
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
 There is absolutely no warranty for GDB; type "show warranty" for details.
 GDB 4.11 (sparc-netbsd), Copyright 1993 Free Software Foundation, Inc...
 panic: pv_unlink0
 #0  0xf80f66c4 in dumpsys () at ../../../../arch/sparc/sparc/machdep.c:766
 766             snapshot(cpcb);
 (kgdb) where
 #0  0xf80f66c4 in dumpsys () at ../../../../arch/sparc/sparc/machdep.c:766
 #1  0xf80f6478 in cpu_reboot (howto=256, user_boot_string=0x0) at ../../../../arch/sparc/sparc/machdep.c:676
 #2  0xf802b3c4 in panic (fmt=0x0) at ../../../../kern/subr_prf.c:149
 #3  0xf80f89a4 in pv_unlink4m (pv=0xf81cb080, pm=0xf81306b8, va=4170711040) at ../../../../arch/sparc/sparc/pmap.c:2356
 #4  0xf80fb974 in pmap_enk4m (pm=0xf81306b8, va=4170711040, prot=7, wired=-132657152, pv=0xf81ba680, pteproto=4106398) at ../../../../arch/sparc/sparc/pmap.c:5375
 #5  0xf80fb7e4 in pmap_enter4m (pm=0xf81306b8, va=4170711040, pa=65699840, prot=7, wired=1) at ../../../../arch/sparc/sparc/pmap.c:5315
 #6  0xf80cd57c in vm_fault (map=0xf81dc108, vaddr=4170711040, fault_type=7, change_wiring=1) at ../../../../vm/vm_fault.c:826
 #7  0xf80cd72c in vm_fault_wire (map=0xf81dc108, start=4170682368, end=4170715136) at ../../../../vm/vm_fault.c:884
 #8  0xf80cfb40 in vm_map_pageable (map=0xf81dc108, start=4170682368, end=4170715136, new_pageable=0) at ../../../../vm/vm_map.c:1337
 #9  0xf80ce680 in kmem_malloc (map=0xf81dc108, size=32768, canwait=0) at ../../../../vm/vm_kern.c:321
 #10 0xf802075c in malloc (size=32768, type=84, flags=0) at ../../../../kern/kern_malloc.c:145
 #11 0xf810e7c0 in sunos_sys_getdents (p=0xf8960000, v=0xfc71af28, retval=0xfc71af20) at ../../../../compat/sunos/sunos_misc.c:456
 #12 0xf80fe9f0 in syscall (code=174, tf=0xfc71afb0, pc=268487316) at ../../../../arch/sparc/sparc/trap.c:1100
 (kgdb) 

	Proximate causes read from good crash dumps have also
	included "sys_read" (and then down the same stack as above).
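
	As a reading aid, the decimal arguments that gdb printed in the
	trace above can be converted to hex and cross-checked. This is a
	minimal sketch, not kernel code: it assumes 4 KB pages on sun4m
	and the SPARC Reference MMU PTE layout (PPN in bits 31:8, C/M/R
	in bits 7:5, ACC in bits 4:2, ET in bits 1:0). The helper names
	decode_srmmu_pte and summarize are hypothetical, and gdb's view
	of arguments in an optimized kernel may not be fully reliable.

```python
# Values copied verbatim from the kgdb backtrace (frames #4 to #9).
PAGE_SIZE = 4096        # assumption: 4 KB pages on sun4m

va = 4170711040         # va passed to pmap_enter4m / pv_unlink4m
start = 4170682368      # start passed to vm_fault_wire
end = 4170715136        # end passed to vm_fault_wire
pa = 65699840           # pa passed to pmap_enter4m
pteproto = 4106398      # pteproto passed to pmap_enk4m


def decode_srmmu_pte(pte):
    """Split a 32-bit PTE into fields, using the assumed SRMMU layout."""
    return {
        "ppn": pte >> 8,               # physical page number
        "cacheable": (pte >> 7) & 1,   # C bit
        "modified": (pte >> 6) & 1,    # M bit
        "referenced": (pte >> 5) & 1,  # R bit
        "acc": (pte >> 2) & 0x7,       # access permissions
        "et": pte & 0x3,               # entry type (2 = PTE)
    }


def summarize():
    """Return the trace values in hex plus a few derived quantities."""
    return {
        "va": hex(va),
        "start": hex(start),
        "end": hex(end),
        "pa": hex(pa),
        "span": end - start,                        # bytes being wired
        "pages": (end - start) // PAGE_SIZE,        # pages in the span
        "page_index": (va - start) // PAGE_SIZE,    # 0-based faulting page
        "pte": decode_srmmu_pte(pteproto),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

	Under those assumptions, va is 0xf8980000, the last of the eight
	pages in the 32768-byte range 0xf8979000 to 0xf8981000 that
	kmem_malloc was wiring, and the page number in pteproto (0x3ea8)
	agrees with pa (0x3ea8000) shifted right by 12. The low byte
	decodes to a cacheable PTE with ACC=7, which in the SRMMU scheme
	is a supervisor-only read/write/execute mapping.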

	Varying the amount of system RAM (attempts were made to
	run multiuser at 64M and 32M) only varied the amount of
	time before panic, generally increasing it by an hour.

	A DEBUG kernel with the PDB_SANITYCHK bit set in pmapdebug
	resulted in the same panic, with no crash dump.

	By sheer dumb luck, I seem to have a DEBUG kernel that is
	stable (for some value of "stable"; it too crashes under
	load, but after hours or days). Alas, stupidity on my part
	caused the debug symbols for this "stable" kernel to be
	lost in a subsequent kernel build.

	Attempts to use specific kernels with various device drivers
	removed (principally the extraneous graphics devices)
	generally result in kernels that panic immediately, rather
	than in kernels that are more stable; this suggests a link to
	kernel size: small kernels panic immediately.


	
>How-To-Repeat:
	Boot my system with any kernel other than the one listed above in "System:"

>Fix:
	Got me - this looks like an honest-to-god kernel bug; not
	clear whether it's in the pmap code, or in the MI VM code.
	For now, submitting this with a port-sparc category, pending
	evidence that other ports have this problem.
>Audit-Trail:
>Unformatted: