Subject: kern/27597: 2.0: Scheduler Activations related panic
To: None <gnats-bugs@gnats.netbsd.org>
From: Hubert Feyrer <feyrer@rfhpc8323.fh-regensburg.de>
List: netbsd-bugs
Date: 10/28/2004 12:16:56
>Number:         27597
>Category:       kern
>Synopsis:       2.0: Scheduler Activations related panic
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Oct 28 10:18:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Hubert Feyrer
>Release:        NetBSD 2.0_RC4
>Organization:
Hubert Feyrer <hubertf@channel.regensburg.org>
>Environment:
	
	
System: NetBSD miyu 2.0_RC2 NetBSD 2.0_RC2 (MIYU) #19: Sat Oct  2 00:58:46 MEST 2004  feyrer@miyu:/home/cvs/src-2.0/sys/arch/i386/compile/obj.i386/MIYU i386
Architecture: i386
Machine: i386
>Description:
	For the past few days, I've seen the system halt completely several
	times while under little load: X freezes, and not even a blindly
	typed ctl+alt+esc followed by "r" gets me out.

	After leaving the system sitting on the console for some time, I saw
	a panic about a failed assertion at line 1044 of kern_sa.c
	(sorry, no verbatim output), and I was able to force a crash dump
	via ctl+alt+esc.  The crash dump says:

	(gdb) target kcore netbsd.7.core
	panic: kernel %sassertion "%s" failed: file "%s", line %d
	#0  0x0fef0000 in ?? ()
	(gdb) 


	Line 1044 of kern_sa.c says (rev. 1.50.1.2):
		KDASSERT(vp->savp_lwp == l2); 
	(which is also what was in the panic message).


	The stack trace only shows what happened after ctl+alt+esc, not the
	path that led to the failed assertion:

	(gdb) bt
	#0  0x0fef0000 in ?? ()
	#1  0xc02bb6e7 in cpu_reboot ()
	#2  0xc0201339 in db_reboot_cmd ()
	#3  0xc0200e7f in db_command ()
	#4  0xc0200b92 in db_command_loop ()
	#5  0xc0203c5c in db_trap ()
	#6  0xc02b8e52 in kdb_trap ()
	#7  0xc02c5ff4 in trap ()
	#8  0xc0102e01 in calltrap ()
	#9  0xc0307b90 in internal_command ()
	#10 0xc0307db5 in wskbd_translate ()
	#11 0xc0306c1f in wskbd_input ()
	#12 0xc030bcfb in pckbd_input ()
	#13 0xc030b6d2 in pckbportintr ()
	#14 0xc0170614 in pckbcintr ()


>How-To-Repeat:
	Run fvwm, gqmpeg, TeX, and a system or pkgsrc build in the
	background, then wait for the system to freeze all of a sudden.

>Fix:
	Apparently a condition can arise in which vp->savp_lwp is not equal
	to l2. I don't know where to go from there.
>Release-Note:
>Audit-Trail:
>Unformatted: