Subject: kern/25285: i386 MP panic: TLB IPI rendezvous failed (mask 1)
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <nathanw@mit.edu>
List: netbsd-bugs
Date: 04/22/2004 17:18:58
>Number:         25285
>Category:       kern
>Synopsis:       i386 MP panic: TLB IPI rendezvous failed (mask 1)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Apr 22 21:41:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Nathan J. Williams
>Release:        NetBSD 2.0C
>Organization:
	Massachvsetts Institvte of Technology
>Environment:
	
	
System: NetBSD marvin-the-martian.nathanw.com 2.0C NetBSD 2.0C (MARVIN) #74: Tue Apr 20 14:57:53 EDT 2004 nathanw@marvin-the-martian.nathanw.com:/nbsd/src/sys/arch/i386/compile/MARVIN i386
Architecture: i386
Machine: i386
>Description:

After upgrading my desktop box (dual Athlon MP 2000+) to 2.0C, I
decided to give MULTIPROCESSOR a spin on it. Under light load
(compiling another kernel, no -j option) with a MULTIPROCESSOR
kernel, I got:

panic: TLB IPI rendezvous failed (mask 1)

Stopped in pid 8360.1 (cc1) at  netbsd:cpu_Debugger+0x4: leave
db{1}> t
cpu_Debugger()
panic()
pmap_tlb_shootnow(3,cc3c1000,61c016c,ce5ebcc0,c07794a0) at pmap_tlb_shootnow+0x108
pmap_kremove(cc3c0000,2000,21c,ce5ebd18,c0782740) at pmap_kremove+0x56
ubc_release(cc3c0000,0,0,0,d) at ubc_release+0x1ab
ffs_write(ce5ebe24,40855555,ce5ebe5c,c0277094,c0381820) at ffs_write+0x41b
VOP_WRITE(cec0a6f0,ce5ebec4,1,c1e14380,cec0a6f0) at VOP_WRITE+0x34
vn_write(cde07d24,cde07d4c,cd5ebec4,c1e14380,1) at vn_write+0xbf
dofilewrite(ce36c018,3,cde07d24,83ca000,2000) at dofilewrite+0x86
sys_write(cde059d0,cd5ebf64,ce5ebf5c,30,c040a420) at sys_write+0x70
syscall_plain() at syscall_plain+0x182
--- syscall (number 4) ---

db{1}> mach cpu 0
db{1}> t

netbsd:cpu_switch+0xda:

The panic is quick to reproduce, though not fully deterministic. On a
second occasion the trace was:

panic: TLB IPI rendezvous failed (mask 1)
Stopped in pid 9531.1 (cc) at netbsd:cpu_Debugger+0x4: leave
db{1}> t
cpu_Debugger()
panic()
pmap_tlb_shootnow(3,ce2a8cdc,0,25bf063,c1bb0c00) at +0x108
pmap_do_remove(c0453320,cb6e8000,cb728000,0,cb728000) at +0xc1
pmap_remove(c0453320,cb6e8000,cb728000,cb728000,253b) at +0x15
uvm_unmap_remove(c1bb0c00,cb6e8000,cb728000,ce2a8d7c,cdcb0c8c) at +0x27c
uvm_km_free_wakeup(c1bb0c00,cb6e8000,40000,cdcb0c8c,0) at +0xc9
sys_execve(cdc2294c,ce2a8f64,ce2a8f5c,2c4,c03a842c) at +0x8e6
syscall_plain() at +0x182
--- syscall (number 59) ---

db{1}> mach cpu 0
db{1}> t

acquire(c044cf60,cc01ef10,400040,0,600) at +0x5f
_lockmgr(c044cf60,400042,0,c03dc3a0,315) at +0x4bd
x86_softintlock(cdc20010,30,c03a0010,c0400010,cc01b000) at +0x21
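
In both cases the panic is raised on cpu 1 while cpu 0 is somewhere
else (cpu_switch the first time, spinning in acquire() the second).
Reading "mask 1" as a bitmask of CPUs that had not yet acknowledged
the shootdown is an assumption drawn from the message text, not
something checked against the pmap source. Under that assumption, the
failure mode can be sketched in user-space C as below. All names
(rendezvous_wait, rendezvous_ack, first_unacked_cpu) and the spin
limit are hypothetical and are not the NetBSD pmap code.

```c
#include <stdint.h>

/*
 * Illustrative sketch only; hypothetical names, not NetBSD's pmap.
 * "pending" holds one bit per CPU that still owes an acknowledgement.
 */

/*
 * Spin until every bit in *pending is clear or the spin budget is
 * exhausted. Returns the mask that is still pending; a non-zero
 * result corresponds to "rendezvous failed (mask N)".
 */
static inline uint32_t
rendezvous_wait(volatile uint32_t *pending, unsigned long spin_limit)
{
	unsigned long spins = 0;

	while (*pending != 0 && spins < spin_limit)
		spins++;
	return *pending;
}

/* What a remote CPU's IPI handler would do: clear its own bit. */
static inline void
rendezvous_ack(volatile uint32_t *pending, unsigned int cpu)
{
	*pending &= ~(UINT32_C(1) << cpu);
}

/* Lowest-numbered CPU that has not acknowledged, or -1 if none. */
static inline int
first_unacked_cpu(uint32_t mask)
{
	for (int cpu = 0; cpu < 32; cpu++) {
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	}
	return -1;
}
```

With pending set to 0x1 and nobody calling rendezvous_ack(),
rendezvous_wait() returns 0x1 and first_unacked_cpu() reports cpu 0.
That matches the picture in the traces above, where cpu 0 appears to
be busy elsewhere while cpu 1 waits for it.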

>How-To-Repeat:

"Run on this system": boot a MULTIPROCESSOR kernel and compile
something. I was unable to reproduce the problem on my Dell
dual-Pentium 4 system.

>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: