Subject: Re: Getting "TLB IPI rendezvous failed..."
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Frederick Bruckman <fredb@immanent.net>
List: tech-kern
Date: 12/23/2004 00:56:26

On Wed, 22 Dec 2004, Manuel Bouyer wrote:
>
> I see similar panics, see kern/28541. You could try to see where the others
> CPU are with 'mach cpu #' followed by 't', to make sure mi_switch is
> also involved in the panic in your case.

Hmm, got a different one...

l->l_cpu != curcpu() failed, file .../uvm_glue.c line 605
db{6} t
__assert
uvm_swapout
uvmpd_scan
uvm_pageout
db{6} machine cpu 0
db{6} t
acquire
spinlock_acquire_count
mi_switch
ltsleep
sbwait
soreceive
[more nfs stuff]

I should add that the kernel's built with "-momit-leaf-frame-pointer", 
which is probably why part of the call chain appears to be missing.

So...

1) Does this happen only on i386?

2) The general pattern seems to be that one CPU is at spipl(), waiting
for a lock, while the other CPU insists on doing something to the first
CPU and has no way to back off. I wonder why it's only i386.
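
To make the pattern in 2) concrete, here is a toy, single-threaded C
model of it. This is not NetBSD code, and every name in it (struct cpu,
spin_step, rendezvous, simulate) is made up for illustration. It also
assumes that the lock the first CPU is waiting for is held by the CPU
sending the request, which the traces above do not establish.

```c
/* Toy model only; not NetBSD code. All names are hypothetical. */
#include <stdbool.h>

struct cpu {
    bool ipi_blocked;   /* true while this CPU has IPIs masked */
    bool ipi_pending;   /* a request has been posted to this CPU */
    bool ipi_acked;     /* this CPU has serviced the request */
};

struct lock {
    int owner;          /* CPU number of the holder, -1 when free */
};

/* One iteration of a CPU spinning on a lock with IPIs masked. */
static void spin_step(struct cpu *c, struct lock *lk, int self)
{
    if (lk->owner == -1) {
        lk->owner = self;
        /* Assumption: once it has the lock, the CPU soon unmasks IPIs. */
        c->ipi_blocked = false;
    }
    if (!c->ipi_blocked && c->ipi_pending) {
        c->ipi_pending = false;
        c->ipi_acked = true;
    }
}

/* The sender posts a request and waits, with no way to back off. */
static bool rendezvous(struct cpu *target, struct lock *lk,
                       int target_id, int max_spins)
{
    target->ipi_pending = true;
    for (int i = 0; i < max_spins; i++) {
        spin_step(target, lk, target_id);   /* target keeps spinning */
        if (target->ipi_acked)
            return true;
    }
    return false;       /* models "rendezvous failed" */
}

/* Returns 0 if the request is acknowledged, -1 if it times out. */
int simulate(bool sender_holds_lock)
{
    struct cpu cpu0 = { .ipi_blocked = true };
    struct lock lk = { .owner = sender_holds_lock ? 6 : -1 };

    return rendezvous(&cpu0, &lk, 0, 1000) ? 0 : -1;
}
```

In this model simulate(true) times out, because cpu 0 never gets the
lock and so never unmasks IPIs, while simulate(false) completes on the
first spin.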

Another thing I should mention: when I ran a kernel without options
DEBUG, DIAGNOSTIC, or DDB_ONPANIC=1, it would just seem to freeze. Once,
though, an unattended freeze resolved itself after a few hours (by
rebooting).


Frederick