Subject: Re: Getting "TLB IPI rendezvous failed..."
To: Frederick Bruckman <fredb@immanent.net>
From: Frank van der Linden <fvdl@netbsd.org>
List: tech-kern
Date: 12/23/2004 13:50:12
On Thu, Dec 23, 2004 at 12:56:26AM -0600, Frederick Bruckman wrote:
> 2) The general pattern seems to be that one cpu is at splipi(), waiting
> for a lock, while the other cpu insists on doing something to the first 
> cpu, and has no way to back off? I wonder why it's only i386.

That's the general deadlock pattern: one CPU is at a very high spl
(splipi, which is the highest possible), spinning to acquire a lock. Another
CPU holds the lock, and has to do something that involves sending an IPI
and waiting for the other CPUs to receive it. But the first CPU never
takes the IPI, because IPIs are blocked at splipi, so neither CPU can
make progress.
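
To make the cycle concrete, here is a toy model of that pattern. It is
only a sketch, not NetBSD code: the Cpu class, the is_deadlocked()
helper, and the numeric IPL values are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical priority levels; only their ordering matters here.
IPL_NONE = 0
IPL_IPI = 10  # stands in for splipi, the highest level


@dataclass
class Cpu:
    name: str
    spl: int = IPL_NONE                  # current interrupt priority level
    holds_lock: bool = False             # owns the contended lock
    spinning_on_lock: bool = False       # busy-waiting for that lock
    waiting_for_ack_from: Optional["Cpu"] = None  # sent an IPI, awaiting ack

    def can_take_ipi(self) -> bool:
        # An IPI is only delivered when the CPU runs below the IPI level.
        return self.spl < IPL_IPI


def is_deadlocked(spinner: Cpu, holder: Cpu) -> bool:
    """True if spinner waits on holder's lock while holder waits on an
    IPI acknowledgement that spinner cannot deliver."""
    return (
        spinner.spinning_on_lock
        and holder.holds_lock
        and holder.waiting_for_ack_from is spinner
        and not spinner.can_take_ipi()
    )


# cpu0 spins at the IPI level; cpu1 holds the lock and waits for cpu0's ack.
cpu0 = Cpu("cpu0", spl=IPL_IPI, spinning_on_lock=True)
cpu1 = Cpu("cpu1", holds_lock=True, waiting_for_ack_from=cpu0)
stuck = is_deadlocked(cpu0, cpu1)

# If cpu0 were spinning at a lower level, it could take the IPI and
# cpu1 could finish and release the lock.
cpu0_low = Cpu("cpu0", spl=IPL_NONE, spinning_on_lock=True)
cpu1_low = Cpu("cpu1", holds_lock=True, waiting_for_ack_from=cpu0_low)
not_stuck = is_deadlocked(cpu0_low, cpu1_low)
```

In this model the first pair is stuck and the second is not; the only
difference is the level at which the spinning CPU waits.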

I don't know why this problem has resurfaced recently for some people.

Manuel is right: collecting the traces is the most important thing, since
they will show where the CPUs get stuck.

- Frank