Subject: Re: Getting "TLB IPI rendezvous failed..."
To: Manuel Bouyer <firstname.lastname@example.org>
From: Stephan Uphoff <email@example.com>
Date: 01/21/2005 11:03:50
I have a few ideas.
Hopefully I will be able to send you a test patch over the weekend.
Can you send me your dmesg?
On Fri, 2005-01-21 at 05:53, Manuel Bouyer wrote:
> On Thu, Jan 20, 2005 at 01:40:48PM +0100, Manuel Bouyer wrote:
> > Here it is. Still pipe related, but what the second CPU was doing at the
> > same time is interesting:
> > CPU 1 (the one that panicked):
> > panic()
> > pmap_tlb_shootnow()
> > pmap_kremove()
> > pipe_direct_write()
> > pipe_write()
> > ...
> > CPU 0:
> > _kernel_lock()
> > intr_biglock_wrapper()
> > Xintr_ioapic_edge15()
> > Xspllower()
> > _kernel_lock()
> > x86_softintrlock()
> > Xsoftclock()
> > I just noticed that I didn't have lockdebug enabled in this kernel :(
> > I'll install a new one for the next panic.
> LOCKDEBUG didn't bring anything more.
> The new panic I got tonight:
> CPU 1:
> CPU 0:
> A few things to notice:
> - it seems it's always CPU 1 which panics, and CPU 0 which holds the lock
> - even though pipe didn't appear in this trace, it's still related to
> amanda backups, which make heavy use of pipes
> - again it had about 500M of free RAM when it panicked
> - CPU 0 seems to always come from a soft clock interrupt
> - the recent changes to protect IPIs with splclock() cause the traces to
> be different. With 2.0, CPU 0 was stuck with a tsleep()/mi_switch()
> in the path.
> Anything else I can try to help debug this?