Subject: Re: Getting "TLB IPI rendezvous failed..."
To: Manuel Bouyer <firstname.lastname@example.org>
From: Stephan Uphoff <email@example.com>
Date: 01/15/2005 11:36:59

On Sat, 2005-01-15 at 10:05, Manuel Bouyer wrote:
> On Thu, Jan 13, 2005 at 01:16:26AM +0100, Frank van der Linden wrote:
> > On Tue, Jan 11, 2005 at 11:44:33PM -0500, Stephan Uphoff wrote:
> > > You can also just add the splclock()/splx in x86_ipi as there is no
> > > need to protect the atomic bitmaps.
> > Ayup. Many thanks for the suggestions, I committed that change.
> > Can the people who had these problems (Fred, Havard?) see if this makes
> > any change? I tested if the changes work on one of my SMP systems, but
> > I could never reproduce the bug itself on those in the first place.
> I backported these changes to a netbsd-2-0-RELEASE kernel. It didn't help for
> It panicked again while the amanda client was running.
> If you think that a current kernel has additional fixes that may be relevant,
Mhhh .. I see a change in the i386 spl logic:
> Updating ci_ilevel and testing ci_ipending must be done with all
> interrupts off, or priority inversion can occur, which can lead to IPI deadlocks.
> Leaves interrupts off for a bit longer, sadly, but with no noticeable
> effects on the systems I tested on.
> From YAMAMOTO Takashi.
That did not make it to the 2-0-RELEASE.
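
To illustrate what that commit message describes, here is a rough user-space
sketch, not the actual NetBSD code. The struct and the sim_* names are
hypothetical; only the field names echo ci_ilevel and ci_ipending. It assumes
priority levels 0..31 and models "interrupts off" with a plain flag. The point
is that the level update and the pending test happen as one step that no
interrupt can split:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; NOT the real struct cpu_info. */
struct sim_cpu {
    int      ilevel;       /* current interrupt priority level (0..31) */
    uint32_t ipending;     /* bit n set: an interrupt at level n is pending */
    bool     intr_enabled; /* models the CPU interrupt-enable flag */
    int      handled;      /* number of pending interrupts that were run */
};

/* Mask of the levels strictly above `level`, i.e. those allowed to run. */
static uint32_t sim_unmasked(int level)
{
    assert(level >= 0 && level <= 31);
    return level == 31 ? 0u : ~((UINT32_C(2) << level) - 1u);
}

/*
 * Lower the priority level and run whatever became deliverable.
 * Both the update of ilevel and the test of ipending happen while the
 * simulated interrupt flag is off, so no interrupt can arrive between them.
 */
static void sim_splx(struct sim_cpu *ci, int nlevel)
{
    bool saved = ci->intr_enabled;
    ci->intr_enabled = false;            /* "all interrupts off" */

    ci->ilevel = nlevel;
    uint32_t ready = ci->ipending & sim_unmasked(nlevel);
    while (ready != 0) {
        int lvl = 31;                    /* pick the highest ready level */
        while ((ready & (UINT32_C(1) << lvl)) == 0)
            lvl--;
        ci->ipending &= ~(UINT32_C(1) << lvl);
        ci->handled++;                   /* stand-in for running the handler */
        ready = ci->ipending & sim_unmasked(nlevel);
    }

    ci->intr_enabled = saved;            /* restore the previous flag state */
}
```

In a real kernel the window between the two steps is where an interrupt could
slip in and see a stale level; the sketch cannot reproduce that race, it only
shows the ordering the fix enforces.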
> I can try a current kernel.
This would be helpful.
> I also have a dual-CPU sparc10 with a similar workload (several mrtg
> processes, apc UPS on serial port, amanda client) which never shows this
> problem, so it may be an i386-specific issue.