Subject: Re: kern/25285: i386 MP panic: TLB IPI rendezvous failed (mask 1)
To: None <dokas@cs.umn.edu>
From: Greg Oster <oster@cs.usask.ca>
List: current-users
Date: 06/08/2004 09:57:51
Paul Dokas writes:
> On Sun, 6 Jun 2004 20:59:16 -0500, Paul Dokas <dokas@cs.umn.edu> wrote:
> > 
> > The quad Xeon that I put this patch onto panic'd sometime on Friday night.
> > I'll put the details here tomorrow after I've had a chance to visit the
> > computer (I don't have a serial console on it yet).
> 
> I've looked over the machine and here's what I found.
> 
> For CPU 0, the backtrace is exactly as in my previous emails. 

Hmm.. was this posted to a list? 

> CPU 2
> and CPU 4 are both idle (for some reason it numbers the CPUs 0, 2, 4, 6)
> 
> However, CPU 6 shows something interesting.  Here's the backtrace:
> 
>   acquire()
>   spinlock_acquire_count()
>   _kernel_lock_acquire_count()
>   mi_switch()
>   ltsleep()
>   rf_RaidIOThread()
> 
> I do have 4 136GB disks in a striped raidset.  My guess is that this is
> the cause of the problem for me and probably falls into the "well, don't
> do that" category.  Although, it would be really nice to be able to
> use SMP and RaidFrame at the same time.

You're supposed to be able to...  SMP and RAIDframe were happy on a 
2-CPU box, last I checked.  I'm not sure why 4 would be different 
w.r.t. RAIDframe.  (From what I can tell, CPU 6 is just trying to 
grab the "biglock".  If something else is holding it, it's not 
too surprising that it's just sitting there...)
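
To illustrate what "just sitting there" looks like, here is a minimal 
sketch in C of a CPU spinning on a lock that something else holds.  This 
is a toy, not the actual kernel lock code: toy_biglock, toy_acquire and 
toy_release are made-up names, and the spin bound is only there so the 
example terminates (a real spinlock would keep spinning).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock; NOT the NetBSD kernel lock implementation. */
struct toy_biglock {
    atomic_bool held;   /* true while some CPU owns the lock */
};

/*
 * Try to take the lock, spinning for at most max_spins attempts.
 * Returns true if the lock was acquired, false if it was still held
 * by someone else after max_spins attempts.  The bound is an
 * assumption made for illustration only.
 */
static bool toy_acquire(struct toy_biglock *l, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        /* atomic_exchange returns the previous value: false means
         * the lock was free and is now ours. */
        if (!atomic_exchange_explicit(&l->held, true, memory_order_acquire))
            return true;
    }
    return false;       /* the holder never let go: still "sitting there" */
}

/* Release the lock; the caller must be the current holder. */
static void toy_release(struct toy_biglock *l)
{
    assert(atomic_load_explicit(&l->held, memory_order_relaxed));
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

If the holder never calls toy_release(), every later toy_acquire() fails 
no matter how long it spins, so a backtrace stuck in the acquire path 
would more likely be a symptom than the cause.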

> In the meanwhile, I'm going to stop using RaidFrame.

Trying a different set of deck chairs? :-}  

Later...

Greg Oster