current-users: Re: panic: TLB IPI rendezvous failed (mask 4)

Subject: Re: panic: TLB IPI rendezvous failed (mask 4)
To: None <dokas@cs.umn.edu>
From: Havard Eidnes <he@netbsd.org>
List: current-users
Date: 06/02/2004 23:29:36

> I've got a Dell PowerEdge 6600 with 4 Xeon processors, running
> -current, that I'm using as a PostgreSQL server (see below for the
> dmesg) and I'm having problems similar to those reported in PR
> 25285.  Basically, under load, I'm getting panics that look like
> this (copied by hand):
>
>   panic:  TLB IPI rendezvous failed (mask 4)
>   Stopped in pid 4552.1 (postgres) at netbsd: cpu_Debugger+0x4 leave
>
> in the debugger, a backtrace looks like this:
>
>   cpu_Debugger()
>   panic()
>   pmap_tlb_shootdown()
>   pmap_do_remove()
>   pmap_remove()
>   ubc_alloc()
>   ffs_read()
>   VOP_READ()
>   vn_read()
>   dofileread()
>   sys_read()
>   syscall_plain()
>   --- syscall (number 3) ---

At least as interesting is what the other CPUs are up to when this
happens.  It may be that one of the other CPUs is running at an
elevated priority level at which IPIs are blocked (?), and/or that
there is a deadlock of some sort.  I think it would help narrow down
the cause if you could do the following from DDB for each CPU
(substitute 0, 1, 2, and 3 for <n>):

db> machine cpu <n>
db> trace
db> show reg

Gather this info (which is most easily done using a serial console, of
course) and post it here and/or append it to the problem report (or
submit a new one) and I'll try to get someone to look closer at it.

Regards,

- Håvard