Subject: Re: possible new "simple_lock: locking against myself" bug on dual-CPU AS4000
To: NetBSD port-alpha List <port-alpha@netbsd.org>
From: Chuck Silvers <chuq@chuq.com>
List: port-alpha
Date: 10/21/2005 18:43:53

On Mon, Oct 17, 2005 at 07:32:07PM -0400, Greg A. Woods wrote:
> db{1}> trace
> cpu_Debugger() at cpu_Debugger+0x4
> _simple_lock() at _simple_lock+0x140
> pmap_do_tlb_shootdown() at pmap_do_tlb_shootdown+0x90
> alpha_ipi_process() at alpha_ipi_process+0xc4
> interrupt() at interrupt+0x90
> XentInt() at XentInt+0x1c
> --- interrupt (from ipl 5) ---
> _simple_lock() at _simple_lock+0x358
> pmap_do_tlb_shootdown() at pmap_do_tlb_shootdown+0x90
> alpha_ipi_process() at alpha_ipi_process+0xc4
> interrupt() at interrupt+0x90
> XentInt() at XentInt+0x1c
> --- interrupt (from ipl 0) ---
> _lockmgr() at _lockmgr+0x1018
> _kernel_proc_lock() at _kernel_proc_lock+0x6c
> syscall_plain() at syscall_plain+0x38
> XentSys() at XentSys+0x5c
> --- syscall (198) ---
> --- user mode ---
> db{1}> 

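a second IPI is being delivered while the first is still being
processed:  the trace shows pmap_do_tlb_shootdown() interrupted inside
_simple_lock() and then entered again on the same CPU.  it seems
unlikely that this is a general bug in the alpha interrupt code, or
everyone would be seeing all kinds of crashes.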
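
to illustrate why a nested IPI ends in "locking against myself", here is
a rough user-space model in C.  it is only a sketch:  the names
(debug_lock, shootdown_handler, simulate) are made up for illustration
and this is not the actual NetBSD simple_lock or pmap code.  the nested
call stands in for a second IPI arriving on the same CPU while the first
handler still holds the lock.

```c
#include <assert.h>

#define NO_OWNER (-1)

/* Hypothetical debugging lock that remembers which CPU holds it. */
struct debug_lock {
	int locked;
	int owner_cpu;
};

/* Number of times a CPU tried to take a lock it already held. */
static int self_lock_hits;

/* Returns 0 on success, -1 if this CPU already owns the lock. */
static int
debug_lock_acquire(struct debug_lock *l, int cpu)
{
	if (l->locked && l->owner_cpu == cpu) {
		/* A real debugging kernel would drop into the debugger here. */
		self_lock_hits++;
		return -1;
	}
	l->locked = 1;
	l->owner_cpu = cpu;
	return 0;
}

static void
debug_lock_release(struct debug_lock *l)
{
	assert(l->locked);
	l->locked = 0;
	l->owner_cpu = NO_OWNER;
}

/*
 * Stand-in for the IPI handler.  nested_ipis > 0 models another IPI
 * being delivered on the same CPU while the lock is still held.
 */
static void
shootdown_handler(struct debug_lock *l, int cpu, int nested_ipis)
{
	if (debug_lock_acquire(l, cpu) != 0)
		return;
	if (nested_ipis > 0)
		shootdown_handler(l, cpu, nested_ipis - 1);
	debug_lock_release(l);
}

/* Runs the handler on CPU 1 and reports how many self-lock hits occurred. */
static int
simulate(int nested_ipis)
{
	struct debug_lock l = { 0, NO_OWNER };

	self_lock_hits = 0;
	shootdown_handler(&l, 1, nested_ipis);
	return self_lock_hits;
}
```

in this model simulate(0) returns 0 (no nesting, no complaint) and
simulate(1) returns 1 (the nested handler trips the ownership check),
which is the shape of the trace above.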

does this machine have the latest firmware installed?  buggy firmware
seems the most likely non-hardware cause of this kind of thing.

-Chuck