port-alpha: Re: AS1200 problems

Subject: Re: AS1200 problems
To: None <thorpej@wasabisystems.com>
From: David Hopper <dhop@nwlink.com>
List: port-alpha
Date: 02/01/2002 10:21:04

Jason R Thorpe wrote:
> 
> On Thu, Jan 31, 2002 at 03:42:14PM -0800, David Hopper wrote:
> 
>> I have been having frequent memory management faults on my AlphaServer 1200
>> 5/533.  
>> [...]
> 
> It's really weird that both $pc and $ra are 0.  It's almost like you
> have a trashed stack (restored bogus value into $ra, and then executed
> a ret, putting that bogus value into $pc).
> 
> Can you use gdb to tell me what $pv points to?
>         -- Jason R. Thorpe <thorpej@wasabisystems.com>
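
To check that I follow the trashed-stack theory, here is a toy model of it in
Python. This is only a sketch: the "stack" is a list, the registers are a
dict, the addresses are made up, and the function name is hypothetical. It
models nothing about the real Alpha calling convention beyond the
save-ra / restore-ra / ret pattern described above.

```python
# Toy model only: registers are a dict, the stack is a list.
# The addresses 0x1000 and 0x2000 are arbitrary illustration values.

def call_and_return(clobber_saved_ra: bool) -> int:
    """Simulate one call/return and report where pc ends up."""
    regs = {"pc": 0x1000, "ra": 0x2000}
    stack = []

    # Prologue: the callee saves its return address on the stack.
    stack.append(regs["ra"])

    # Body: a stray write may overwrite the saved slot with garbage (0 here).
    if clobber_saved_ra:
        stack[-1] = 0x0

    # Epilogue: restore ra from the stack, then "ret" copies ra into pc.
    regs["ra"] = stack.pop()
    regs["pc"] = regs["ra"]
    return regs["pc"]


if __name__ == "__main__":
    print(hex(call_and_return(clobber_saved_ra=False)))
    print(hex(call_and_return(clobber_saved_ra=True)))
```

With the saved slot intact, pc comes back as the original return address;
with the slot zeroed, both ra and pc end up 0, which is the pattern in the
trap output.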

I came in this morning to yet another fault, this time during a compile of
f_enum.c, with a full debug kernel + symbols:

CPU0    trap entry = 0x2 (memory management fault)
CPU0    a0         = 0x0
CPU0    a1         = 0x1
CPU0    a2         = 0xffffffffffffffff
CPU0    pc         = 0x0
CPU0    ra         = 0x0
CPU0    pv         = 0xfffffc0000392280
CPU0    curproc    = 0xfffffc001567ce78
CPU0           pid = 22372, comm = cpp0

Stopped in pid 22372 (cpp0) at cpu_Debugger+0x4:  ret  zero,(ra)

}trace
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0x168
trap() at trap+0x93c
XentMM() at XentMM+0x20
--- memory management fault (from ipl 6) ---
f30save() at	0
f30save() at	0
--- root of call graph ---

}examine
pv:	210022
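
In case it helps anyone compare dumps, below is a rough Python sketch that
pulls the hex values out of a trap dump like the one above and flags the
"pc and ra both zero" case. It is my own throwaway script, not anything from
the NetBSD tree; the function names are made up, and the regular expression
assumes the "CPUn  name = 0x..." layout shown in this message.

```python
import re

# The trap dump from this message, pasted verbatim.
DUMP = """\
CPU0    trap entry = 0x2 (memory management fault)
CPU0    a0         = 0x0
CPU0    a1         = 0x1
CPU0    a2         = 0xffffffffffffffff
CPU0    pc         = 0x0
CPU0    ra         = 0x0
CPU0    pv         = 0xfffffc0000392280
CPU0    curproc    = 0xfffffc001567ce78
CPU0           pid = 22372, comm = cpp0
"""

# Matches "CPU<n>  <name> = 0x<hex>"; the name may be one or two words.
# Lines without a hex value (such as the pid/comm line) are skipped.
LINE_RE = re.compile(r"^CPU\d+\s+(\w+(?: \w+)?)\s*=\s*(0x[0-9a-fA-F]+)")


def parse_trap_dump(text):
    """Return a dict mapping field names to integer values."""
    regs = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def pc_and_ra_are_zero(regs):
    """True when both pc and ra were reported as 0."""
    return regs.get("pc") == 0 and regs.get("ra") == 0


if __name__ == "__main__":
    regs = parse_trap_dump(DUMP)
    print("pv =", hex(regs["pv"]))
    print("pc and ra both zero:", pc_and_ra_are_zero(regs))
```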

I can say with some certainty that the faults only happen during disk
activity. Wild speculation here (IANAP), but I really think some interaction
betwixt isp, raidframe, and multiprocessor causes the fault.

Kernel config is a modified frau-farbissina, to match my PCI and
pseudo-devices.

One more thing:  I get stray KN300 IRQ 16 errors, but I think those are
harmless.

God, I hope it's not a hardware problem.  ;^)

Dave Hopper
dhop@nwlink.com