Subject: port-alpha/5546: port-alpha/lost a stack? exception_restore_regs bombs
To: None <>
From: None <>
List: netbsd-bugs
Date: 06/05/1998 12:12:25
>Number:         5546
>Category:       port-alpha
>Synopsis:       exception_restore_regs bombs
>Confidential:   yes
>Severity:       critical
>Priority:       high
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jun  5 12:20:01 1998
>Originator:     Matthew Jacob
	NASA Ames Research Center
>Release:        1.3E
System: NetBSD 1.3E NetBSD 1.3E (GENERIC) #2: Thu Jun 4 12:21:40 PDT 1998 alpha

While running a moderate disk exerciser on a 128MB Alpha 8200, the
system panicked. Here is the extended printout, including the PAL
logout area registers (the code that produces this printout isn't
checked in yet). Perhaps Ross, who knows PAL code better, could tell
us in more detail what is going on:

      Processor Machine Check (670), Code 0x100000096
	PAL temp[0-1]		= 0x0000000000000000 0x0000006164000000
	PAL temp[2-3]		= 0xfffffc00003004d4 0x0000000000008680
	PAL temp[4-5]		= 0xfffffe00003a375c 0x0000000000000006
	PAL temp[6-7]		= 0x0000000000000001 0xfffffc00003003e8
	PAL temp[8-9]		= 0x1f1e161514020100 0xfffffc0000300474
	PAL temp[10-11]		= 0xfffffc0000300354 0xfffffc0000300418
	PAL temp[12-13]		= 0xfffffc00003003b8 0x0000005555400000
	PAL temp[14-15]		= 0x0000000000000000 0x00000000040385d9
	PAL temp[16-17]		= 0x0000009806700801 0x0000000000000000
	PAL temp[18-19]		= 0x00000001fffff418 0xfffffe00073c59d8
	PAL temp[20-21]		= 0x0000000006778000 0xfffffc0000300444
	PAL temp[22-23]		= 0xfffffc000053c1d0 0x000000000673a000
	shadow[0-1]			= 0x0000000000000000 0x0000000000000000
	shadow[2-3]			= 0x0000000000000000 0x0000000000000000
	shadow[4-5]			= 0x0000000000000000 0x0000000000000000
	shadow[6-7]			= 0x0000000000000000 0x0000000000000000

        Excepting Instruction Addr     = 0xfffffc0000300354
        Summary of arithmetic traps    = 0x0000000000000000
        Exception mask                 = 0x0000000000000000
        Base address for PALcode       = 0x0000000000018000
        Interrupt Status Reg           = 0x0000000000000000
        Current setup of EV5 IBOX      = 0x0000006164000000
        I-CACHE Reg Data parity error  = 0x0000000000000800
        D-CACHE error Reg              = 0x0000000000000000
        Effective VA                   = 0xfffffe00003a3658
        Reason for D-stream            = 0x0000000000014350
        EV5 SCache address             = 0xffffff000001d28f
        EV5 SCache TAG/Data parity     = 0x0000000000000000
        EV5 BC_TAG_ADDR                = 0xffffff80010d6fff
        EV5 EI_ADDR Phys addr of Xfer  = 0xffffff000011d6df
        Fill Syndrome                  = 0x0000000000009000
        ei_stat reg                    = 0xfffffff004ffffff
        ld_lock                        = 0xffffff0004b363df

unexpected machine check:

    mces    = 0x1
    vector  = 0x670
    param   = 0xfffffc0000008b10
    pc      = 0xfffffc0000300354
    ra      = 0xfffffc00003002e0
    curproc = 0xfffffe00003a3600
        pid = 342, comm = diskex

panic: machine check
syncing disks... 1 1 1 done

The PC decodes as:

(gdb) x/i 0xfffffc0000300354
0xfffffc0000300354 <exception_restore_regs>:    ldq     v0,0(sp)
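
As a rough cross-check of the numbers above, the sketch below compares
a few of the reported addresses. It rests on an assumption this report
does not confirm: that the "Effective VA" field is the address touched
by the faulting ldq (that is, the value of sp at the time). The
constant and function names are made up for illustration; only the
values come from the printout.

```python
# Values copied verbatim from the panic printout in this report.
PC = 0xfffffc0000300354            # "pc" in the machine check summary
EXC_ADDR = 0xfffffc0000300354      # "Excepting Instruction Addr"
EFFECTIVE_VA = 0xfffffe00003a3658  # "Effective VA" (ASSUMED to be the load address)
CURPROC = 0xfffffe00003a3600       # "curproc"


def offset_from(base, addr):
    """Return addr - base as a signed byte offset."""
    return addr - base


# Does the reported pc agree with the PAL logout area's excepting address?
pc_matches = (PC == EXC_ADDR)

# How far is the effective VA from the start of the curproc structure?
va_offset = offset_from(CURPROC, EFFECTIVE_VA)

print("pc matches excepting instruction address:", pc_matches)
print("Effective VA - curproc = %#x (%d bytes)" % (va_offset, va_offset))
```

If the assumption holds, the load referenced an address only a few
dozen bytes past curproc, which may be relevant to the "lost a stack?"
question in the subject line. It is not a diagnosis.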

I'll retain the kernel and core dump in case anyone wants to look at them.