Subject: Re: SPARCengine Ultra AXi bogus power failure
To: None <port-sparc64@netbsd.org>
From: Francis Devereux <francis@devrx.org>
List: port-sparc64
Date: 02/27/2003 22:14:00
On Thu, Feb 27, 2003 at 10:37:27AM -0800, Eduardo Horvath wrote:
> On Thu, Feb 27, 2003 at 03:13:21PM +0000, Francis Devereux wrote:
> > I have a Sunray workstation based on the SPARCengine Ultra AXi board.  When I
> > boot NetBSD (1.6) I get the message "Power Failure Detected: Shutting down
> > NOW." and the machine powers off.  Here are the messages I get:
> > 
> > console is /pci@1f,0/pci@1,1/ebus@1/se@14,400000:a
> > Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002
> >     The NetBSD Foundation, Inc.  All rights reserved.
> > Copyright (c) 1982, 1986, 1989, 1991, 1993
> >     The Regents of the University of California.  All rights reserved.
> > 
> > NetBSD 1.6 (KRAKEN) #0: Thu Feb 27 00:35:16 GMT 2003
> >     francis@cepre.repton.int:/usr/src/sys/arch/sparc64/compile/KRAKEN
> > total memory = 512 MB
> > avail memory = 466 MB
> > using 3289 buffers containing 26312 KB of memory
> > bootpath: /pci@1f,0/pci@1,0/scsi@1,0/disk@1,0
> > mainbus0 (root): SUNW,UltraSPARC-IIi-Engine
> > cpu0 at mainbus0: SUNW,UltraSPARC-IIi @ 440.048 MHz, version 0 FPU
> > cpu0: physical 32K instruction (32 b/l), 16K data (32 b/l), 2048K external (64 b/l)
> > psycho0 at mainbus0 addr 0xfffc0000
> > SUNW,sabre: impl 0, version 0: ign 7c0 bus range 0 to 128; PCI bus 0
> > Power Failure Detected: Shutting down NOW.
> 
> I'd speculate that the firmware does interesting things to the bus
> controller that generates or fails to clear the power fail interrupt 
> during the reset sequence.  One thing to try is to clear the interrupt
> (store 0LL in the power fail interrupt clear register) just before
> installing the interrupt handler.  An interesting question that 
> should be answered is whether doing this will prevent the interrupt
> from being delivered later if there is a real power failure.  You
> could try that fix, then boot into single user mode and power off
> the machine and see if it manages to print anything.
> 
> Eduardo

I downloaded the -current source and made the following changes to psycho.c:
--- psycho.c.orig	Wed Dec 11 11:05:00 2002
+++ psycho.c	Thu Feb 27 21:54:37 2003
@@ -447,6 +447,8 @@
 		psycho_set_intr(sc, 15, psycho_bus_b,
 			&sc->sc_regs->pciberr_int_map, 
 			&sc->sc_regs->pciberr_clr_int);
+		/* clear the powerfail interrupt */
+		sc->sc_regs->power_clr_int = 0LL;
 		psycho_set_intr(sc, 15, psycho_powerfail,
 			&sc->sc_regs->power_int_map, 
 			&sc->sc_regs->power_clr_int);
@@ -749,8 +751,8 @@
 	/*
 	 * We lost power.  Try to shut down NOW.
 	 */
-	printf("Power Failure Detected: Shutting down NOW.\n");
-	cpu_reboot(RB_POWERDOWN|RB_HALT, NULL);
+	printf("Power Failure Detected: would shut down NOW without Francis' hack.\n");
+	/* cpu_reboot(RB_POWERDOWN|RB_HALT, NULL); */
 	return (1);
 }
 static 

Is that what you mean?

I also commented out the line in the interrupt handler that actually powers
down the machine.  I now get an endless stream of "Power Failure Detected"
messages, so it seems the interrupt is being triggered repeatedly:

elf64_exec: Booting /pci@1f,0/pci@1,1/network@1,1/netbsd
3766888@0x1000000+142368@0x1800000+4051936@0x1822c20
symbols @ 0xfee42380 112+305136+162712 start=0x1000000
chain: calling OF_chain(800000, e540, 1000000, fffa5a80, 18)
[ using 468856 bytes of netbsd ELF symbol table ]
consinit()
setting up stdin
chosen = f002d974, stdin @ 0x1819218
stdin instance = fff73f48
stdin node = f0069860
setting up stdout
stdout instance = fff99e50
stdout package = f0069860
buffer @ 0x1c09d90
console is /pci@1f,0/pci@1,1/ebus@1/se@14,400000:a
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
NetBSD 1.6O (KRAKEN) #2: Thu Feb 27 21:55:36 GMT 2003
francis@cepre.repton.int:/usr/src-current/sys/arch/sparc64/compile/KRAKEN
total memory = 512 MB
avail memory = 464 MB
using 3289 buffers containing 26312 KB of memory
bootpath: /pci@1f,0/pci@1,1/network@1,1
mainbus0 (root): SUNW,UltraSPARC-IIi-Engine
cpu0 at mainbus0: SUNW,UltraSPARC-IIi @ 440.049 MHz, version 0 FPU
cpu0: 32K instruction (32 b/l), 16K data (32 b/l), 2048K external (64 b/l)
psycho0 at mainbus0 addr 0xfffc0000
SUNW,sabre: impl 0, version 0: ign 7c0 bus range 0 to 128; PCI bus 0
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
Power Failure Detected: would shut down NOW without Francis' hack.
.....
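
With the handler printing on every interrupt, the console floods before the
rest of the boot can be seen.  A possible way around that while debugging
is to count the interrupts and report only the first few.  This is an
untested sketch and not code from psycho.c: the names powerfail_count,
POWERFAIL_MAX_REPORTS, powerfail_should_report and powerfail_intr_sketch
are made up for illustration, and it is written as plain userland C so it
can be compiled on its own.

```c
#include <stdio.h>

/* Hypothetical names; nothing below is taken from psycho.c. */
#define POWERFAIL_MAX_REPORTS 5

/* Number of times the (simulated) powerfail interrupt has fired. */
static unsigned long powerfail_count;

/*
 * Count one occurrence and say whether it should be printed.
 * Returns 1 for the first POWERFAIL_MAX_REPORTS calls, 0 afterwards.
 */
static int
powerfail_should_report(void)
{
	powerfail_count++;
	return (powerfail_count <= POWERFAIL_MAX_REPORTS);
}

/*
 * Stand-in for the body of the interrupt handler: it prints only the
 * first few occurrences and always claims the interrupt, as the
 * patched handler above does with "return (1)".
 */
static int
powerfail_intr_sketch(void)
{
	if (powerfail_should_report())
		printf("Power Failure Detected (occurrence %lu)\n",
		    powerfail_count);
	return (1);
}
```

Calling powerfail_intr_sketch() in a loop prints five lines and then stays
quiet, while powerfail_count keeps the total.  This only hides the symptom;
it does nothing about why the interrupt keeps firing.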

Francis