Subject: port-sparc64/37499: sparc64 GENERIC / INSTALL kernels occasional panic on boot
To: None <port-sparc64-maintainer@netbsd.org, gnats-admin@netbsd.org,>
From: None <rafal@netbsd.org>
List: netbsd-bugs
Date: 12/08/2007 02:40:00
>Number:         37499
>Category:       port-sparc64
>Synopsis:       sparc64 GENERIC / INSTALL kernels occasional panic on boot
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Dec 08 02:40:00 +0000 2007
>Originator:     Rafal Boni
>Release:        NetBSD 4.99.36 up to 4.99.40 (maybe earlier too?)
>Organization:
Wazzat?
>Environment:
System: NetBSD v120 4.99.40 NetBSD 4.99.40 (GENERIC) #0: Wed Dec  5 01:12:11 PST 2007  builds@wb25:/home/builds/ab/HEAD/sparc64/200712040002Z-obj/home/builds/ab/HEAD/src/sys/arch/sparc64/compile/GENERIC sparc64
Architecture: sparc64
Machine: sparc64
>Description:
	On bootup, I've seen occasional crashes in gem_intr during
	autoconfiguration over the last couple of weeks.  The most recent
	one happened with an INSTALL kernel on my v120.

	This is somewhat frightening, since AFAIR the kernel shouldn't be
	taking interrupts at this point...

	Here's a cut-and-paste of the tail of the console log before the
	crash, followed by the stack trace (annotated by hand from
	netbsd-INSTALL.symbols):

[...]
esiop1 at pci2 dev 8 function 1: Symbios Logic 53c896 (ultra2-wide scsi)
esiop1: using on-board RAM
esiop1: interrupting at ivec 1820
scsibus1 at esiop1: 16 targets, 8 luns per target
satalink0 at pci2 dev 5 function 0
satalink0: Adaptec AAR-1210SA serial ATA RAID controller (rev. 0x02)
satalink0: using ivec 15 for native-PCI interrupt
atabus2 at satalink0 channel 0
atabus3 at satalink0 channel 1
pcons at mainbus0 not configured
cpu0: data fault: pc=11379b0 addr=0
kernel trap 30: data access exception
Stopped in pid 0.1 (system) at  0x11379b0:      ldx             [%l0 + 0x10], %g2
db> bt
?(0, 0, e0017ed0, f, 1138ec4, 2) at 0x1138f48			(gem_intr)
?(15abc00, 1809fb8, 291df34, 0, 1549768, ff0000) at 0x100911c 	(sparc_intr_retry?)
?(1578000, 1818924, 0, 0, 0, 1579a30) at 0x137fc70		(cpu_configure)
?(1d76790, 2, 1d98400, 12888a8, 1577e28, 1da5800) at 0x1261a24	(configure)
?(0, 1234f90, 20, f00618b8, 1da6f50, ff00) at 0x123517c		(main)
?(f00618b8, fffc5cf8, 110000, 10eea0, fffc5df8, 0) at 0x100967c	(start)
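
	For reference, the hand annotation above amounts to a nearest-symbol
	lookup: for each return address, find the symbol with the highest
	address at or below it.  A minimal sketch in Python, assuming the
	symbols file uses nm-style "address type name" lines (an assumption
	about the file format).  The sample table and the names func_a and
	func_b are made up for illustration and are not taken from the real
	kernel:

```python
import bisect


def load_symbols(lines):
    """Parse nm-style lines ('<hex addr> <type> <name>') into a sorted list."""
    syms = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip undefined symbols and malformed lines
        addr, _type, name = parts
        try:
            syms.append((int(addr, 16), name))
        except ValueError:
            continue  # address field was not hexadecimal
    syms.sort()
    return syms


def resolve(syms, addr):
    """Return 'name+0xoff' for the nearest symbol at or below addr, else None."""
    addrs = [a for a, _ in syms]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # addr lies below the first known symbol
    base, name = syms[i]
    return f"{name}+{addr - base:#x}"


# Hypothetical symbol table, NOT real kernel addresses.
SAMPLE = [
    "0000000000001200 T func_b",
    "0000000000001000 T func_a",
    "                 U undefined_sym",
]

if __name__ == "__main__":
    table = load_symbols(SAMPLE)
    for pc in (0x1010, 0x1234):
        print(f"{pc:#x} -> {resolve(table, pc)}")
```

	A lookup like this only finds the closest preceding symbol; an
	address that falls in padding or in a static function missing from
	the table gets attributed to its neighbour, which may be why one of
	the annotations above carries a question mark.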

>How-To-Repeat:
	Random; it seems to happen more often when net-booting, on both my
	v120 and my Netra T1 AC200.  I've been rebooting both machines a lot
	while installing/testing/checking potential fixes for PR
	kern/25462, and now I'm rebooting even more while cleaning up the
	config and checking out SATA support on sparc64 ;)

>Fix:
	Unknown