Subject: resetting PROM
To: None <port-sparc@netbsd.org>
From: John Towler <jtowler@pconline.com>
List: port-sparc
Date: 12/12/2000 19:33:42
Recent experiments with running NetBSD-1.4.3 have led to the
following problem; my actual question is largely in the last three paragraphs.


The system crashed within 7 days, and then began to repeat the crash at
progressively decreasing intervals, with or without new kernels and libraries.



Reproduced in part below is a typical error message from the recurring kernel panic.

Trap type 0x4: 	pc= 0xf000ab94
		npc=0xf000ab98
		psr=4010c2<EF,S,PS>

kernel: fp disabled trap stopped in sh 
	at special_fp_store+0x18: std %f2, [%o0+0x8]
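
For anyone reading the psr value above, here is a small sketch of how it
breaks down, assuming the standard SPARC V8 PSR bit layout.  The names
decode_psr and PSR_FLAGS are only for illustration and do not come from
the kernel sources.

```python
# Sketch only: decode a 32-bit SPARC V8 PSR value into its fields.
# Bit positions assume the standard SPARC V8 PSR layout; the names
# PSR_FLAGS and decode_psr are hypothetical, not from any kernel source.
PSR_FLAGS = [
    (13, "EC"),  # coprocessor enabled
    (12, "EF"),  # floating point unit enabled
    (7, "S"),    # supervisor mode
    (6, "PS"),   # previous supervisor mode
    (5, "ET"),   # traps enabled
]


def decode_psr(psr):
    """Split a PSR value into its multi-bit fields and single-bit flags."""
    return {
        "impl": (psr >> 28) & 0xF,   # implementation id
        "ver": (psr >> 24) & 0xF,    # version id
        "icc": (psr >> 20) & 0xF,    # integer condition codes (N, Z, V, C)
        "pil": (psr >> 8) & 0xF,     # processor interrupt level
        "cwp": psr & 0x1F,           # current window pointer
        "flags": [name for bit, name in PSR_FLAGS if psr & (1 << bit)],
    }


if __name__ == "__main__":
    fields = decode_psr(0x4010C2)    # psr value from the panic message
    print(fields)
```

For psr=4010c2 the flag list comes out as EF, S, PS, which agrees with
the <EF,S,PS> printed by the kernel.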


This was encountered running NetBSD-1.4.3 (11-02/4-2000 snapshot)
whilst doing a make build of the 12-4 src tree in the background.

It also occurred while running configure scripts to build a new gcc, and
in other similar contexts.  Once it occurs, the error becomes cyclic, and
the period of its recurrence drops from a day or two down to the length
of the reboot sequence that follows the panic.

In the hopes of detecting its cause, I built a kernel to drop into the
builtin debugger on panic.  This was not as helpful as it might seem,
as I didn't know enough about the kernel and its state to do
anything with the options a debugger might provide.

Frustrated, I am for the moment back to SunOS 4.1.4 (sun4c).  While
reinstalling SunOS (4.1.4), I found that the NetBSD code had still infected
the boot loader, and the fp panic error recurred.  I reformatted the
disk, reset the PROM to factory defaults, and tried again.

With the usual time-consuming hassle of breaking NIS and installing bind
over with, the fp panic error occurred upon running X11R6 with fvwm2.  My
recollection is that hardware fp is enabled on SunOS (gcc's default
build works with no problems, it does not appear in the output of
gcc -v foo.c, and NetBSD's design decision in this seemed unusual for the
sparc).  Ghostview also now takes a year and a day (all right, a minute) to
start up, a new change and not for the better.  This did not happen when I
had tried NetBSD-1.{5,4.[23]} and reinstalled SunOS afterwards.

It seems that enabling the NetBSD kernel debugger caused some
change in the PROM which SunOS is still tripping over from time to
time, even though the conditions which trigger the change for NetBSD do
not apply.  Side-effect driven code.  My question is, what else do I
have to undo in the PROM so that the system does not break over
something the current OS does not view as an error?  My guess is that
someone with knowledge of the NetBSD kernel code would know what was
changed more quickly and accurately than I could by searching through the
kernel sources without knowing exactly what I am looking for.  Thanks in
advance.


		
		John Towler