Subject: 4/260 memory (?) problems
To: None <port-sparc@NetBSD.ORG>
From: Charles Lepple <clepple@foo.campus.vt.edu>
List: port-sparc
Date: 01/16/1997 15:07:19
First, a brief note of thanks to everyone who wrote in to reassure me that the 
ie0 errors on bootup were pretty much bogus. The machine, for the most part, 
is doing nicely, acting (sometimes) as a firewall/filtering router to prevent 
denial-of-service attacks when I run Win95 on my PC.

This plan has one hitch: every so often, I get weird "data fault" panics. 
These data faults have some correspondence to the load average, but the 
unexpected panics occurred with more or less interactive processes (shell, 
small compiles, editing), whereas when I tried to crash the system 
deliberately (to gather data) I had to start up 3 make processes on the 
directories in /usr/share/doc before it died.

Following is an excerpt from the panic:

daa fault: c=f8006bd8 addr=e029644 ser=8002<WRITE,SZERR>
paic: kernel flt
syncing disks.. done
Fram pointer is at 0xff554
Call traceback:
[snip]

As might be seen from the weird spellings, the 4/260 on-board zs chip seems to 
drop characters. I suspect, however, that it has something to do with the 
driver, since it seemed to work much better under SunOS. Is there an option 
similar to si_options that I can set for more conservative serial I/O?

Any ideas on either of the above problems? Is there a straightforward mapping 
between the logical address above and the physical address? The memory is 
spread across four 8MB boards, but because the problem is not easily 
reproducible (it happens often, but not predictably), it would be a pain to 
try to isolate it manually.
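
To illustrate what I am after: if I had the physical address, picking the 
suspect board would be simple arithmetic. The following is only a sketch, 
assuming (and I do not know that this holds on the 4/260) that the four 8MB 
boards are mapped contiguously from physical address 0 in slot order. 
board_for_paddr is a hypothetical helper of my own, not anything from the 
NetBSD tree.

```c
#include <stdint.h>

/* Assumption: each board is 8MB and boards are mapped back to back,
 * starting at physical address 0. The real layout may differ. */
#define BOARD_SIZE  (UINT32_C(8) << 20)   /* 8MB per board */
#define NUM_BOARDS  4

/* Hypothetical helper: return the index (0..3) of the board that would
 * hold the given physical address, or -1 if the address lies beyond the
 * 32MB covered by the four boards. */
int board_for_paddr(uint32_t paddr)
{
    uint32_t board = paddr / BOARD_SIZE;

    return board < NUM_BOARDS ? (int)board : -1;
}
```

The missing piece is still the translation from the address in the panic 
message to a physical address, which is the part I am asking about.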

For what it's worth, the boards pass the memory check part of the diagnostics, 
even after the machine has been running continuously for a while.

Also, does anyone have info on setting the diagnostic LEDs on the back of the 
motherboard? It's hard to tell whether the machine is running or not without 
keeping a serial line tied up on my PC.