Subject: SS4 Instruction Cache Problem
To: None <port-sparc@netbsd.org>
From: Adam Lebsack <adam@lebsack.com>
List: port-sparc
Date: 09/16/2002 14:44:26
Hey guys...

I have a SPARCstation 4 (110 MHz, MB86904 processor) here that becomes 
unstable under a decent load (the daily cron scripts, for instance).  
It locks up hard, with no kernel messages, and it doesn't drop into the 
OpenBoot PROM.  A power cycle is needed.  Here's what I get from a 
Power On Self Test:

...<clipped>
D-Cache RAM NTA Test
D-Cache TAG NTA Test
I-Cache RAM NTA Test
         ERROR  : Address= 00000000, exp= 55555555, obs= 55d55555, xor= 00800000
initializing TLB
initializing cache

Allocating SRMMU Context Table
Setting SRMMU Context Register
Setting SRMMU Context Table Pointer Register
Allocating SRMMU Level 1 Table
Mapping RAM
Mapping ROM
...
If I'm not mistaken, that means part of the instruction cache RAM is 
bad: at address 00000000, bit 23 read back as 1 where 0 was written 
(xor= 00800000).  I suspect this is the root of the problem, since my 
other SPARCs run identical builds of NetBSD without trouble.  I've 
looked at sys/arch/sparc/sparc/cache.c and fiddled with it a bit, 
trying to disable the instruction cache in the kernel, with no luck.  
I'd rather have no instruction cache than an unstable system.
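
To show how I'm reading that ERROR line: the xor column is just exp XOR 
obs, so the failing bit can be read off mechanically.  A minimal sketch 
in C (the function names are my own, not anything from the NetBSD tree):

```c
#include <assert.h>
#include <stdint.h>

/* Mask of the bits that differ between the expected and observed words. */
uint32_t diff_mask(uint32_t expected, uint32_t observed)
{
	return expected ^ observed;
}

/* Index (0 = least significant) of the lowest set bit, or -1 if none. */
int lowest_set_bit(uint32_t mask)
{
	for (int bit = 0; bit < 32; bit++)
		if (mask & (UINT32_C(1) << bit))
			return bit;
	return -1;
}

/* Check the values from the POST line quoted above. */
void check_post_line(void)
{
	uint32_t mask = diff_mask(0x55555555u, 0x55d55555u);

	assert(mask == 0x00800000u);		/* matches the xor= column */
	assert(lowest_set_bit(mask) == 23);	/* a single bad bit: bit 23 */
	assert((0x55d55555u & mask) != 0);	/* it read 1 where 0 was written */
}
```

This only says which bit failed for this one pattern at this one 
address; it doesn't tell me how many other locations are affected.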

Here's an example of what I'm trying to do:
swift_cache_enable() {
...
-	pcr |= (SWIFT_PCR_ICE | SWIFT_PCR_DCE);
+	pcr |= (SWIFT_PCR_DCE);
+	pcr &= (~SWIFT_PCR_ICE);
...
}

That didn't help.  In fact, explicitly clearing the ICE bit freezes the 
system.  Does anyone know whether it is possible to disable this cache 
at all?  Even better, could the kernel run cache integrity checks and 
disable only the cache lines that are bad?  I'd be happy to work with 
anyone who has any information on this.
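
To make the integrity-check idea concrete, here is a rough sketch of 
the kind of pattern test I have in mind, in plain C.  Everything in it 
is an assumption on my part: the accessor callbacks are hypothetical 
stand-ins for whatever diagnostic access to the I-cache RAM the chip 
provides, the 32-byte line size is assumed, and the "fake" RAM just 
simulates the fault POST reported so the logic can be tried in userland.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_WORDS 8	/* assumption: 32-byte lines, 4-byte words */

/* Hypothetical accessors; a kernel version would use diagnostic loads/stores. */
typedef void (*word_write_fn)(void *ctx, size_t idx, uint32_t val);
typedef uint32_t (*word_read_fn)(void *ctx, size_t idx);

/*
 * Write each pattern to every word of every line and read it back.
 * Sets bad_line[n] to 1 for each line with a mismatch, 0 otherwise,
 * and returns the number of bad lines.
 */
size_t cache_pattern_test(void *ctx, word_write_fn wr, word_read_fn rd,
    size_t nlines, uint8_t *bad_line)
{
	static const uint32_t patterns[] = { 0x55555555u, 0xaaaaaaaau };
	size_t nbad = 0;

	assert(wr != NULL && rd != NULL && bad_line != NULL);

	for (size_t line = 0; line < nlines; line++) {
		bad_line[line] = 0;
		for (size_t p = 0; p < 2 && !bad_line[line]; p++) {
			for (size_t w = 0; w < LINE_WORDS; w++) {
				size_t idx = line * LINE_WORDS + w;

				wr(ctx, idx, patterns[p]);
				if (rd(ctx, idx) != patterns[p]) {
					bad_line[line] = 1;
					break;
				}
			}
		}
		nbad += bad_line[line];
	}
	return nbad;
}

/* Simulated 4-line RAM in which bit 23 of word 0 always reads back as 1. */
struct fake_ram {
	uint32_t words[4 * LINE_WORDS];
};

void fake_write(void *ctx, size_t idx, uint32_t val)
{
	((struct fake_ram *)ctx)->words[idx] = val;
}

uint32_t fake_read(void *ctx, size_t idx)
{
	uint32_t v = ((struct fake_ram *)ctx)->words[idx];

	return idx == 0 ? (v | 0x00800000u) : v;
}
```

Run against the simulated RAM with nlines = 4, this flags line 0 only.  
Whether the hardware offers any way to lock out individual lines once 
they are found is exactly what I don't know.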

Thanks,
Adam Lebsack
adam@lebsack.com