Port-sparc64 archive

Strange NVRAM corruption on Sun Blade 2500



Hey folks,

I have a Silver Sun Blade 2500 which shows a strange problem:

I always power it off by running "shutdown -r" and then using power-off
at the OF prompt.

If I do shutdown -p instead, the NVRAM gets "corrupted" somehow. On the next
power-up the machine fails like this:


@(#)OBP 4.30.4.a 2010/01/06 14:47 Sun Blade 2500 (Silver)
Clearing TLBs 
Loading Configuration

Membase: 0000.0013.0000.0000
MemSize: 0000.0000.4000.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB Done
MMUs ON
Scrubbing Tomatillo tags... 0 1
Block Scrubbing Done
Find dropin, Copying Done, Size 0000.0000.0000.8330
PC = 0000.07ff.f000.7248
PC = 0000.0000.0000.72f8
Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.8ad0
Diagnostic console initialized
System Reset: CPU Reset 
Probing system devices
jbus at 0,0 SUNW,UltraSPARC-IIIi (1600 MHz @ 10:1, 1 MB) memory-controller 
jbus at 1,0 SUNW,UltraSPARC-IIIi (1600 MHz @ 10:1, 1 MB) memory-controller 
jbus at 1c,0 pci ppm 
jbus at 1d,0 pci 
jbus at 1e,0 pci ppm 
jbus at 1f,0 pci i2c 
/i2c@1f,464000: nvram idprom 
Loading Support Packages: kbd-translator obp-tftp SUNW,i2c-ram-device 
SUNW,fru-device SUNW,asr 
Loading onboard drivers: 
/pci@1e,600000: Device 7 isa 
/pci@1e,600000/isa@7: flashprom rtc i2c power serial serial dma 
/pci@1e,600000/isa@7/i2c@0,320: i2c-bridge gpio hardware-monitor 
hardware-monitor hardware-monitor gpio gpio audio-card-fru-prom 
motherboard-fru-prom scsi-backplane-fru-prom dimm-spd dimm-spd dimm-spd 
dimm-spd dimm-spd dimm-spd dimm-spd dimm-spd clock-generator 
/pci@1e,600000/isa@7/dma@0,0: parallel 

ERROR: Last Trap: Fast Data Access MMU Miss

debug: 

The "debug:" prompt is just a restricted version of the "ok" prompt, with
booting disabled. A "set defaults" does not fix it, but after some playing
around with NVRAM values, setting things back and forth (details of which I
unfortunately did not log), the machine worked again, and then I restored
all settings to normal.

This machine does not have the typical combined clock/NVRAM chip, but
uses a separate Atmel 24C64B serial EEPROM (mounted in a "holder" to
allow easy replacement and hidden below a Sun host ID label, but stock
PL27 package inside).

The first time this happened I actually thought the chip was dead and got an
(empty) replacement, which immediately fixed the issue.

So my theory is that somehow the kernel call into OF to do the poweroff
messes up some parts of the EEPROM content, which "set defaults" does not
cover. Or something. Maybe I should try to get access to that chip from
some other device and dump the full content, with working and non-working
versions, and compare.
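
For what it's worth, the comparison step could look roughly like the
sketch below. It assumes two raw binary dumps of the chip already exist
(how they get read out is not covered here), and that each is 8192 bytes,
which is what a 64 Kbit 24C64 holds. The file names and function names are
made up for illustration.

```python
import sys
from typing import List, Tuple

EEPROM_SIZE = 8192  # 24C64: 64 Kbit = 8192 bytes (assumed full-chip dump)


def diff_ranges(good: bytes, bad: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) offset pairs, end exclusive, where the dumps differ."""
    if len(good) != len(bad):
        raise ValueError("dumps differ in length")
    ranges = []
    start = None
    for offset, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            if start is None:
                start = offset  # a new differing run begins here
        elif start is not None:
            ranges.append((start, offset))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(good)))  # run extends to the end of the dump
    return ranges


def format_diff(good: bytes, bad: bytes) -> str:
    """One line per differing run: offsets, then good bytes -> bad bytes in hex."""
    lines = []
    for start, end in diff_ranges(good, bad):
        lines.append(
            f"0x{start:04x}-0x{end - 1:04x}: "
            f"{good[start:end].hex()} -> {bad[start:end].hex()}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical usage: python eediff.py working.bin broken.bin
    with open(sys.argv[1], "rb") as f:
        good = f.read()
    with open(sys.argv[2], "rb") as f:
        bad = f.read()
    if len(good) != EEPROM_SIZE:
        print(f"warning: expected {EEPROM_SIZE} bytes, got {len(good)}")
    print(format_diff(good, bad) or "dumps are identical")
```

Grouping the differences into runs rather than listing single bytes should
make it easier to see whether one contiguous region gets clobbered or
whether the damage is scattered.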

However, since we call OF to do the power-off, why does it work from the
ok prompt, but not from within the NetBSD kernel? I don't feel happy trying
this a lot ;-)

Does anyone see something similar?

Any ideas?


Martin

