Subject: re: ram causes panic on sun4m
To: None <port-sparc@netbsd.org>
From: John Bergman <inc@inc.net>
List: port-sparc
Date: 04/22/1999 00:23:08
	Hello. Sorry if this is a dumb or well-known issue. I would greatly
appreciate any input or education. Thanks.
	I have an SS10/41 with 160mb of ram that seems to be having
problems with NetBSD. The system has two older 16mb dimms, and two newer
64mb dimms. At first, with 160mb of ram, I was unable to boot 1.3.3 or
1.3.2 (see below for screen dump), but could boot 1.4 Alpha (all generic
kernels). 1.4 Alpha would run for a while (sometimes longer, sometimes
shorter) and would eventually panic, usually
while trying to compile or untar something (see below for traceback).
	I then removed the two 64mb dimms and succeeded in booting 1.3.3
(generic) with no problems (on 32mb of ram). The system seemed stable. I
rebuilt the kernel and so forth. Rebooted to the new kernel. Fine. Left it
overnight, and found it broken today in pv_unlink, as opposed to the
kernel fault I'd seen many times in the last week (see end of message for
panic and traceback).
	Today I tried swapping in the 64mb dimms in place of the 16mb
dimms. Booted 1.3.3 like a charm. Then panicked a little later, while
compiling (see end of msg). Note, this was no longer with the generic
kernel, but with a new kernel stripped of everything except sun4m support
and the hardware I actually have.
	From testing today it seems that, in any given ram configuration,
either the generic 1.3.3 kernel or the stripped 1.3.3 kernel I built will
boot, but never both. 1.4 Alpha always boots (I would guess that it, as
well as the other kernels, remains unstable over time, but I haven't had
time to test yet). I cannot see any pattern in which 1.3.3 kernel will
boot with which ram configuration (it changes), but changing the
configuration is the only thing I've found that causes anything new to
happen. Apart from last night, when both kernels booted with 32mb, the two
have never worked on the same configuration. 1.3.2 (generic) works
whenever 1.3.3 (generic) does. If anyone wants the kernel config I used,
let me know.

	Under both the 160mb and 32mb ram configurations I have swapped in
a harddisk with solaris 7, then booted, run, and compiled things with no
problems.

	Here is the system attempting to boot 1.3.3 (generic) with 160mb of 
ram:

Boot device: /iommu/sbus/espdma@f,400000/esp@f,800000/sd@0,0   File and args: 
>> NetBSD/sparc Secondary Boot, Revision 1.7
>> (pk@flambard, Sun Dec 13 14:22:26 MET 1998)
Booting netbsd @ 0x4000
1392640+117080+74624+[78312+91722]=0x1b0512
console is ttya
Copyright (c) 1996, 1997, 1998
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

data fault: pc=0xf000a1d8 addr=0xf01e6ff8 sfsr=3a7<FAV,OW>
panic: kernel fault
halted

-------------------
Here is the system with 160mb of ram, running 1.4 Alpha, while in the midst of
building a kernel:

data fault: pc=0xf000a33c addr=0xf0306ff8 sfsr=3a6<PERR=0,LVL=3,AT=5,FT=1,FAV,OW>
panic: kernel fault
syncing disks... 16 16 12 4 done
Frame pointer is at 0xf7c79a28
Call traceback:
  pc = 0xf01ad78c  args = (0x419000e6, 0x41900fe6, 0x0, 0x0, 0xf7c79b40, 0x419000e6, 0xf7c79a90) fp = 0xf7c79a90
  pc = 0xf005df18  args = (0x100, 0x0, 0x0, 0x0, 0xf7c79bb4, 0x419000e7, 0xf7c79af8) fp = 0xf7c79af8
  pc = 0xf01bd868  args = (0xf01bcf68, 0x100, 0xf0306ff8, 0xf7c79bc0, 0x1e, 0xf0216800, 0xf7c79b60) fp = 0xf7c79b60
  pc = 0xf0008518  args = (0x0, 0x3a6, 0xf0306ff8, 0xf7c79c68, 0xefffeea8, 0x4e448, 0xf7c79c08) fp = 0xf7c79c08
  pc = 0xf01bb13c  args = (0xf0306000, 0xff8, 0xf02c6018, 0xf01e4000, 0x2b219e, 0x20, 0xf7c79cb8) fp = 0xf7c79cb8
  pc = 0xf014c5fc  args = (0x2b21000, 0x210, 0xf021bc00, 0xf01bb094, 0xf0225000, 0x0, 0xf7c79d20) fp = 0xf7c79d20
  pc = 0xf0146378  args = (0xf04cb50c, 0xe4a, 0xf0216988, 0x0, 0x0, 0x61000, 0xf7c79d88) fp = 0xf7c79d88
  pc = 0xf01bd778  args = (0xf04cb50c, 0x4, 0x7, 0xffffffff, 0xf7b03a28, 0xf0216800, 0xf7c79ea8) fp = 0xf7c79ea8
  pc = 0xf0008518  args = (0xf7afa840, 0x386, 0x61000, 0xf7c79fb0, 0x0, 0x0, 0xf7c79f50) fp = 0xf7c79f50
  pc = 0xd508  args = (0x61000, 0x5d90c, 0x0, 0x56000, 0xf01fc800, 0x0, 0xefffef78) fp = 0xefffef78
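
A side note on the two data-fault addresses above (0xf01e6ff8 and
0xf0306ff8): they may share a page offset. The following is only a minimal
Python sketch, assuming kernel pages are 4096 bytes (the size the iommu0
probe line reports for the IOMMU; that the kernel uses the same size is an
assumption). The helper name split_address is made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: kernel page size equals the iommu0 "page-size 4096"

def split_address(addr, page_size=PAGE_SIZE):
    """Split a virtual address into (page base, offset within the page)."""
    offset = addr & (page_size - 1)
    return addr - offset, offset

# Fault addresses copied from the two "data fault" lines above.
for fault in (0xf01e6ff8, 0xf0306ff8):
    base, off = split_address(fault)
    print(f"{fault:#010x} -> page {base:#010x} + {off:#x}")
```

Under that assumption both faults land at offset 0xff8, the last 8 bytes
of their page, which matches the (0xf0306000, 0xff8, ...) arguments in the
1.4 Alpha traceback.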

-------------------
Here is the system booting 1.3.3 with 32mb.


NetBSD 1.3.3 (GENERIC) #0: Mon Dec 14 08:11:28 MET 1998
    pk@flambard:/usr/src/sys/arch/sparc/compile/GENERIC
real mem = 32894976
avail mem = 28532736
using 401 buffers containing 1642496 bytes of memory
bootpath: /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@0,0
mainbus0 (root): SUNW,SPARCstation-10
cpu0 at mainbus0: mid 8: TMS390Z50 v1 @ 40.300 MHz, on-chip FPU
cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
obio0 at mainbus0
clock0 at obio0 addr 0xf1200000: mk48t08 (eeprom)
timer0 at obio0 addr 0xf1300000 delay constant 18
zs0 at obio0 addr 0xf1100000 pri 12, softpri 6
zstty0 at zs0 channel 0 (console)
zstty1 at zs0 channel 1
zs1 at obio0 addr 0xf1000000 pri 12, softpri 6
kbd0 at zs1 channel 0
ms0 at zs1 channel 1
fdc0 at obio0 addr 0xf1700000 pri 11, softpri 4: chip 82077
auxreg0 at obio0 addr 0xf1800000
power0 at obio0 addr 0xf1a01000
iommu0 at mainbus0 ioaddr 0xe0000000: version 0x1/0x0, page-size 4096, range 64MB
sbus0 at iommu0: clock = 20 MHz
dma0 at sbus0 slot 15 offset 0x400000: rev 2
esp0 at dma0 slot 0xf offset 0x800000 pri 4: ESP200, 40MHz, SCSI ID 7
scsibus0 at esp0: 8 targets
probe(esp0:0:0): max sync rate 10.00Mb/s
sd0 at scsibus0 targ 0 lun 0: <FUJITSU, M2934S-512, 0140> SCSI2 0/direct fixed
sd0: 4153MB, 3421 cyl, 18 head, 138 sec, 512 bytes/sect x 8506782 sectors
ledma0 at sbus0 slot 15 offset 0x400010: rev 2
le0 at ledma0 slot 0xf offset 0xc00000 pri 6: address 08:00:20:12:cd:60
le0: 8 receive buffers, 2 transmit buffers
SUNW,bpp at sbus0 slot 15 offset 0x4800000 not configured
SUNW,DBRIe at sbus0 slot 15 offset 0x8010000 not configured
root on sd0a dumps on sd0b
root file system type: ffs

------
Here is the system, when I found it today, after leaving it
overnight, doing nothing (1.3.3 stripped, 32mb):

panic: pv_unlink
syncing disks... 9 9 6 done
Frame pointer is at 0xf2ae16c8
Call traceback:
  pc = 0xf00e94f0  args = (0x0, 0x41900fe1, 0xf0109800, 0xf2ae17e8, 0xf0002000, 0x3, 0xf2ae1730) fp = 0xf2ae1730
  pc = 0xf002638c  args = (0x100, 0x0, 0x1, 0xf07ecc80, 0xf0288790, 0xf2ae1c90, 0xf2ae1798) fp = 0xf2ae1798
  pc = 0xf00ebf58  args = (0x100, 0xf0286790, 0xf010b600, 0xf00ebc00, 0x0, 0xf2ae1c9c, 0xf2ae1800) fp = 0xf2ae1800
  pc = 0xf00ef23c  args = (0xf0286790, 0xf010b668, 0xf0176000, 0xf0181000, 0x10579000, 0xc, 0xf2ae1868) fp = 0xf2ae1868
  pc = 0xf00ef0bc  args = (0xf010b668, 0xf0176000, 0x7, 0x0, 0xf0188fb0, 0x91b9e, 0xf2ae18d0) fp = 0xf2ae18d0
  pc = 0xf00bd1dc  args = (0xf010b668, 0xf0176000, 0x5, 0x7, 0x0, 0xf2ae19cc, 0xf2ae1938) fp = 0xf2ae1938
  pc = 0xf00f17c8  args = (0x0, 0xffdf, 0x1, 0xf011ba5c, 0x0, 0x3, 0xf2ae19e0) fp = 0xf2ae19e0
  pc = 0xf0006238  args = (0x9, 0x3a6, 0xf0176ff8, 0xf0176000, 0xf0007a54, 0xf2ae1ae8, 0xf2ae1a88) fp = 0xf2ae1a88
  pc = 0xf00f0028  args = (0xf0176000, 0xff8, 0x106999e, 0xf013ffa0, 0xf0002000, 0x3, 0xf2ae1b38) fp = 0xf2ae1b38
  pc = 0xf00c4cfc  args = (0x10699000, 0x250, 0xf011bad4, 0xf07ecc80, 0xf0288790, 0xf2ae1c90, 0xf2ae1ba0) fp = 0xf2ae1ba0
  pc = 0xf00bc7c8  args = (0xf02e3cc0, 0xf07ecc80, 0xf07ecc80, 0x1000, 0xf2ae1ca0, 0xf2ae1c9c, 0xf2ae1c08) fp = 0xf2ae1c08
  pc = 0xf00bd38c  args = (0x0, 0xffdf, 0x1, 0xf011ba5c, 0x1, 0x7, 0xf2ae1cb0) fp = 0xf2ae1cb0
  pc = 0xf00bfc20  args = (0x0, 0xf2ae8000, 0xf2aea000, 0xf07c2a00, 0x2000, 0x1, 0xf2ae1d18) fp = 0xf2ae1d18
  pc = 0xf00bd7f0  args = (0xf0290000, 0xf2ae8000, 0xf2aea000, 0x0, 0x0, 0x0, 0xf2ae1d88) fp = 0xf2ae1d88
  pc = 0xf0019214  args = (0xf07c2a00, 0xf07ef400, 0xffffffff, 0xf010b400, 0x1, 0x0, 0xf2ae1df0) fp = 0xf2ae1df0
  pc = 0xf0018c84  args = (0xf07c2a00, 0x0, 0xf2ae1f20, 0x7, 0x0, 0xf2ae1e94, 0xf2ae1e58) fp = 0xf2ae1e58
  pc = 0xf00f1c60  args = (0xf07c2a00, 0xf2ae1f28, 0xf2ae1f20, 0xf0018c78, 0x0, 0x3, 0xf2ae1ec0) fp = 0xf2ae1ec0
  pc = 0xf00064d0  args = (0x2, 0xf2ae1fb0, 0x0, 0xae, 0x5e08, 0xf2ae1fb0, 0xf2ae1f50) fp = 0xf2ae1f50
  pc = 0xb98c  args = (0x1, 0x0, 0x5dc00, 0x3, 0x1, 0xf9e8e, 0xeffff860) fp = 0xeffff860
rebooting

---------
Here is the system dying after I replaced the 16mb dimms with the 64mb
dimms (1.3.3 stripped, 128mb):

14 14 10 4 done
Frame pointer is at 0xf798b8c0
Call traceback:
  pc = 0xf00e94f0  args = (0x0, 0x41900fe4, 0xf0109800, 0xf798b9e0, 0x0, 0x1, 0xf798b928) fp = 0xf798b928
  pc = 0xf002638c  args = (0x100, 0x0, 0x1, 0xf09adf80, 0x128bc, 0xf798be88, 0xf798b990) fp = 0xf798b990
  pc = 0xf00ebf58  args = (0x100, 0xf01bf900, 0xf010b600, 0xf00ebc00, 0x0, 0xf798be94, 0xf798b9f8) fp = 0xf798b9f8
  pc = 0xf00ef23c  args = (0xf01bf900, 0xf010b668, 0xf0176000, 0xf0181000, 0x3e90000, 0xc, 0xf798ba60) fp = 0xf798ba60
  pc = 0xf00ef0bc  args = (0xf010b668, 0xf0176000, 0x7, 0x0, 0xf01b95f0, 0x397f9e, 0xf798bac8) fp = 0xf798bac8
  pc = 0xf00bd1dc  args = (0xf010b668, 0xf0176000, 0x11, 0x7, 0x0, 0xf798bbc4, 0xf798bb30) fp = 0xf798bb30
  pc = 0xf00f17c8  args = (0x0, 0xffdf, 0x1, 0xf011ba5c, 0x0, 0x3, 0xf798bbd8) fp = 0xf798bbd8
  pc = 0xf0006238  args = (0x9, 0x3a6, 0xf0176ff8, 0xf0176000, 0xf0007a54, 0xf798bce0, 0xf798bc80) fp = 0xf798bc80
  pc = 0xf00f0028  args = (0xf0176000, 0xff8, 0x3fb09e, 0xf011ba5c, 0x0, 0x1, 0xf798bd30) fp = 0xf798bd30
  pc = 0xf00c4cfc  args = (0x3fb0000, 0x250, 0xf011bad4, 0xf09adf80, 0x128bc, 0xf798be88, 0xf798bd98) fp = 0xf798bd98
  pc = 0xf00bc7c8  args = (0xf03a7ba8, 0xf09adf80, 0xf09adf80, 0x7e000, 0xf798be98, 0xf798be94, 0xf798be00) fp = 0xf798be00
  pc = 0xf00f17f4  args = (0x0, 0xffdf, 0x1, 0xf011ba5c, 0x0, 0x3, 0xf798bea8) fp = 0xf798bea8
  pc = 0xf0006238  args = (0xf0948200, 0x386, 0xefffee84, 0xefffe000, 0x1001bc80, 0xf798bfb0, 0xf798bf50) fp = 0xf798bf50
  pc = 0x1001bc6c  args = (0x0, 0x14, 0x6, 0xf0990b48, 0xf0002000, 0x0, 0xefffee20) fp = 0xefffee20
rebooting
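
For anyone comparing the last two tracebacks: a rough Python sketch that
pulls out the pc column of each and counts how many leading (innermost)
frames match. It assumes both dumps came from the same stripped 1.3.3
kernel binary, so that equal pc values mean the same code; the function
names are made up for illustration, and the pc values are copied from the
dumps above with the args omitted.

```python
import re

PC_RE = re.compile(r"pc = (0x[0-9a-f]+)")

def extract_pcs(trace_text):
    """Return the pc values of a 'Call traceback' dump, in printed order."""
    return [int(m.group(1), 16) for m in PC_RE.finditer(trace_text)]

def common_prefix(a, b):
    """Longest run of identical leading frames shared by two traces."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared

# pc column of the 32mb pv_unlink traceback
TRACE_32MB = """
pc = 0xf00e94f0  pc = 0xf002638c  pc = 0xf00ebf58  pc = 0xf00ef23c
pc = 0xf00ef0bc  pc = 0xf00bd1dc  pc = 0xf00f17c8  pc = 0xf0006238
pc = 0xf00f0028  pc = 0xf00c4cfc  pc = 0xf00bc7c8  pc = 0xf00bd38c
pc = 0xf00bfc20  pc = 0xf00bd7f0  pc = 0xf0019214  pc = 0xf0018c84
pc = 0xf00f1c60  pc = 0xf00064d0  pc = 0xb98c
"""

# pc column of the 128mb traceback
TRACE_128MB = """
pc = 0xf00e94f0  pc = 0xf002638c  pc = 0xf00ebf58  pc = 0xf00ef23c
pc = 0xf00ef0bc  pc = 0xf00bd1dc  pc = 0xf00f17c8  pc = 0xf0006238
pc = 0xf00f0028  pc = 0xf00c4cfc  pc = 0xf00bc7c8  pc = 0xf00f17f4
pc = 0xf0006238  pc = 0x1001bc6c
"""

shared = common_prefix(extract_pcs(TRACE_32MB), extract_pcs(TRACE_128MB))
print(f"{len(shared)} shared leading frames")
```

The first 11 frames are identical in the two dumps, so the 128mb panic
(whose panic line did not make it into the capture) may well have gone
through the same code path as the pv_unlink panic. Feeding the shared pc
values to a symbol table of the stripped kernel would be needed to confirm
that.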

------------
I'd be happy to provide any other debugging information or assistance, if
needed. Thanks much.

							john