Port-sparc archive


SparcStation 20 system freezes



Hi,

I'm new to the mailing list, so please forgive me if I don't follow the netiquette.

I have a SPARCstation 20 that I ran NetBSD 4.0.1 on for quite a few years. I used it as a headless server until I retired it when I got a keyboard, mouse and a cg6 frame buffer.
Recently I decided to try out NetBSD 7.1 on it, but even during the install process I'd get what at first seemed like random system hangs. After resolving some hardware issues I managed to get the OS installed, but it would still hang after a period of time. I ran the system diagnostics and did some testing and the base hardware appears to be working.

A little reading suggested that perhaps the HDD and the esp driver didn't get along (anymore), so I swapped the drive for an IBM Ultrastar, which did make a difference but wasn't a fix. I was kinda surprised (and honestly a little annoyed) that the Fujitsu drive that formerly worked quite well on NetBSD 4.0.1 wouldn't now, especially as I have a bunch of spare Fujitsu drives and few others.

However, under specific conditions I can still create what looks like the same hang. Basically, if I read/write data to the HDD fast enough, the system hangs again. I have an SMP (3 processor) system, which I'd heard wasn't always stable under load, but that seems OK, as I can load up the processors with work as long as there isn't much disk I/O. This is repeatable.
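
A minimal sketch of the kind of sustained disk load described above, for illustration only. It is not the exact workload that was run: the function name stress_disk, the file path, the block size and the total size are all assumptions. It writes a file sequentially, forces it out to the disk with fsync, and reads it back.

```python
import os
import tempfile


def stress_disk(path, total_bytes, block_size=64 * 1024):
    """Write total_bytes to path sequentially, sync to disk, then read it back.

    Returns a (bytes_written, bytes_read) tuple. Hypothetical helper, not
    part of any NetBSD tool.
    """
    block = b"\xa5" * block_size
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        # Force the data out to the device, not just the buffer cache.
        os.fsync(f.fileno())

    read = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(block_size)
            if not data:
                break
            read += len(data)
    # Note: the read pass may be served from the buffer cache unless the
    # file is larger than available memory.
    return written, read


if __name__ == "__main__":
    # Placeholder path and size (assumptions): point this at a file system
    # on the affected disk and raise the size well above 1 MB for a real run.
    target = os.path.join(tempfile.gettempdir(), "io_stress.bin")
    try:
        print(stress_disk(target, 1024 * 1024))
    finally:
        if os.path.exists(target):
            os.remove(target)
```

Running several copies at once, each on its own file, would raise the I/O rate further. A CPU-only load for comparison would need no disk access at all.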

Understandably, this makes it hard to add packages and do other basic tasks, and once the system has hung it has to run fsck on reboot, which itself can cause the problem.

I built my own kernel and changed some of the options to disable tagged queuing and sync negotiation for the esp driver, but that didn't seem to help.

I'll include my kernel messages at the end so you can see how it's configured.
The questions I have are: Can I change my system configuration to avoid this hang? Would a different HDD help? Would running the system with a uniprocessor kernel be a potential fix, in case of a race condition or similar problem? Do I have a hardware fault I don't know about yet?

Whilst I can't make a patch/fix, I'd be happy to try one out (if anyone has one). I've got an emulated machine running in QEMU that I've used to build packages and the kernel.

As an aside, I have used the 4.0.1 install disk to run fsck on my HDD a few times, as it doesn't have the issue at all with exactly the same hardware, and it didn't seem to have any ill effect on my file system.

Andrew

Output of dmesg following (after copyright notice)...

NetBSD 7.1 (GENERIC.MP.201703111743Z)
total memory = 287 MB
avail memory = 276 MB
kern.module.path=/stand/sparc/7.1/modules
timecounter: Timecounters tick every 10.000 msec
bootpath: /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@3,0
mainbus0 (root): SUNW,SPARCstation-20: hostid 72773fc6
cpu0 at mainbus0: mid 8: TMS390Z50 v0 or TMS390Z55 @ 50 MHz, on-chip FPU
cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
cpu1 at mainbus0: mid 9: TMS390Z50 v0 or TMS390Z55 @ 50 MHz, on-chip FPU
cpu1: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
cpu2 at mainbus0: mid 10: TMS390Z50 v0 or TMS390Z55 @ 60 MHz, on-chip FPU
cpu2: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
sx0 at mainbus0 ioaddr 0x80000000
sx0: architecture rev. 27 chip rev. 0
obio0 at mainbus0
clock0 at obio0 slot 0 offset 0x200000: mk48t08
timer0 at obio0 slot 0 offset 0x300000: delay constant 23, frequency = 2000000 Hz
timer: limit 0 shift 9 mask 3fffff
timecounter: Timecounter "timer-counter" frequency 2000000 Hz quality 100
zs0 at obio0 slot 0 offset 0x100000 level 12 softpri 6
zstty0 at zs0 channel 0
zstty1 at zs0 channel 1
zs1 at obio0 slot 0 offset 0x0 level 12 softpri 6
zstty4 at zs1 channel 0
kbd0 at zstty4 (console input)
zstty5 at zs1 channel 1
ms0 at zstty5
wsmouse0 at ms0 mux 0
fdc0 at obio0 slot 0 offset 0x700000 level 11: no drives attached
auxreg0 at obio0 slot 0 offset 0x800000
power0 at obio0 slot 0 offset 0xa01000 level 2
iommu0 at mainbus0 ioaddr 0xe0000000: version 0x3/0x1, page-size 4096, range 64MB
sbus0 at iommu0: clock = 20 MHz
dma0 at sbus0 slot 15 offset 0x400000: DMA rev 2
esp0 at dma0 slot 15 offset 0x800000 level 4: ESP200, 40MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
ledma0 at sbus0 slot 15 offset 0x400010: DMA rev 2
le0 at ledma0 slot 15 offset 0xc00000 level 6: address 08:00:20:77:3f:c6
le0: 8 receive buffers, 2 transmit buffers
bpp0 at sbus0 slot 15 offset 0x4800000 level 2 (ipl 3): DMA rev 2
dbri0 at sbus0 slot 14 offset 0x10000 level 9: rev e
cgsix0 at sbus0 slot 3 offset 0x0 level 9: SUNW,501-2253, 1152 x 900, rev 11 (console)
cgsix0: attached to /dev/fb0
cgsix0: framebuffer size: 2 MB
wsdisplay0 at cgsix0 kbdmux 1: console (std, vt100 emulation)
wsmux1: connecting to wsdisplay0
eccmemctl0 at mainbus0 ioaddr 0x0: version 0x0/0x2
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
cpu0: booting secondary processors: cpu1 cpu2
scsibus0: waiting 2 seconds for devices to settle...
wskbd0 at kbd0: console keyboard, using wsdisplay0
sd0 at scsibus0 target 3 lun 0: <IBM-PSG, DDYS-T18350M  M, SA2A> disk fixed
sd0: 17357 MB, 15110 cyl, 6 head, 392 sec, 512 bytes/sect x 35548320 sectors
sd0: sync (100.00ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
cd0 at scsibus0 target 6 lun 0: <TOSHIBA, XM-4101TASUNSLCD, 1084> cdrom removable
cd0: async, 8-bit transfers
Kernelized RAIDframe activated
dbri0: speakerbox detected
dbri0: cs4215 rev E found at offset 8
audio0 at dbri0: full duplex, playback, capture, mmap
root on sd0a dumps on sd0b
root file system type: ffs


