Port-sparc archive


Re: SparcStation 20 system freezes



In article <CAH4fO9VVKGk2zZFVbhZzDm1h6vrFk6zgkSawqFa_p2C15UA3rQ%mail.gmail.com@localhost>,
Andrew Danson  <ajdanson80%gmail.com@localhost> wrote:
>Hi,
>
>I'm new to the mailing list, so please forgive me if I don't follow the
>netiquette.
>
>I have a SPARCstation 20 that I used to run NetBSD 4.0.1 on for quite a few
>years. I used to use it as a headless server, until I retired it when I got
>a keyboard, mouse and a cg6 frame buffer.
>Recently I decided to try out NetBSD 7.1 on it, but even during the install
>process I'd get what at first seemed like random system hangs. After
>resolving some hardware issues I managed to get the OS installed, but it
>would still hang after a period of time. I ran the system diagnostics and
>did some testing and the base hardware appears to be working.
>
>A little bit of reading gave me a clue that perhaps the HDD and the esp
>driver didn't get along (anymore) so I changed it for an IBM Ultrastar
>which did make a difference, but wasn't a fix. I was kinda surprised (and
>honestly a little annoyed) that the Fujitsu drive that formerly worked
>quite well on NetBSD 4.0.1 wouldn't now. Especially as I have a bunch of
>spare Fujitsu drives and few others.
>
>However, under specific conditions I can still create what looks like the
>same hang. Basically, if I read/write data to the HDD fast enough the system
>hangs again. I have an SMP (3 processor) system, which I'd heard wasn't
>always stable under load, but that seems OK as I can load up the processors
>with work as long as there isn't much disk I/O. This is repeatable.
>
>Understandably this makes it hard to add packages and do other basic tasks,
>and once it has hung once it has to run fsck on reboot which itself can
>cause the problem.
>
>I built my own kernel and changed some of the options to disable tagged
>queuing and sync negotiation for the esp driver, but that didn't seem to
>help.
>
>I'll include my kernel messages at the end so you can see how it's
>configured.
>The questions I have are: Can I change my system configuration to avoid
>this hang? Would a different HDD help? Would running the system with a
>uniprocessor kernel be a potential fix, in case of a race condition or
>similar problem? Do I have a hardware fault I don't know about yet?
>
>Whilst I can't make a patch/fix I'd be happy to try one out (if anyone has
>one). I've got an emulated machine running in Qemu I've used to build
>packages and the kernel.
>
>As an aside, I have used the 4.0.1 install disk to run fsck on my HDD a few
>times, as it doesn't have the issue at all with exactly the same hardware,
>and it didn't seem to have any ill effect on my file system.
>
>Andrew
>
>Output of dmesg following (after copyright notice)...
>
>NetBSD 7.1 (GENERIC.MP.201703111743Z)
>total memory = 287 MB
>avail memory = 276 MB
>kern.module.path=/stand/sparc/7.1/modules
>timecounter: Timecounters tick every 10.000 msec
>bootpath: /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@3,0
>mainbus0 (root): SUNW,SPARCstation-20: hostid 72773fc6
>cpu0 at mainbus0: mid 8: TMS390Z50 v0 or TMS390Z55 @ 50 MHz, on-chip FPU
>cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
>cpu1 at mainbus0: mid 9: TMS390Z50 v0 or TMS390Z55 @ 50 MHz, on-chip FPU
>cpu1: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
>cpu2 at mainbus0: mid 10: TMS390Z50 v0 or TMS390Z55 @ 60 MHz, on-chip FPU
>cpu2: physical 20K instruction (64 b/l), 16K data (32 b/l), 1024K external (32 b/l): cache enabled
>sx0 at mainbus0 ioaddr 0x80000000
>sx0: architecture rev. 27 chip rev. 0
>obio0 at mainbus0
>clock0 at obio0 slot 0 offset 0x200000: mk48t08
>timer0 at obio0 slot 0 offset 0x300000: delay constant 23, frequency = 2000000 Hz
>timer: limit 0 shift 9 mask 3fffff
>timecounter: Timecounter "timer-counter" frequency 2000000 Hz quality 100
>zs0 at obio0 slot 0 offset 0x100000 level 12 softpri 6
>zstty0 at zs0 channel 0
>zstty1 at zs0 channel 1
>zs1 at obio0 slot 0 offset 0x0 level 12 softpri 6
>zstty4 at zs1 channel 0
>kbd0 at zstty4 (console input)
>zstty5 at zs1 channel 1
>ms0 at zstty5
>wsmouse0 at ms0 mux 0
>fdc0 at obio0 slot 0 offset 0x700000 level 11: no drives attached
>auxreg0 at obio0 slot 0 offset 0x800000
>power0 at obio0 slot 0 offset 0xa01000 level 2
>iommu0 at mainbus0 ioaddr 0xe0000000: version 0x3/0x1, page-size 4096, range 64MB
>sbus0 at iommu0: clock = 20 MHz
>dma0 at sbus0 slot 15 offset 0x400000: DMA rev 2
>esp0 at dma0 slot 15 offset 0x800000 level 4: ESP200, 40MHz, SCSI ID 7
>scsibus0 at esp0: 8 targets, 8 luns per target
>ledma0 at sbus0 slot 15 offset 0x400010: DMA rev 2
>le0 at ledma0 slot 15 offset 0xc00000 level 6: address 08:00:20:77:3f:c6
>le0: 8 receive buffers, 2 transmit buffers
>bpp0 at sbus0 slot 15 offset 0x4800000 level 2 (ipl 3): DMA rev 2
>dbri0 at sbus0 slot 14 offset 0x10000 level 9: rev e
>cgsix0 at sbus0 slot 3 offset 0x0 level 9: SUNW,501-2253, 1152 x 900, rev 11 (console)
>cgsix0: attached to /dev/fb0
>cgsix0: framebuffer size: 2 MB
>wsdisplay0 at cgsix0 kbdmux 1: console (std, vt100 emulation)
>wsmux1: connecting to wsdisplay0
>eccmemctl0 at mainbus0 ioaddr 0x0: version 0x0/0x2
>timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
>cpu0: booting secondary processors: cpu1 cpu2
>scsibus0: waiting 2 seconds for devices to settle...
>wskbd0 at kbd0: console keyboard, using wsdisplay0
>sd0 at scsibus0 target 3 lun 0: <IBM-PSG, DDYS-T18350M  M, SA2A> disk fixed
>sd0: 17357 MB, 15110 cyl, 6 head, 392 sec, 512 bytes/sect x 35548320 sectors
>sd0: sync (100.00ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
>cd0 at scsibus0 target 6 lun 0: <TOSHIBA, XM-4101TASUNSLCD, 1084> cdrom removable
>cd0: async, 8-bit transfers
>Kernelized RAIDframe activated
>dbri0: speakerbox detected
>dbri0: cs4215 rev E found at offset 8
>audio0 at dbri0: full duplex, playback, capture, mmap
>root on sd0a dumps on sd0b
>root file system type: ffs

How does it hang? Does it freeze completely? Or can you get into the
debugger with <ctrl><alt><esc>?

I would compile a DIAGNOSTIC/DEBUG/LOCKDEBUG kernel and try running
with that.

christos
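
A kernel along those lines could be configured roughly as follows. This is only a minimal sketch and not part of the original reply: the config file name DEBUGMP is made up for illustration, and it assumes the stock GENERIC.MP config from the 7.1 source tree as the base. DIAGNOSTIC, DEBUG and LOCKDEBUG are the standard kernel options named above. If the base config already enables one of them, the duplicate line should be dropped.

```
# sys/arch/sparc/conf/DEBUGMP  (hypothetical file name)
# Start from the stock multiprocessor config ...
include "arch/sparc/conf/GENERIC.MP"

# ... and turn on the extra checks suggested above.
options 	DIAGNOSTIC	# internal consistency checks
options 	DEBUG		# more expensive debugging checks
options 	LOCKDEBUG	# sanity checking of locks and mutexes
```

Building it would then presumably be the usual `config DEBUGMP` in that directory followed by `make depend && make` in `../compile/DEBUGMP`, or `./build.sh -m sparc kernel=DEBUGMP` from the top of the source tree. Expect the resulting kernel to be noticeably slower on a 50 MHz machine, since LOCKDEBUG in particular adds overhead to every lock operation.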


