Subject: sparc64 reliability
To: NetBSD Port Sparc64 <port-sparc64@netbsd.org>
From: Peter Eisch <peter@boku.net>
List: port-sparc64
Date: 10/08/2004 17:21:51
It's been a while since the thread came around and I didn't chime in then,
but I have a few datapoints:

U5/U10 with factory drives perform alright overall.  I use them for mail
relays (333MHz) and when they panicked it was always in a pmap call (1.6.2
kernel on 1.6.1 userland), but I haven't run them on -current or 2.0_* yet.
It's worth noting that when they pop, they're really under _disk_ I/O duress
but network I/O doesn't seem to rattle them.

The same systems with a different disk (perhaps a WD300), running the same
application with the same config, won't last a day.

The same applications on a U60 run fine for months.  It will hang, and I
can't tell what the issue is because the serial console is unresponsive, but
cycling power always brings it back to life.

The same applications on a Netra X1 (400MHz) running 2.0_BETA have been
chugging along like a workhorse for a month now.  It's being moved to a
permanent deployment this weekend, so I'll get a better datapoint down the
road.

A Netra T1 running as a router with zebra under -current is now 3+ months
without an issue beyond when I took it down to add RAM.  Putting this in as
my primary BGP router (2 T1 peers) made me feel like a cowboy.  I'm still
not certain that it's a safe thing; perhaps I've been lucky.

This weekend I'll turn up a stock U5 as a mail handler and see if it's
better with 2.0_* than it was with 1.6.x.  My general experience, though, is
that the low-end servers tend to have that smooth and reliable feeling that I
get from running the i386 servers with the same 1.6.x and applications.
Using Sun "desktop" systems as servers is cumbersome and risky at best.
Over the last month I've replaced the U5/U10s with i386 servers running
2.0_* and life became sane.

$0.02.

peter