Subject: Re: NetBSD2.0/sparc not ready for prime time?
To: Johan A.van Zanten <johan@giantfoo.org>
From: Seth Kurtzberg <seth@cql.com>
List: port-sparc64
Date: 02/06/2005 03:16:50
Johan A.van Zanten wrote:

>Seth Kurtzberg <seth@cql.com> wrote:
>
>  
>
>>That is again _not_ to imply that there aren't problems, and that they 
>>don't need to be fixed.  It is merely to say that you have good people 
>>already expending maximum effort, and, in my experience, NetBSD 
>>releases, initially, are of much higher quality than other operating 
>>systems I've used, including expensive SVR4 releases from Sun or HP.
>>    
>>
>
> I want to second this.  I began my career as a sysadmin back in
>1991-1992, and the early releases of Solaris were very buggy.  I didn't
>put Solaris into production until 1994, and i think that was Solaris 2.4.
>  
>
Don't forget the version where they had minor version numbers, and it 
was unusable in 2.x or 2.x.1.  I believe that was 2.6, so 2.6 was 
useless, as was 2.6.1, and everyone had to wait for 2.6.2.  Useless 
might be an overstatement, since not everyone has the same criteria for 
usefulness, but it had major stability problems until the second 
correction release.

As Johan said, Sun has far better resources for testing (not in terms of 
the quality of the engineers, but in terms of money and tools), and Sun 
built the damn thing, so one might think they would know how it works.

I'd also like to stress the fact that NetBSD problems are generally 
fixed in a period of a few weeks, in the worst case.  A few days is more 
typical.  Having a workaround the same day the problem is discovered is 
also common.  That workaround is not the fix, because there is a good 
chance that it will break something else, but in some cases it allows 
the user to do what he needs to do while the developers on the 
port-sparc side are doing what they need to do.

There was a major TCP/IP bug in Solaris up to and including 2.6.2.  It 
was a two line change, and it took Sun about a year to get around to 
making it.

And, try to get a response to an email you send to whoever is doing (or 
is supposed to be doing) the fix.  It's theoretically possible that you 
will get a response, but much more frequently you get an "ok, we'll look 
into it" response from someone whose job it is to maintain the database 
of problem reports.  Sooner or later it will get pushed down to the 
level that the developers can see it and take care of it, but I've never 
seen sooner, and later tends to be _very_ later.  (Bad English grammar, 
but you know what I mean, I hope.  :)  )

Now, I will admit that I'm very grateful to the sparc folks on the 
NetBSD project, because they allowed me to turn a very expensive door 
stop into a functioning server.  I have every confidence that this will 
be fixed.  Judging by the messages on the list, the root cause is known 
and there is discussion about the best way to fix it.  In the case of 
Sun, you would probably still be on hold, trying to see if your bug 
report triggered any activity, let alone knowing the cause and kicking 
around possible fixes.

These guys are human, so they make mistakes.  Believe it or not, I've 
been known to make a mistake now and then, such as breaking the 
processing of one type in my database engine while introducing a new type.

I think everyone is acting in good faith here, the NetBSD sparc people 
do an outstanding job, and things will stabilize.  These sorts of 
problems happen to everybody; the difference is that the NetBSD 
developers jump right on it and it gets fixed.  Not fixed in the next 
release, as is Sun's habit (even though there will be a patch list a 
mile long, the bug you care about somehow doesn't get into it).

I'll spare you the HP stories.  Let's just say there is a tendency among 
the users to add an S before UX.

>2.0 and 2.1 were just not usable on busy machines -- i know several other
>shops that deployed projects on Alphas instead of Suns (even after they'd
>already bought the Suns), simply because Solaris 2.0 - 2.2 were not stable
>enough.  I'm not trying to bash Sun here; it's the same problem. Shipping
>a new version of an OS, especially when it has substantial changes (like
>SMP) is impossible to do without having some bugs, even after months of
>testing with far, far more resources than the NetBSD project has. And the
>painful irony is that a bug fix is always likely to introduce another
>bug.
>
> I've been happy with 2.0 on sparc (and alpha) so far. My main DNS server,
>KDC and mail relay is now a SPARC-20 (dual 50 MHz) with about 300 MB of
>RAM. Yesterday, it 55x refused about 35,000 pieces of SPAM, and 45x
>refused another 45,000.  (And depressingly, it delivered about 200 pieces
>of mail, some of which were still spam.) It's a busy machine.
>
> The only major problems i've run into on sparc have all been in pkgsrc,
>which is (in my mind) entirely separate from the OS. I'm not denying that
>there may be problems with 2.0 on sparc.  I'm simply saying that what i am
>seeing here on my machines is impressive for a x.0 release with major, new
>functionality in the kernel.
>
> -johan
>