Subject: Serious SCB Timeout problems in 2.0.2 on Alpha w/ fxp devices
To: NetBSD/alpha Discussion List <port-alpha@netbsd.org>
From: Stephen Jones <smj@cirr.com>
List: port-alpha
Date: 06/25/2005 20:41:38
I've submitted a bug report in hopes that we can get this fixed.  Is
anyone currently looking into this?  What changed from 1.6.2 to 2.0.2
that would cause the race condition that produces this symptom?  Should
I hire a NetBSD developer to fix this, since Alpha is considered a tier 2
platform now?  This probably isn't going to be a driver fix, as the
driver seems to be the same code as in 1.6.2.

Unfortunately I cannot afford to dump all our Alphas in favor of the
primary platform for NetBSD, but I'm willing to give a developer a job if
it is needed, and I'm happy to provide the resources to someone who
believes they can fix it (or someone who is working on it now and needs
either more spare time or money).

Just a note: last time I posted, I got a few responses from people
saying they don't see these problems.  No offense, but I'm not talking
about a machine you do an install on, leave, and call good.  These are
actual production systems that support tens of thousands of users and
millions of network requests a day, that were running 1.6.2 fine, and
now have SCB timeouts resulting in about 20% packet loss (it varies
based on load throughout the day).

On a positive note, the number of elusive vnlock deadlocks that
couldn't get resolved under 1.6.x, even with a contract part-time
developer, has been reduced significantly under 2.0.2.
Thank you for that (and knock on wood).

PS, I do love NetBSD and I want to help make it better.