Subject: Serious SCB Timeout problems in 2.0.2 on Alpha w/ fxp devices
To: NetBSD/alpha Discussion List <port-alpha@netbsd.org>
From: Stephen Jones <smj@cirr.com>
List: tech-kern
Date: 06/25/2005 20:41:38
I've submitted a bug report in hopes that we can get this fixed. Is
anyone currently looking into this? What changed from 1.6.2 to 2.0.2
that would cause the race condition that produces this symptom? Should
I hire a NetBSD developer to fix this, since Alpha is considered a
tier 2 platform now? This probably isn't going to be a driver fix, as
the driver seems to be the same code as in 1.6.2.
Unfortunately, I cannot afford to dump all our Alphas in favor of the
primary platform for NetBSD, but I'm willing to give a developer a job
if it is needed, and I'm happy to provide the resources to someone who
believes they can fix it (or someone who is working on it now and
needs either more spare time or money).
Just a note: last time I posted, I got a few responses from people
saying they don't see these problems. No offense, but I'm not talking
about a machine you do an install on, leave, and call it good. These
are actual production systems that support tens of thousands of users
and millions of network requests a day, that were running 1.6.2 fine,
and now have SCB timeouts resulting in about 20% packet loss (it
varies based on load throughout the day).
On a positive note, the number of elusive vnlock deadlocks that
couldn't get resolved with 1.6.x, even with a part-time contract
developer, has been reduced significantly under 2.0.2. Thank you for
that (and knock on wood).
PS, I do love NetBSD and I want to help make it better.