Subject: Re: fxp0: Device timeouts on AlphaServers
To: None <port-alpha@netbsd.org>
From: Stephen M. Jones <smj@cirr.com>
List: port-alpha
Date: 02/16/2003 16:46:58
I'm also seeing this with CS20s (1.6.1rc1).  Two CS20s here have the same
hardware configuration, but different loads/usage .. the funny thing is that
the one with more I/O usage is not the one that complains most often.
They're both NFS mounting the same disks, one runs a webmail http server,
the other runs a mud .. both handle anywhere from 30-100 users at any time.

Machine with 396 timeouts in 1+ days:

Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs Colls
fxp0  1500  <Link>        00:02:56:00:03:f1  6370825     0  4499825     2     0
fxp0  1500  10            10.0.0.4           6370825     0  4499825     2     0
fxp0  1500  fe80::/64     fe80::202:56ff:fe  6370825     0  4499825     2     0

Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs Colls
fxp1  1500  <Link>        00:02:56:00:03:f2  8695523     0 11280206   970     0
fxp1  1500  216.162.208   216.162.208.195    8695523     0 11280206   970     0
fxp1  1500  fe80::/64     fe80::202:56ff:fe  8695523     0 11280206   970     0


Machine with 107 timeouts in 6+ days:

Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs Colls
fxp0  1500  <Link>        00:02:56:00:0f:41 55517683     0 35751524    60     0
fxp0  1500  10/24         10.0.0.5          55517683     0 35751524    60     0
fxp0  1500  fe80::/64     fe80::202:56ff:fe 55517683     0 35751524    60     0

Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs Colls
fxp1  1500  <Link>        00:02:56:00:0f:42 25359266     0 24861122   309     0
fxp1  1500  216.162.208   216.162.208.196   25359266     0 24861122   309     0
fxp1  1500  fe80::/64     fe80::202:56ff:fe 25359266     0 24861122   309     0
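
To put the two machines on a comparable footing, here is a rough sketch that
normalizes the figures above into timeouts per day and output errors per
million output packets. The counters are copied from the tables; the uptimes
of exactly 1 and 6 days are an assumption (only "1+" and "6+" are known), so
the per-day rates are upper bounds. The function names are made up for
illustration.

```python
# Counters copied from the interface tables and timeout counts quoted above.
# ASSUMPTION: uptimes were given only as "1+" and "6+" days, so 1 and 6 are
# lower bounds and the per-day timeout rates below are upper bounds.
machines = {
    "10.0.0.4": {
        "timeouts": 396,
        "days": 1,
        "fxp0": {"opkts": 4499825, "oerrs": 2},
        "fxp1": {"opkts": 11280206, "oerrs": 970},
    },
    "10.0.0.5": {
        "timeouts": 107,
        "days": 6,
        "fxp0": {"opkts": 35751524, "oerrs": 60},
        "fxp1": {"opkts": 24861122, "oerrs": 309},
    },
}


def timeouts_per_day(machine):
    """Timeouts divided by the assumed uptime in days."""
    return machine["timeouts"] / machine["days"]


def oerrs_per_million(machine, ifname):
    """Output errors per million output packets on one interface."""
    counters = machine[ifname]
    return 1e6 * counters["oerrs"] / counters["opkts"]


if __name__ == "__main__":
    for host, machine in machines.items():
        print(f"{host}: {timeouts_per_day(machine):.1f} timeouts/day")
        for ifname in ("fxp0", "fxp1"):
            rate = oerrs_per_million(machine, ifname)
            print(f"  {ifname}: {rate:.1f} Oerrs per million Opkts")
```

This only rescales the numbers already posted; it does not assume that each
Oerrs count corresponds to a device timeout.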

I think the DS10Ls and similar machines would be victims of this race
condition.  If there is a patch or workaround, I would be interested as well.

The unfortunate part of this story is that I'm planning on replacing our
four AS1200s with DS10Ls, which have built-in fxp devices.  But hopefully
we'll get a workaround together before too long.

Other than occasional lag from timeouts, 1.6.1rc1 does seem very stable!