Subject: kern/15592: sf "starfire" NIC driver gives "device timeout"
To: None <gnats-bugs@gnats.netbsd.org>
From: C Kane <ckane@best.com>
List: netbsd-bugs
Date: 02/12/2002 17:53:56
>Number:         15592
>Category:       kern
>Synopsis:       sf "starfire" NIC driver gives "device timeout"
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 12 17:54:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     C Kane
>Release:        NetBSD 1.5.2
>Organization:
>Environment:
System: NetBSD smdls5 1.5ZA NetBSD 1.5ZA (sc) #9: Wed Feb 6 02:19:37 UTC 2002 root@wws008:/usr/src/sys/arch/i386/compile/sc i386

>Description:
	smdls5 runs a NetBSD-1.5.2 userland with a recent 1.5ZA kernel
	and has five four-port NICs.  It serves NIS/DNS/NTP.
	The NICs are Adaptec Quartet66 (ANA-64044) cards, which have the
	64-bit PCI edge connector but are installed in 32-bit PCI slots.

	For the past few months the -current kernel has had a problem
	with this card.  The stock 1.5.2 and the 1.5Y kernels had
	problems as well.  After the machine has been up for some time,
	generally a few days, one random interface develops a
	"device timeout" problem and repeats this message fairly often:

	Feb 12 17:15:42 smdls5 /netbsd: sf17: device timeout
	Feb 12 17:16:46 smdls5 /netbsd: sf17: device timeout
	Feb 12 17:23:35 smdls5 /netbsd: sf17: device timeout

	A few days later a different random interface will start displaying
	"device timeout" errors.  And so on.  Our worst offender currently
	has trouble with 8 of 18 active interfaces.

	When an interface is showing the "device timeout" errors, NIS clients
	bound via that interface have trouble running "ypcat passwd", which
	takes many minutes instead of the several seconds it should.
	
	On reboot, a different set of interfaces will have the problem.
	I haven't noticed any pattern.  I haven't found any way to trigger
	the problem.
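
One way to look for a pattern would be to tally the "device timeout" messages per interface over a longer stretch of the log. The following is only a minimal sketch in Python, assuming the syslog line format quoted earlier in this report; the name count_timeouts is hypothetical and is not part of any NetBSD tool.

```python
import re
from collections import Counter

# Matches kernel log lines of the form quoted in this report, e.g.
#   Feb 12 17:15:42 smdls5 /netbsd: sf17: device timeout
# Assumption: every affected interface is named sfN.
TIMEOUT_RE = re.compile(r"/netbsd: (sf\d+): device timeout")


def count_timeouts(lines):
    """Return a Counter mapping interface name to its number of timeout lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Demo on the three lines quoted in this report.
sample = [
    "Feb 12 17:15:42 smdls5 /netbsd: sf17: device timeout",
    "Feb 12 17:16:46 smdls5 /netbsd: sf17: device timeout",
    "Feb 12 17:23:35 smdls5 /netbsd: sf17: device timeout",
]
for ifname, n in count_timeouts(sample).most_common():
    print(ifname, n)
```

Passing an open log file instead of the sample list would give per-interface totals that could be compared across reboots.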

	Here's dmesg output for the first card:

	ppb1 at pci1 dev 1 function 0: Digital Equipment DECchip 21154 PCI-PCI Bridge (rev. 0x05)
	pci2 at ppb1 bus 2
	pci2: i/o space, memory space enabled
	sf0 at pci2 dev 4 function 0: ANA-62011 (rev 0) 10/100 Ethernet, rev. 3
	sf0: interrupting at irq 10
	sf0: Ethernet address 00:00:d1:ee:ef:2d
	sf0: 64-bit PCI slot detected
	sqphy0 at sf0 phy 1: Seeq 80220 10/100 media interface, rev. 1
	sqphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
	sf1 at pci2 dev 5 function 0: ANA-62011 (rev 0) 10/100 Ethernet, rev. 3
	sf1: interrupting at irq 9
	sf1: Ethernet address 00:00:d1:ee:ef:2e
	sf1: 64-bit PCI slot detected
	sqphy1 at sf1 phy 1: Seeq 80220 10/100 media interface, rev. 1
	sqphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
	sf2 at pci2 dev 6 function 0: ANA-62011 (rev 0) 10/100 Ethernet, rev. 3
	sf2: interrupting at irq 11
	sf2: Ethernet address 00:00:d1:ee:ef:2f
	sf2: 64-bit PCI slot detected
	sqphy2 at sf2 phy 1: Seeq 80220 10/100 media interface, rev. 1
	sqphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
	sf3 at pci2 dev 7 function 0: ANA-62011 (rev 0) 10/100 Ethernet, rev. 3
	sf3: interrupting at irq 7
	sf3: Ethernet address 00:00:d1:ee:ef:30
	sf3: 64-bit PCI slot detected
	sqphy3 at sf3 phy 1: Seeq 80220 10/100 media interface, rev. 1
	sqphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

>How-To-Repeat:

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted: