Subject: Why does this fix my problem?
To: NetBSD Networking Technical Discussion List <tech-net@netbsd.org>
From: Monroe Williams <monroe@pobox.com>
List: tech-net
Date: 12/31/2002 17:41:34

I've been having serious problems with a D-Link DFE-570TX quad Ethernet
board ever since I bought it some time ago.  If I try to use more than one
port at a time, I pretty quickly get:

tlp0: filter setup and transmit timeout
tlp0: filter setup and transmit timeout
tlp0: filter setup and transmit timeout
...

and all the ports on the card cease processing packets.

I've posted about this on a couple of NetBSD lists without finding a
solution.

<http://mail-index.netbsd.org/current-users/2002/05/03/0004.html>

It's possible that there's something about the machine this card is running
in that triggers the problem.  (It's an old PowerMac 7500 running
NetBSD-macppc-current -- 1.6K at the moment.)

I was recently looking into the problem again, and noticed that all four
Tulip chips on the card end up on the same IRQ:

-----
ppb0 at pci0 dev 15 function 0: Digital Equipment DECchip 21152 PCI-PCI Bridge (rev. 0x03)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
tlp0 at pci1 dev 4 function 0: DECchip 21143 Ethernet, pass 4.1
tlp0: interrupting at irq 25
tlp0: Ethernet address 00:80:c8:b9:7b:45
nsphyter0 at tlp0 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp1 at pci1 dev 5 function 0: DECchip 21143 Ethernet, pass 4.1
tlp1: interrupting at irq 25
tlp1: Ethernet address 00:80:c8:b9:7b:46
nsphyter1 at tlp1 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp2 at pci1 dev 6 function 0: DECchip 21143 Ethernet, pass 4.1
tlp2: interrupting at irq 25
tlp2: Ethernet address 00:80:c8:b9:7b:47
nsphyter2 at tlp2 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp3 at pci1 dev 7 function 0: DECchip 21143 Ethernet, pass 4.1
tlp3: interrupting at irq 25
tlp3: Ethernet address 00:80:c8:b9:7b:48
nsphyter3 at tlp3 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
-----

Poking around the kernel source, I found some code to handle "shared
interrupts" in src/sys/dev/pci/if_tlp_pci.c that is enabled in the quirk
setup functions for a couple of other multiport boards (ZNYX and Cogent).
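
To make the idea concrete, here is a minimal userspace sketch of what
"shared interrupt" handling means in general: a single handler is
registered for the IRQ line and polls every chip wired to it.  This is
not the code from if_tlp_pci.c; the struct and function names (chip,
irq_group, chip_intr, shared_intr) are made up for illustration.

```c
#include <stddef.h>

#define MAX_PORTS 4

/* Hypothetical per-chip state; not the driver's real softc. */
struct chip {
    const char *name;
    int pending;    /* nonzero if this chip has raised the interrupt */
    int serviced;   /* number of interrupts this chip has handled */
};

/* Hypothetical group: every chip wired to the same IRQ line. */
struct irq_group {
    struct chip *members[MAX_PORTS];
    size_t count;
};

/* Per-chip handler: returns 1 if this chip claimed the interrupt. */
int chip_intr(struct chip *c)
{
    if (!c->pending)
        return 0;
    c->pending = 0;
    c->serviced++;
    return 1;
}

/*
 * The one handler registered for the line.  It polls every member,
 * so no chip is left unserviced just because another chip on the
 * same line also had work pending.
 */
int shared_intr(void *arg)
{
    struct irq_group *g = arg;
    int claimed = 0;

    for (size_t i = 0; i < g->count; i++)
        claimed |= chip_intr(g->members[i]);
    return claimed;
}
```

In this model, only the first chip on the line registers a handler, and
the others just add themselves to its group.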

After hacking my copy of if_tlp_pci.c to force this flag on and building a
new kernel, I've now been running with two ports enabled for much longer
than I ever have in the past.  My new dmesg looks like:

-----
ppb0 at pci0 dev 15 function 0: Digital Equipment DECchip 21152 PCI-PCI Bridge (rev. 0x03)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
tlp0 at pci1 dev 4 function 0: DECchip 21143 Ethernet, pass 4.1
tlp0: interrupting at irq 25
tlp0: Ethernet address 00:80:c8:b9:7b:45
nsphyter0 at tlp0 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp1 at pci1 dev 5 function 0: DECchip 21143 Ethernet, pass 4.1
tlp1: sharing interrupt with tlp0
tlp1: Ethernet address 00:80:c8:b9:7b:46
nsphyter1 at tlp1 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp2 at pci1 dev 6 function 0: DECchip 21143 Ethernet, pass 4.1
tlp2: sharing interrupt with tlp0
tlp2: Ethernet address 00:80:c8:b9:7b:47
nsphyter2 at tlp2 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp3 at pci1 dev 7 function 0: DECchip 21143 Ethernet, pass 4.1
tlp3: sharing interrupt with tlp0
tlp3: Ethernet address 00:80:c8:b9:7b:48
nsphyter3 at tlp3 phy 1: DP83843 10/100 media interface, rev. 0
nsphyter3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
-----

Does this seem like a reasonable solution to this problem, or is this a
symptom of a larger problem with interrupt handling on this machine?

If this is the right thing to do, it's not clear to me how one would set up
a quirk to handle this properly.  (My hack just forces the flag on for all
chips the tlp driver handles, which is patently wrong but works in my case.)
The card looks like four generic 21143 Ethernet chips and a generic 21152
PCI-PCI bridge, and I'm not sure how to differentiate this from a
motherboard using that bridge with four cards plugged in.  Is there code
somewhere that knows it's dealing with a multiport card?
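
One possibility, sketched here purely as an assumption, would be to key
a quirk on the first three bytes of the Ethernet address, since all four
ports in the dmesg above start with 00:80:c8.  The flag name
FLAG_SHARED_INTR, the quirk table, and quirk_flags() below are
hypothetical and are not taken from the tlp driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLAG_SHARED_INTR 0x01   /* hypothetical flag name */

/* Hypothetical quirk entry keyed on the Ethernet address prefix. */
struct quirk {
    uint8_t oui[3];     /* first three bytes of the Ethernet address */
    unsigned flags;     /* flags to force on when the prefix matches */
};

/*
 * 00:80:c8 is the prefix shown in the dmesg above.  Treating it as a
 * reliable identifier for this particular board is an assumption.
 */
static const struct quirk quirks[] = {
    { { 0x00, 0x80, 0xc8 }, FLAG_SHARED_INTR },
};

/* Return the quirk flags for an Ethernet address, or 0 if none match. */
unsigned quirk_flags(const uint8_t enaddr[6])
{
    for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
        if (memcmp(enaddr, quirks[i].oui, 3) == 0)
            return quirks[i].flags;
    }
    return 0;
}
```

A prefix match alone would also catch any single-port card from the same
vendor, so a real quirk would presumably need a further check, such as
another tlp on the same bus already interrupting at the same IRQ.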

Thanks,
-- monroe
------------------------------------------------------------------------
Monroe Williams                                         monroe@pobox.com