Subject: Re: Recommendations wanted for 100baseTX cards
To: None <darkstar@pgh.net>
From: Bill Paul <wpaul@ctr.columbia.edu>
List: port-i386
Date: 01/31/2000 10:57:48
Of all the gin joints in all the towns in all the world, Matthew Orgass
had to walk into mine and say:
> On Sun, 30 Jan 2000, Bill Paul wrote:
>
> > Hm. What isn't working exactly? I didn't have too much trouble getting it
> > to work on FreeBSD (except for the transmit corruption problem due to the
> > broken scatter/gather DMA). One of the things I was told during my exchanges
> > with the Davicom people is that the DM9102A requires the TX and RX
> > descriptors to be aligned on 16 byte boundaries (4 DWords alignment, as
> > they put it). It's possible the DM9102 has the same restriction.
> >
> > Hard to say what could be going wrong without more details though.
>
> The symptoms are:
> 1) It never sends a transmit complete interrupt, only early transmit
> interrupts (if requested, if not it does nothing).
Hm. Something else I never noticed. In my driver, I handle both the
transmit complete and no transmit buffer available condition as the same
thing (all pending TX descriptors have been sent, ditch all the mbufs).
One of these must be firing otherwise I'd be running out of mbufs
really quick.
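
To make the "same thing" handling concrete, here is a minimal sketch of the status test. It is not code from the actual driver. The bit positions are assumptions modelled on the 21x4x-style status register layout and should be checked against the Davicom datasheet, and the names ISR_TX_OK, ISR_TX_NOBUF, ISR_RX_OK and tx_reclaim_needed() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED status register bits (21x4x-style layout); verify against the datasheet. */
#define ISR_TX_OK    0x00000001u  /* transmit complete */
#define ISR_TX_NOBUF 0x00000004u  /* no transmit buffer available */
#define ISR_RX_OK    0x00000040u  /* receive complete */

/*
 * Hypothetical helper: returns nonzero when the status word carries either
 * TX condition, so both lead to the same reclaim pass.
 */
static int
tx_reclaim_needed(uint32_t status)
{
	return (status & (ISR_TX_OK | ISR_TX_NOBUF)) != 0;
}
```

An interrupt handler would call this on the value read from the status register and, if it returns nonzero, free the mbufs for every descriptor that has been sent.
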
It looks like Davicom's own driver for Linux doesn't even bother to check
any of the 'something transmit-related happened' bits in the status
register. Their interrupt handler just looks to see if there are any
transmissions pending at all, and then goes through them and tests the
owner bits in the descriptors to see if it's safe to reclaim them.
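
The owner-bit approach described above can be sketched roughly as follows. This is an illustration, not Davicom's code: the ring size, the structure layout, the names tx_ring and tx_reclaim(), and the assumption that bit 31 of the first descriptor word is the owner bit are all mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_RING_CNT 8            /* hypothetical ring size */
#define DESC_OWN    0x80000000u  /* ASSUMED: bit 31 set means the chip owns the descriptor */

struct tx_desc {
	uint32_t status;  /* owner bit lives here */
	uint32_t ctl;
	uint32_t data;
	uint32_t next;
};

struct tx_ring {
	volatile struct tx_desc desc[TX_RING_CNT];
	void *pkt[TX_RING_CNT];  /* stand-in for the mbuf pointer */
	int cons;                /* oldest descriptor not yet reclaimed */
	int pending;             /* descriptors handed to the chip */
};

/*
 * Walk the pending descriptors in order and reclaim every one the chip
 * has handed back. Stops at the first descriptor the chip still owns.
 * Returns the number of descriptors reclaimed.
 */
static int
tx_reclaim(struct tx_ring *r)
{
	int n = 0;

	while (r->pending > 0) {
		volatile struct tx_desc *d = &r->desc[r->cons];

		if (d->status & DESC_OWN)
			break;                  /* chip is not done with this one */
		r->pkt[r->cons] = NULL;         /* a real driver would free the mbuf here */
		r->cons = (r->cons + 1) % TX_RING_CNT;
		r->pending--;
		n++;
	}
	return n;
}
```

Because the loop only trusts the descriptors themselves, it works no matter which transmit-related status bit (if any) caused the interrupt.
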
> 2) It *does* clear the owner bit on the setup packet.
Uh... but not on any of the transmitted frames?
> 3) It fails to idle with transmit status RUNNING-CLOSE-CLEAR OWNER and
> receive status RUNNING-WAIT.
> 4) The SROM reports bogus GPR media
Yeah, well, I never pay attention to the SROM anyway. But that's another
story.
I gather then that the main problem is that it doesn't transmit. (Since
you said nothing about receive, I'll assume that works.) I don't have the
tlp driver in front of me so it's hard to say what could be wrong, but
I can tell you what I'm doing that seems to be working:
Initialization:
- Do a global reset
- Set CSR0 to 0, which Davicom claims is necessary since any other
setting may lead to instabilities.
- Set the TX threshold to the minimum (best performance) value.
- Set 'no RX CRC' bit.
- Initialize descriptor queues. I use chained descriptor mode only,
since the same driver needs to support the ASIX chip which only
works in chained descriptor mode. The descriptors are allocated
as a single chunk starting on a page boundary (using contigmalloc())
and split up into two arrays, one for TX and one for RX. Each
descriptor array has a companion 'chain' array which contains
the virtual mbuf pointers and a few other sundry things. The 'chain'
arrays don't need to be contigmalloc()ed. Since the descriptor
memory is page aligned, the descriptors themselves are also aligned
on 16-byte boundaries, like Davicom suggests.
(This is not how I did it originally: I defined my own descriptor
structure with the chip's descriptor structure on top and the mbuf
pointers appended to that, rounded out to some sensible size. This
is *supposed* to be ok for chained descriptor mode since the
descriptors can be anywhere as long as they don't get split over a
page boundary. However I really wanted to see what difference it
would make if I used fixed length ring mode since that allows you
to use two fragment pointers per descriptor. But I had a lot of trouble
with some chips trying to define a non-zero skip length, so I decided
to throw out the custom structure definition and just make two sets
of arrays, which would end up using the same amount of memory anyway.
Then I discovered that fixed length ring mode didn't really seem to
help performance, so I went back to chained mode but left the descriptor
allocation alone.)
- Load the physical addresses of the heads of the RX and TX descriptor
rings into 'RX ring base address' and 'TX ring base address' registers.
- Enable interrupts.
- Set the 'TX on' bit in CSR6 to enable the transmitter.
- Load the RX filter. This consumes the first descriptor in the TX
ring and leaves the chip looking at the second descriptor, which is
where the first packet will go. I program the receiver in 'hash
perfect' mode (one perfect filter entry for the station address plus
the 512-bit multicast hash table). I also set the bit in the hash
table that relates to the broadcast address to enable reception of
broadcast frames. If the IFF_PROMISC flag is set, I set the promiscuous
mode bit in CSR6, otherwise I clear it. If the IFF_ALLMULTI bit is
set, I set the 'receive all multicasts' bit in CSR6, otherwise I
clear it.
- Set the 'RX on' bit in CSR6 to enable the receiver.
- Write a value to the 'RX DMA start' register to start the RX process going.
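
The descriptor allocation step in the list above can be sketched in userland C as follows. This is an assumption-laden illustration, not the driver's code: aligned_alloc() (C11) stands in for contigmalloc(), the caller passes in a pretend physical base address where a real driver would ask the kernel for it, and the names desc, desc_block, chain_ring() and alloc_rings() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RX_CNT  4     /* hypothetical ring sizes */
#define TX_CNT  4
#define PAGE_SZ 4096

struct desc {             /* 4 DWords = 16 bytes */
	uint32_t status;
	uint32_t ctl;
	uint32_t data;
	uint32_t next;    /* physical address of the next descriptor (chained mode) */
};

struct desc_block {       /* one chunk, split into an RX and a TX array */
	struct desc rx[RX_CNT];
	struct desc tx[TX_CNT];
};

/* Link each descriptor to its successor; the last one points back to the first. */
static void
chain_ring(struct desc *ring, int cnt, uint32_t ring_phys)
{
	for (int i = 0; i < cnt; i++)
		ring[i].next = ring_phys +
		    (uint32_t)(((i + 1) % cnt) * sizeof(struct desc));
}

/* Allocate one page-aligned chunk and chain both rings inside it. */
static struct desc_block *
alloc_rings(uint32_t phys_base)
{
	struct desc_block *b = aligned_alloc(PAGE_SZ, PAGE_SZ);

	if (b == NULL)
		return NULL;
	memset(b, 0, sizeof(*b));
	chain_ring(b->rx, RX_CNT,
	    phys_base + (uint32_t)offsetof(struct desc_block, rx));
	chain_ring(b->tx, TX_CNT,
	    phys_base + (uint32_t)offsetof(struct desc_block, tx));
	return b;
}
```

Since the chunk starts on a page boundary and each descriptor is exactly 16 bytes, every descriptor in both arrays falls on a 16-byte boundary without any extra padding.
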
Performance is not stellar due to the transmit buffer coalescing, but
it gets off the ground pretty reliably.
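
For the 'hash perfect' filter step in the initialization list, setting the broadcast bit might look roughly like the sketch below. It assumes the chip uses the usual 21x4x-style hash (the low 9 bits of a little-endian CRC-32 over the 6-byte address, with no final inversion) and that each 16-bit word of the hash table sits in the low half of a 32-bit setup frame word. Both are assumptions to check against the datasheet, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 512

/* Reflected CRC-32, polynomial 0xEDB88320, initial value ~0, no final inversion. */
static uint32_t
crc32_le_sketch(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
	}
	return crc;
}

/* ASSUMED hash: low 9 bits of the CRC select one of the 512 table bits. */
static unsigned
hash_index(const uint8_t addr[6])
{
	return crc32_le_sketch(addr, 6) & (HASH_BITS - 1);
}

/* Set the table bit for addr; table holds 32 words with 16 hash bits each. */
static void
hash_set(uint32_t table[HASH_BITS / 16], const uint8_t addr[6])
{
	unsigned h = hash_index(addr);

	table[h >> 4] |= 1u << (h & 0xF);
}
```

Under these assumptions the broadcast address ff:ff:ff:ff:ff:ff hashes to bit 255, so calling hash_set() with it sets the top bit of word 15. Multicast addresses go through the same function.
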
-Bill
--
=============================================================================
-Bill Paul (212) 854-6020 | System Manager, Master of Unix-Fu
Work: wpaul@ctr.columbia.edu | Department of Electrical Engineering
Home: wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
"Mulder, toads just fell from the sky!" "I guess their parachutes didn't open."
=============================================================================