Subject: NIC driver interface to kernel.
To: None <>
From: Jochen Kunz <>
List: tech-kern
Date: 12/13/2003 21:31:26

I am working on an i82596 Ethernet driver. I know that this chip is
already supported by the i82586 driver, which puts the i82596 into
its i82586 compatibility mode. But this driver doesn't work well, at
least on hp700. The i82596 is much smarter and can do real 32 bit
DMA... I hope to get better throughput and lower CPU usage with a
real i82596 driver. Most importantly, and this is the real purpose of
this project: I will use the knowledge I gain during this project to
extend my "NetBSD Device Driver Writing Guide" with a chapter about
NIC drivers.

I had a look at various network drivers, mostly tlp(4) and ie(4).
Reverse engineering existing code seems to be the only source of
information about kernel interfaces to NIC drivers besides altq(9) and
mbuf(9)... :-( At the moment I am trying to get a clue about the stuff
that "struct ifnet" provides / needs.

E.g. there are ifnet.if_output and ifnet.if_start, which are both used
to send packets out of the interface. Most drivers use only
ifnet.if_start. Why? What is the difference?

It seems that ifnet.if_start is handed, via ifnet.if_snd, a queue of
mbuf chains, i.e. multiple packets? Always chains, never clusters?
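To check my understanding, here is a rough userland sketch of what I
think a driver's if_start does: drain one packet (one mbuf chain) at a
time from the send queue until it is empty. The mbuf and queue types
and the ifq_dequeue()/fake_start() names are simplified stand-ins of
my own, not the real kernel structures or the real IFQ_DEQUEUE()
macro:

```c
#include <assert.h>
#include <stddef.h>

struct mbuf {                   /* stand-in for the kernel struct mbuf */
	struct mbuf *m_nextpkt;     /* next packet on the queue */
	int m_pktlen;               /* total length of this packet */
};

struct ifqueue {                /* stand-in for ifnet.if_snd */
	struct mbuf *ifq_head;
};

static struct mbuf *
ifq_dequeue(struct ifqueue *q)  /* roughly what IFQ_DEQUEUE() does */
{
	struct mbuf *m = q->ifq_head;

	if (m != NULL)
		q->ifq_head = m->m_nextpkt;
	return m;
}

static int
fake_start(struct ifqueue *q)   /* shape of a driver's if_start */
{
	struct mbuf *m;
	int sent = 0;

	while ((m = ifq_dequeue(q)) != NULL) {
		/*
		 * Here the real driver would DMA-map the chain and
		 * hand it to the chip; we just count the packet.
		 */
		sent++;
	}
	return sent;
}
```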

What about if_start, if_stop, if_watchdog, etc. in "struct ifnet"? I
can only guess what they are supposed to do...

How is the data in the mbuf chain / cluster laid out? Is it raw data
as it has to be sent over the wire? (I.e. it starts with the source
and destination MAC addresses.)

The i82596 does everything via DMA. It has one Frame Descriptor per
packet and multiple Data Buffer Descriptors per Frame Descriptor. The
Data Buffer Descriptors form a linear, linked list. Each Data Buffer
Descriptor points to a buffer in memory that can hold some data and
records the length of the data in this buffer. The contents of all
memory buffers described by a Data Buffer Descriptor list are
concatenated by the i82596 and represent the data of a single packet.
This seems to fit well with the concept of mbuf chains.
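To make sure I read the manual right, here is that descriptor layout
in plain C, reduced to the fields that matter for this discussion.
The field names are my own, and the real descriptors also carry
command/status bits and live in little-endian DMA-able memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct dbd {                    /* Data Buffer Descriptor */
	struct dbd *next;           /* linear linked list */
	uint8_t *buf;               /* buffer this descriptor points to */
	size_t len;                 /* bytes of data in the buffer */
};

struct fd {                     /* Frame Descriptor: one per packet */
	struct dbd *dbd_list;       /* its Data Buffer Descriptor list */
};

/*
 * What the chip does in effect: the contents of all buffers on a
 * frame's DBD list, concatenated in order, form one packet.
 */
static size_t
frame_gather(const struct fd *f, uint8_t *out, size_t outlen)
{
	size_t off = 0;

	for (const struct dbd *d = f->dbd_list; d != NULL; d = d->next) {
		assert(off + d->len <= outlen);
		memcpy(out + off, d->buf, d->len);
		off += d->len;
	}
	return off;                 /* total packet length */
}
```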

For transmitting this would mean I can set up a Transmit Frame
Descriptor pointing to a list of Data Buffer Descriptors. The Data
Buffer Descriptors point to the data areas of the mbufs in the TX mbuf
chain. The mbufs are prepared for this via bus_dmamap_load_mbuf().
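Sketched in userland C, the mapping I have in mind is one Data Buffer
Descriptor per non-empty mbuf in the chain. In the real driver the
buffer addresses would of course come from the bus_dma segments that
bus_dmamap_load_mbuf() fills in, not straight from m_data; the types
and the tx_map() name here are simplified stand-ins of my own:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mbuf {                   /* simplified stand-in */
	struct mbuf *m_next;        /* next mbuf in this packet's chain */
	uint8_t *m_data;
	size_t m_len;
};

struct dbd {                    /* Data Buffer Descriptor */
	struct dbd *next;
	uint8_t *buf;               /* real driver: DMA segment address */
	size_t len;
};

/*
 * Build one DBD per non-empty mbuf in the chain.  Returns the
 * number of DBDs used, or -1 if the chain needs more than we have.
 */
static int
tx_map(struct mbuf *m, struct dbd *dbds, int ndbds)
{
	int n = 0;

	for (; m != NULL; m = m->m_next) {
		if (m->m_len == 0)
			continue;           /* skip empty mbufs */
		if (n == ndbds)
			return -1;          /* chain too long for our list */
		dbds[n].buf = m->m_data;
		dbds[n].len = m->m_len;
		dbds[n].next = &dbds[n + 1];
		n++;
	}
	if (n > 0)
		dbds[n - 1].next = NULL;  /* terminate the list */
	return n;
}
```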

The Frame Descriptors and Data Buffer Descriptors are preallocated at
device attachment and never released. This consumes about 32 kB of
RAM. Is preallocating 32 kB OK for a device, or should the descriptor
lists be allocated on demand? Other drivers seem to preallocate these
lists at device attachment too.

My idea for receiving frames is to preallocate NRFD Receive Frame
Descriptors and NRDBD Receive Data Buffer Descriptors per Receive
Frame Descriptor. As an mbuf can hold around 100 bytes (?) I chose
NRDBD == 16, i.e. enough space for one Ethernet packet, and NRFD ==
64, i.e. 64 packets. When the interface is brought up, NRFD * NRDBD
mbufs will be preallocated and the Receive Data Buffer Descriptors
initialized to point to these mbufs. This means that the driver will
permanently eat about 128 kB of RAM for RX mbufs. When a frame is
received, its mbuf chain is handed to the upper protocol layer and
new mbufs are allocated to replace it.
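The sizing arithmetic, written out (MSIZE == 128 and ~100 usable data
bytes per plain mbuf are my assumptions here): 16 * 100 = 1600 bytes
per frame, enough for a maximal Ethernet frame, and 64 * 16 = 1024
mbufs of 128 bytes each == 128 kB:

```c
#include <assert.h>

/* Assumed values: MSIZE of 128 bytes per mbuf, of which roughly
 * 100 are usable as data in a plain (non-cluster) mbuf. */
#define MSIZE     128
#define MBUF_DATA 100
#define ETHER_MAX 1518          /* max Ethernet frame incl. CRC */

#define NRDBD 16                /* data buffers per receive frame */
#define NRFD  64                /* receive frames */
```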

Or should I use mbuf clusters for receiving frames? Should I allocate RX
mbufs on demand?

Thanks for your time.