netbsd-help: Re: got drivers?

<netbsd@sopwith.solgatos.com>

Dieter wrote:
> In message <20050123142237.GA3535@antioche.eu.org>, Manuel Bouyer writes:
[ ... ]
> I added some debugging output, and it appears to me that the driver thinks my
> boards don't have store and forward mode:
> 
>      de1: abnormal interrupt: transmit underflow csr=0x20 mask=0x20
>      sc->tulip_flags=0x8140100 sc->tulip_cmdmode=0x2e022 sc->tulip_features=0x3200
> 
>>BTW, if you're using the de driver, I guess you're
>>running a quite old release. Maybe the new tlp driver would switch to
>>store and forward mode automatically in such situation.
> 
> I'm running 1.6.2.  The tlp driver was at least as bad, and seemed worse.
> It seemed like more windows timed out and went away with tlp.
> With de, windows freeze up awhile, but usually recover after the wd i/o stops.
> Windows only rarely time out and go away with de.

Between these two remarks, I would start to wonder whether one of your cards 
is failing, or whether you've got some kind of PCI interrupt contention or 
storm causing problems.

> Is there a "PCI latency for dummies" README/HOWTO/FAQ somewhere?

Well, the following is apropos:

-- 
-Chuck

-------- Original Message --------
Subject: Re: NIC card problems....
Date: Mon, 24 Jan 2005 00:27:38 +0100
From: Stefan Eßer <se@FreeBSD.org>
To: Net Virtual Mailing Lists <mailinglists@net-virtual.com>
CC: freebsd-stable@freebsd.org
References: <20050123135753.14384@mail.net-virtual.com>

On 2005-01-23 05:57 -0800, Net Virtual Mailing Lists 
<mailinglists@net-virtual.com> wrote:
 > My latest problem is with a:
 >
 > dc0: <ADMtek AN985 10/100BaseTX> port 0xe800-0xe8ff mem 0xe6080000-
 > 0xe60803ff irq 11 at device 10.0 on pci0
 > dc0: Ethernet address: 00:0c:41:ed:ae:85
 >
 > ...  after several hours of *HEAVY* (I'm probably understating this)
 > utilization I get:
 >
 > dc0: TX underrun -- increasing TX threshold
 > dc0: TX underrun -- increasing TX threshold
 > .. repeats numerous times..
 > dc0: TX underrun -- using store and forward mode

Well, that's nothing to worry too much about ...

The device has a data FIFO that is filled with data words fetched
by its bus-master DMA engine via the PCI bus. As an optimization,
sending of the Ethernet frame may optionally start before all data
for the frame has been put into the FIFO. Normally, data is fetched
faster by DMA than it is sent by Ethernet, but if there is a significant
number of simultaneous PCI data transfers by other devices, there is
the risk that the FIFO runs out of data (buffer underflow). In such
a case, the current Ethernet frame can't be finished (dummy data and
a bogus CRC are added to make the receiving party discard the frame).
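
To make the underflow condition concrete, here is a rough Python sketch
of the worst case: the bus stalls right after transmission has started
with exactly the threshold amount in the FIFO. The function name, the
single-stall model and the 1514-byte frame size are illustrative
assumptions, not taken from any driver.

```python
# Fast Ethernet drains the FIFO at 100 Mbit/s = 12.5 bytes per microsecond.
FAST_ETHERNET_BYTES_PER_US = 100e6 / 8 / 1e6


def underruns(threshold_bytes, stall_us, frame_bytes=1514):
    """Return True if the FIFO would run dry during one bus stall.

    Hypothetical worst-case model: transmission starts with exactly
    threshold_bytes buffered, and no further DMA data arrives for
    stall_us microseconds.
    """
    if threshold_bytes >= frame_bytes:
        # The whole frame is already buffered, so a stall cannot hurt.
        return False
    drained = stall_us * FAST_ETHERNET_BYTES_PER_US
    return drained > threshold_bytes


if __name__ == "__main__":
    for threshold in (72, 128, 1514):
        print(threshold, underruns(threshold, stall_us=10))
```

In this model, raising the threshold only helps until the stall gets
longer again; buffering the whole frame removes the dependency.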

If such a situation occurs, the driver will increase the amount of
data required in the FIFO before transmission of the next Ethernet
frame starts. After multiple underruns, the driver will configure
the Ethernet chip to buffer the full contents of each frame (store
and forward mode) and will avoid the early transmission start, since
the hardware apparently is not capable of providing data fast enough.
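
The escalation just described can be sketched as a small state machine.
This is only an illustration in Python, not the dc driver's code: the
class name and the threshold steps are hypothetical, and only the two
message strings are borrowed from the log quoted above.

```python
class TxThresholdPolicy:
    """Illustrative model of 'raise threshold, then store and forward'."""

    # Hypothetical FIFO fill levels (bytes) required before transmit starts.
    THRESHOLDS = (72, 96, 128, 160)

    def __init__(self):
        self.level = 0
        self.store_and_forward = False

    def on_underrun(self):
        """Handle one TX underrun and return the resulting log message."""
        if self.store_and_forward:
            return "already in store and forward mode"
        if self.level < len(self.THRESHOLDS) - 1:
            # Require more data in the FIFO before the next frame starts.
            self.level += 1
            return "TX underrun -- increasing TX threshold"
        # No threshold steps left: buffer whole frames from now on.
        self.store_and_forward = True
        return "TX underrun -- using store and forward mode"


if __name__ == "__main__":
    policy = TxThresholdPolicy()
    for _ in range(5):
        print(policy.on_underrun())
```

With four hypothetical steps, three underruns raise the threshold and
the fourth switches to store and forward mode.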

 > .. at this point the system simply reboots.  I have attempted to apply a

Then there definitely is a bug somewhere, but not necessarily in the
dc driver. Instead of switching Ethernet cards, you may want to check
the PCI performance of your mainboard. The PCI latency timers (master
latency timer and individual timers in each device) may play a role.

They determine the maximum "time slice" assigned to each bus-master
in turn. Too small a value causes the PCI bus throughput to suffer
(because a few PCI bus clocks are lost each time the next bus-master
takes over), while too high a value may cause a device to starve while
waiting to be granted access to the bus.

A master latency timer value of 32 (0x20) should keep the bus-master
switch overhead down to 20% (i.e. 80% left for data transfers) and
should keep the latency in the range of 1 microsecond per bus-master
(i.e. 5 microseconds if there are 2 Ethernet cards, 2 disk controllers
and one host bridge active at the same time). In that case, each PCI
device could expect to transfer 100 bytes every 5 microseconds. A
buffer of 128 bytes ought to suffice for a fast Ethernet card, in
that case.
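
The arithmetic behind these figures can be reproduced with a few lines
of Python. The bus parameters are assumptions on my part (conventional
PCI at 33 MHz, 32 bits wide); the latency timer value, the 20% overhead
and the five bus-masters are taken from the paragraph above.

```python
# Assumed bus parameters: conventional PCI, 33 MHz clock, 32-bit data path.
PCI_CLOCK_HZ = 33_000_000
BUS_WIDTH_BYTES = 4
# Fast Ethernet sends 100 Mbit/s = 12.5 bytes per microsecond.
FAST_ETHERNET_BYTES_PER_US = 100e6 / 8 / 1e6


def latency_budget(latency_timer=32, overhead=0.20, masters=5):
    """Estimate the time slice and data budget per bus-master."""
    clock_us = 1e6 / PCI_CLOCK_HZ
    slice_us = latency_timer * clock_us
    # Only the non-overhead clocks of a slice move data.
    bytes_per_slice = latency_timer * (1.0 - overhead) * BUS_WIDTH_BYTES
    # Worst case: every bus-master uses its full slice in turn.
    worst_wait_us = masters * slice_us
    # Bytes the Ethernet side sends while waiting for the next slice.
    drained_bytes = worst_wait_us * FAST_ETHERNET_BYTES_PER_US
    return {
        "slice_us": slice_us,
        "bytes_per_slice": bytes_per_slice,
        "worst_wait_us": worst_wait_us,
        "drained_bytes": drained_bytes,
    }


if __name__ == "__main__":
    for name, value in latency_budget().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions a slice is just under 1 microsecond and carries
about 100 bytes, while the Ethernet side sends roughly 60 bytes during a
full round of five slices, which is why 128 bytes of buffer looks
sufficient.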

But this is a simplified view and calculation. Devices may keep the
PCI bus longer than the granted time slice, if there are (what used
to be called) "wait cycles" inserted by slow devices. (And some sound
cards have been reported to cause high PCI load, even when idle.)

The TX threshold messages issued by the dc driver appear more as an
indication that the PCI bus is under severe load, than as a hint that
the dc driver is causing the reboots, IMHO.

Regards, STefan