Source-Changes-HG archive


[src/trunk]: src/sys/dev/pci Add a table of Rx interrupt-threshold register v...



details:   https://anonhg.NetBSD.org/src/rev/58660ac1d7f3
branches:  trunk
changeset: 559541:58660ac1d7f3
user:      jonathan <jonathan%NetBSD.org@localhost>
date:      Sat Mar 20 02:04:07 2004 +0000

description:
Add a table of Rx interrupt-threshold register values for mitigating
Rx interrupts, functions to post a request for new table entries, and
code to apply pending Rx-interrupt control values at the next hardware
interrupt.
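
A minimal user-space sketch of the post-then-apply pattern described above,
for illustration only. The names sketch_softc, sketch_post_thresh,
sketch_intr and the hw_* fields are hypothetical stand-ins, not the driver's
real structures or registers; the table values are copied from the diff
below. The real driver brackets the "post" step with splnet()/splx(), which
this sketch omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-device state (not the real bge_softc). */
struct sketch_softc {
	int rx_coal_ticks;          /* requested Rx coalescing ticks */
	int rx_max_coal_bds;        /* requested max Rx BDs per interrupt */
	int pending_rxintr_change;  /* nonzero: request not yet applied */
	int hw_rx_ticks;            /* stand-in for a hardware register */
	int hw_rx_max_bds;          /* stand-in for a hardware register */
};

struct rx_thresh {
	int rx_ticks;
	int rx_max_bds;
};

/* Same pairs as bge_rx_threshes[] in the diff. */
static const struct rx_thresh rx_threshes[] = {
	{ 32,   2 }, { 50,   4 }, { 100,  8 },
	{ 192, 16 }, { 416, 32 }, { 598, 46 }
};
#define N_RX_THRESH ((int)(sizeof(rx_threshes) / sizeof(rx_threshes[0])))

/* Clamp a requested level into the valid table range. */
int
sketch_clamp_level(int lvl)
{
	if (lvl < 0)
		return 0;
	if (lvl >= N_RX_THRESH)
		return N_RX_THRESH - 1;
	return lvl;
}

/* Post a request: save the new values and mark them pending. */
void
sketch_post_thresh(struct sketch_softc *sc, int lvl)
{
	assert(sc != NULL);
	lvl = sketch_clamp_level(lvl);
	sc->rx_coal_ticks = rx_threshes[lvl].rx_ticks;
	sc->rx_max_coal_bds = rx_threshes[lvl].rx_max_bds;
	sc->pending_rxintr_change = 1;
}

/* At the next (simulated) interrupt, apply any pending request. */
void
sketch_intr(struct sketch_softc *sc)
{
	assert(sc != NULL);
	if (sc->pending_rxintr_change) {
		sc->hw_rx_ticks = sc->rx_coal_ticks;
		sc->hw_rx_max_bds = sc->rx_max_coal_bds;
		sc->pending_rxintr_change = 0;
	}
}
```

The point of the split is that the "registers" are only written from the
interrupt path, so a request posted at an arbitrary time never touches the
hardware directly.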

As used in a third-party proprietary tree since at least March 2003.
As discussed on tech-kern/tech-net in January 2004 (in the context of
NetBSD for packet capture, bpf, and FreeBSD-style IFF_POLL), and as
posted to tech-net for comments in mid-March 2004.

Still missing sysctl or other knobs to actually change the config-time
values, due to my ignorance of any accepted per-device sysctl namespace.

diffstat:

 sys/dev/pci/if_bge.c |  106 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 104 insertions(+), 2 deletions(-)

diffs (141 lines):

diff -r 0674ba0ea830 -r 58660ac1d7f3 sys/dev/pci/if_bge.c
--- a/sys/dev/pci/if_bge.c      Sat Mar 20 01:58:51 2004 +0000
+++ b/sys/dev/pci/if_bge.c      Sat Mar 20 02:04:07 2004 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: if_bge.c,v 1.62 2004/03/20 01:58:51 jonathan Exp $     */
+/*     $NetBSD: if_bge.c,v 1.63 2004/03/20 02:04:07 jonathan Exp $     */
 
 /*
  * Copyright (c) 2001 Wind River Systems
@@ -79,7 +79,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: if_bge.c,v 1.62 2004/03/20 01:58:51 jonathan Exp $");
+__KERNEL_RCSID(0, "$NetBSD: if_bge.c,v 1.63 2004/03/20 02:04:07 jonathan Exp $");
 
 #include "bpfilter.h"
 #include "vlan.h"
@@ -125,6 +125,51 @@
 
 #define ETHER_MIN_NOPAD (ETHER_MIN_LEN - ETHER_CRC_LEN) /* i.e., 60 */
 
+
+/*
+ * Tunable thresholds for rx-side bge interrupt mitigation.
+ */
+
+/*
+ * The pairs of values below were obtained from empirical measurement
+ * on bcm5700 rev B2; they are designed to give roughly 1 receive
+ * interrupt for every N packets received, where N is, approximately,
+ * the second value (rx_max_bds) in each pair.  The values are chosen
+ * such that moving from one pair to the succeeding pair was observed
+ * to roughly halve interrupt rate under sustained input packet load.
+ * The values were empirically chosen to avoid overflowing internal
+ * limits on the bcm5700: increasing rx_ticks much beyond 600
+ * results in internal wrapping and higher interrupt rates.
+ * The limit of 46 frames was chosen to match NFS workloads.
+ * 
+ * These values also work well on bcm5701, bcm5704C, and (less
+ * tested) bcm5703.  On other chipsets, (including the Altima chip
+ * family), the larger values may overflow internal chip limits,
+ * leading to increasing interrupt rates rather than lower interrupt
+ * rates.
+ *
+ * Applications using heavy interrupt mitigation (interrupting every
+ * 32 or 46 frames) in both directions may need to increase the TCP
+ * windowsize to above 131072 bytes (e.g., to 199608 bytes) to sustain
+ * full link bandwidth, due to ACKs and window updates lingering 
+ * in the RX queue during the 30-to-40-frame interrupt-mitigation window.
+ */
+struct bge_load_rx_thresh {
+       int rx_ticks;
+       int rx_max_bds; }
+bge_rx_threshes[] = {
+       { 32,   2 },
+       { 50,   4 },
+       { 100,  8 },
+       { 192, 16 },
+       { 416, 32 },
+       { 598, 46 }
+};
+#define NBGE_RX_THRESH (sizeof(bge_rx_threshes) / sizeof(bge_rx_threshes[0]))
+
+/* XXX patchable; should be sysctl'able */
+int    bge_auto_thresh = 0;
+
 int bge_probe(struct device *, struct cfdata *, void *);
 void bge_attach(struct device *, struct device *, void *);
 void bge_release_resources(struct bge_softc *);
@@ -189,6 +234,9 @@
 
 void bge_reset(struct bge_softc *);
 
+void   bge_set_thresh(struct ifnet *  /*ifp*/, int /*lvl*/);
+void   bge_update_all_threshes(int /*lvl*/);
+
 void bge_dump_status(struct bge_softc *);
 void bge_dump_rxbd(struct bge_rx_bd *);
 
@@ -544,6 +592,60 @@
 }
 
 /*
+ * Update rx threshold levels to values in a particular slot
+ * of the interrupt-mitigation table bge_rx_threshes.
+ */
+void
+bge_set_thresh(struct ifnet *ifp, int lvl)
+{
+       struct bge_softc *sc = ifp->if_softc;
+       int s;
+
+       /* For now, just save the new Rx-intr thresholds and record
+        * that a threshold update is pending.  Updating the hardware
+        * registers here (even at splhigh()) is observed to
+        * occasionally cause glitches where Rx-interrupts are not
+        * honoured for up to 10 seconds. jonathan%netbsd.org@localhost, 2003-04-05
+        */
+       s = splnet();
+       sc->bge_rx_coal_ticks = bge_rx_threshes[lvl].rx_ticks;
+       sc->bge_rx_max_coal_bds = bge_rx_threshes[lvl].rx_max_bds;
+       sc->bge_pending_rxintr_change = 1;
+       splx(s);
+
+        return;
+}
+
+
+/*
+ * Update Rx thresholds of all bge devices
+ */
+void
+bge_update_all_threshes(int lvl)
+{
+       struct ifnet *ifp;
+       const char * const namebuf = "bge";
+       int namelen;
+
+       if (lvl < 0)
+               lvl = 0;
+       else if( lvl >= NBGE_RX_THRESH)
+               lvl = NBGE_RX_THRESH - 1;
+    
+       namelen = strlen(namebuf);
+       /*
+        * Now search all the interfaces for this name/number
+        */
+       TAILQ_FOREACH(ifp, &ifnet, if_list) {
+               if (strncmp(ifp->if_xname, namebuf, namelen) != 0 ) 
+                     continue;
+               /* We got a match: update if doing auto-threshold-tuning */
+               if (bge_auto_thresh)
+                       bge_set_thresh(ifp, lvl);
+       }
+}
+
+/*
  * Handle events that have triggered interrupts.
  */
 void


