tech-net archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: BNX driver problem when mbuf clusters run out

On Apr 20, 2012, at 9:45 AM, Matt Thomas wrote:
> The patch has been updated a bit to include a missing bus_dmamap_sync

I do believe I have tracked down the problem. I have not added the 
bus_dmamap_sync to my patch since Matt already got that one.


Here's my description:

  Problem 1:
  In bnx_rx_intr, there are two places where the producer can get pushed
  ahead of the consumer.  In the while (sw_cons != hw_cons) loop,
  there is a check for a packet whose layer 2 data has been corrupted,
  and a check for being able to allocate a new mbuf cluster to be
  placed in the receive queue.  In both cases, if the test fails,
  the mbuf cluster that has just been removed from the receive ring
  buffer is put back, and the producer index is pushed forward.  However,
  the consumer index is not pushed forward, so the next time around,
  the mbuf cluster is removed from where the consumer points, but a
  new (or recycled) mbuf cluster is put where the producer points.  This
  over writes the pointer to another mbuf cluster.  The mbuf cluster is
  lost forever, and the receive buffer ends up filled with NULLs because
  the consumer and produce indices stay out of sync.  Eventually all
  mbuf clusters are lost, and no more communication can happen over any

  To fix this, I changed the action taken in each test in case of failure.
  For the botched layer 2 data, I advance the consumer pointer, and then
  break out of the loop.  For the failure to get a new mbuf cluster,
  I removed the call to recycle the mbuf and allow the packet to be
  processed rather than end up dropped on the floor.  This now prevents
  live lock on the interface because incoming packets can still be
  processed even if there are no mbuf clusters to be had.  And because
  the packet is processed, the consumer pointer is pushed forward.
  The producer pointer stays put, because no mbuf has been put into
  place in the receiver ring buffer.

  In addition, I put a KASSERT in bnx_add_buf to check that no mbuf
  cluster are being overwritten in the receive ring.

  These fixes stopped the lossage of mbuf clusters.

  Problem 2:
  bnx_get_buf relies on the counter free_rx_bd to determine whether it
  should obtain a new mbuf cluster for the receiver ring.  bnx_rx_intr
  increments this counter before checking if there was an mbuf available
  to be removed.  Thus, this counter could end up too high, causing
  bnx_get_buf to obtain too many mbuf clusters.  Once again, we can
  leak mbuf clusters.  I moved the increment to the place where the
  mbuf cluster is actually removed from the receive ring.

  Problem 3:
  It is possible for the receive ring to completely lose all of its
  mbuf clusters in the receive ring.  If this happens, the driver is
  dependent on bnx_tick to call bnx_get_buf to refill the ring.
  bnx_tick did that, but then forgot to tell the chip about it.
  So the chip thought it didn't have any buffers to fill, so it waited.
  bnx_intr never got an indication that there were new packets to
  read because the chip wasn't loading them, and thus, bnx_rx_intr
  would never be called.  This caused an individual interface to
  freeze, but taking the interface down and up would fix the problem
  since it would resync the cpu and the chip.  Adding calls to
  resync the chip after bnx_tick succeeds in getting new mbuf clusters
  fixes the problem.

Here's my patch:

diff --git a/netbsd/src/sys/dev/pci/if_bnx.c b/netbsd/src/sys/dev/pci/if_bnx.c
index 711ec34..6fc815d 100644
--- a/netbsd/src/sys/dev/pci/if_bnx.c
+++ b/netbsd/src/sys/dev/pci/if_bnx.c
@@ -3730,6 +3730,7 @@ bnx_add_buf(struct bnx_softc *sc, struct mbuf *m_new, u_in
         * last rx_bd entry to that rx_mbuf_ptr and rx_mbuf_map matches)
         * and update counter.
+       KASSERT(sc->rx_mbuf_ptr[*chain_prod] == NULL);
        sc->rx_mbuf_ptr[*chain_prod] = m_new;
        sc->rx_mbuf_map[first_chain_prod] = sc->rx_mbuf_map[*chain_prod];
        sc->rx_mbuf_map[*chain_prod] = map;
@@ -4346,7 +4347,6 @@ bnx_rx_intr(struct bnx_softc *sc)
                /* Get the used rx_bd. */
                rxbd = &sc->rx_bd_chain[RX_PAGE(sw_chain_cons)][RX_IDX(sw_chain_
-               sc->free_rx_bd++;
                DBRUN(BNX_VERBOSE_RECV, aprint_error("%s(): ", __func__); 
                bnx_dump_rxbd(sc, sw_chain_cons, rxbd));
@@ -4393,6 +4393,7 @@ bnx_rx_intr(struct bnx_softc *sc)
                        /* Remove the mbuf from the driver's chain. */
                        m = sc->rx_mbuf_ptr[sw_chain_cons];
                        sc->rx_mbuf_ptr[sw_chain_cons] = NULL;
+                       sc->free_rx_bd++;
                         * Frames received on the NetXteme II are prepended 
@@ -4445,7 +4446,8 @@ bnx_rx_intr(struct bnx_softc *sc)
                                        panic("%s: Can't reuse RX mbuf!\n",
-                               continue;
+                               sw_cons = NEXT_RX_BD(sw_cons);
+                               break;
@@ -4459,18 +4461,6 @@ bnx_rx_intr(struct bnx_softc *sc)
                                DBRUN(BNX_WARN, aprint_debug_dev(sc->bnx_dev,
                                    "Failed to allocate "
                                    "new mbuf, incoming frame dropped!\n"));
-                               ifp->if_ierrors++;
-                               /* Try and reuse the exisitng mbuf. */
-                               if (bnx_add_buf(sc, m, &sw_prod,
-                                   &sw_chain_prod, &sw_prod_bseq)) {
-                                       DBRUNIF(1, bnx_breakpoint(sc));
-                                       panic("%s: Double mbuf allocation "
-                                           "failure!",
-                                           device_xname(sc->bnx_dev));
-                               }
-                               continue;
                        /* Skip over the l2_fhdr when passing the data up
@@ -5615,8 +5605,12 @@ bnx_tick(void *xsc)
        prod_bseq = sc->rx_prod_bseq;
        chain_prod = RX_CHAIN_IDX(prod);
        bnx_get_buf(sc, &prod, &chain_prod, &prod_bseq);
-       sc->rx_prod = prod;
-       sc->rx_prod_bseq = prod_bseq;
+       if (sc->rx_prod != prod || sc->rx_prod_bseq != prod_bseq) {
+               sc->rx_prod = prod;
+               sc->rx_prod_bseq = prod_bseq;
+               REG_WR16(sc, MB_RX_CID_ADDR + BNX_L2CTX_HOST_BDIDX, prod);
+               REG_WR(sc, MB_RX_CID_ADDR + BNX_L2CTX_HOST_BSEQ, prod_bseq);
+       }

Home | Main Index | Thread Index | Old Index