NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: port-sparc64/46260: gem0 driver fails to recover after RX overflow
The following reply was made to PR port-sparc64/46260; it has been noted by
GNATS.
From: Havard Eidnes <he%NetBSD.org@localhost>
To: gnats-bugs%NetBSD.org@localhost, jdc%coris.org.uk@localhost
Cc: port-sparc64-maintainer%netbsd.org@localhost
Subject: Re: port-sparc64/46260: gem0 driver fails to recover after RX
overflow
Date: Thu, 12 Apr 2012 09:47:19 +0200 (CEST)
> > + aprint_error_dev(sc->sc_dev,
> > + "receive error: RX no buffer space\n");
>
> I wonder if we should print out these extra diagnostics (we don't do=
it
> in other drivers). I think that we would be better off by updating =
extra
> counters, and using them as per the suggestions in "Using event coun=
ters
> with network interfaces, is there a reason they're all ifdefed out o=
f
> mainline use?" thread, starting at:
>
> http://mail-index.NetBSD.org/tech-kern/2011/12/10/msg012122.html
That's true, I added the printf()s only as a debugging aid to more
easily see if there was anything else going on which might trigger
the problem. I agree that doing printf()s on input errors is not
appropriate for production code.
> I think that the reason that you're seeing the extra resets is the d=
ifference
> between checking GEM_MAC_RX_OVERFLOW in gem_intr():
>
> > if (rxstat & GEM_MAC_RX_OVERFLOW) {
> > ifp->if_ierrors++;
> > + aprint_error_dev(sc->sc_dev,
> > + "receive error: RX overflow\n");
> > gem_reset_rxdma(sc);
>
> but checking GEM_MAC_STATE_OVERFLOW in gem_rx_watchdog():
>
> > + if ((state & GEM_MAC_STATE_OVERFLOW) =3D=3D GEM_MAC_STATE_OVERFL=
OW &&
Well, actually, no... If GEM_MAC_RX_OVERFLOW is flagged, it appears
that in my case a gem_reset_rxdma() is *not* sufficient to kick the
receiver back to life. Also, the GEM_MAC_STATE_OVERFLOW test in the
gem_rx_watchdog() function in my case never kicks in, so that's why
I added the extra code in the else clause, doing gem_init() there as
well, doing a full unconditional reset in the watchdog function.
Each and every time this has happened in my case, it's always been
the code in the else clause in gem_rx_watchdog() which has kicked
in.
> GEM_MAC_RX_OVERFLOW in gem_intr(). Maybe we can check GEM_MAC_STATE=
_OVERFLOW
> if GEM_MAC_RX_OVERFLOW is set and fire the callout only then. Alter=
natively,
> we might only need to check GEM_MAC_STATE_OVERFLOW (and not bother w=
ith
> GEM_MAC_RX_OVERFLOW at all).
The change I added is adapted from the OpenBSD driver, from this
diff:
http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/ic/gem.c.diff?r1=3D1.=
88;r2=3D1.89;f=3Dh
Do we have any documentation anywhere which douments the bit fields
in the GEM_MAC_MAC_STATE register? In my case I always read back
0x10400.
However, what worries me is the ease with which this problem can now
be triggered. It doesn't take particularly heavy network traffic to
make it happen. And, furthermore, this appears to be a regression
compared to the release I was running earlier, 4.0.1.
Regards,
- H=E5vard
Home |
Main Index |
Thread Index |
Old Index