NetBSD-Bugs archive


port-sparc64/46260: gem0 driver fails to recover after RX overflow



>Number:         46260
>Category:       port-sparc64
>Synopsis:       gem0 driver fails to recover after RX overflow
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-sparc64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 26 22:25:00 +0000 2012
>Originator:     Havard Eidnes
>Release:        NetBSD 6.0_BETA
>Organization:
        None
>Environment:
System: NetBSD betelgeuse.urc.uninett.no 6.0_BETA NetBSD 6.0_BETA (GENERIC) #1: Mon Mar 26 20:41:19 UTC 2012 he%betelgeuse.urc.uninett.no@localhost:/usr/obj/sys/arch/sparc64/compile/GENERIC sparc64
Architecture: sparc64
Machine: sparc64
>Description:
        I've recently been upgrading a SunFire V120 from 4.0 via 5.1
        to 6.0_BETA.  The host sometimes gets significant traffic over
        gem0.  With the code in 4.0, it has been rock solid.

        However, with both 5.1 and 6.0_BETA, the gem(4) Ethernet interface
        tends to lock up.  Adding some debugging printf()s reveals that
        the error which occurs right before the interface seizes up is
        an RX overflow.  The modified code is:

...
        if (status & GEM_INTR_RX_MAC) {
                int rxstat = bus_space_read_4(t, h, GEM_MAC_RX_STATUS);
                /*
                 * At least with GEM_SUN_GEM and some GEM_SUN_ERI
                 * revisions GEM_MAC_RX_OVERFLOW happen often due to a
                 * silicon bug so handle them silently. Moreover, it's
                 * likely that the receiver has hung so we reset it.
                 */
                if (rxstat & GEM_MAC_RX_OVERFLOW) {
                        ifp->if_ierrors++;
                        aprint_error_dev(sc->sc_dev,
                            "receive error: RX overflow");
                        gem_reset_rxdma(sc);
...

        This message is printed right before the interface seizes up.
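
        The control flow of the excerpt can be modeled outside the
        kernel to make it easier to reason about.  What follows is only
        a user-space sketch, not driver code: the bit values
        SKETCH_INTR_RX_MAC and SKETCH_MAC_RX_OVERFLOW, the struct
        sketch_softc and both functions are hypothetical stand-ins, and
        the register read is replaced by a plain argument.  The extra
        reset counter is an assumption about what might be useful to
        log; it would show whether the reset path ran and the receiver
        stayed hung anyway.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical bit values for this sketch only; the real definitions
 * live in the driver's register header and are not reproduced here.
 */
#define SKETCH_INTR_RX_MAC      0x00008000u
#define SKETCH_MAC_RX_OVERFLOW  0x00000002u

struct sketch_softc {
        unsigned long ierrors;    /* stands in for ifp->if_ierrors */
        unsigned long rx_resets;  /* how often the reset path ran */
};

/* Stand-in for gem_reset_rxdma(): only records that it was called. */
static void
sketch_reset_rxdma(struct sketch_softc *sc)
{
        sc->rx_resets++;
}

/*
 * Mirrors the shape of the excerpt: act only when the interrupt status
 * flags a MAC RX event and the RX status reports an overflow.
 * Returns 1 if the reset path was taken, 0 otherwise.
 */
int
sketch_handle_rx_mac(struct sketch_softc *sc, uint32_t status, uint32_t rxstat)
{
        if ((status & SKETCH_INTR_RX_MAC) == 0)
                return 0;
        if ((rxstat & SKETCH_MAC_RX_OVERFLOW) == 0)
                return 0;

        sc->ierrors++;
        fprintf(stderr, "receive error: RX overflow (reset #%lu)\n",
            sc->rx_resets + 1);
        sketch_reset_rxdma(sc);
        return 1;
}
```

        If the reset count keeps rising while no packets arrive, that
        would suggest the reset itself does not bring the receiver
        back, which matches the symptom described here.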

>How-To-Repeat:
        Push lots of traffic through gem0 with either 5.1 or 6.0_BETA.
        Watch it seize up.
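
        One possible way to generate load from a second machine is a
        plain UDP burst aimed at the affected host.  This is a sketch
        under assumptions: the report does not say what kind of traffic
        triggers the overflow, so UDP to an arbitrary port may or may
        not reproduce it.  The function name sketch_udp_burst() and the
        SKETCH_MAX_PAYLOAD limit are made up for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* 1500-byte Ethernet MTU minus 20 bytes IPv4 and 8 bytes UDP header. */
#define SKETCH_MAX_PAYLOAD 1472

/*
 * Send `count` UDP datagrams of `payload_len` bytes to ip:port.
 * Returns the number of datagrams the local kernel accepted, or -1 if
 * the arguments are invalid or the socket cannot be created.
 */
long
sketch_udp_burst(const char *ip, unsigned short port, size_t payload_len,
    long count)
{
        struct sockaddr_in dst;
        char payload[SKETCH_MAX_PAYLOAD];
        long sent = 0;
        int s;

        if (payload_len == 0 || payload_len > sizeof(payload) || count < 0)
                return -1;

        /* Validate the destination before touching the network. */
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(port);
        if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
                return -1;

        s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s == -1)
                return -1;

        memset(payload, 0xa5, sizeof(payload));
        for (long i = 0; i < count; i++) {
                /* Count only datagrams that were accepted in full. */
                if (sendto(s, payload, payload_len, 0,
                    (struct sockaddr *)&dst, sizeof(dst)) ==
                    (ssize_t)payload_len)
                        sent++;
        }
        close(s);
        return sent;
}
```

        A call such as sketch_udp_burst("192.0.2.10", 9, 1472, 1000000)
        would send one million full-size datagrams; 192.0.2.10 is a
        documentation placeholder for the address of the host with gem0.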

>Fix:
        Workaround only: doing an "ifconfig gem0 down; ifconfig gem0 up"
        resets the interface so that it works again for a while.


