Current-Users archive


Re: Hardware checksums on bge(4) interfaces



        Hello.  Yes, this asymmetry is intentional.  Both the Linux and FreeBSD
drivers behave this way.  My understanding is that the stack presents a
well-formed packet to the driver for transmission to the Net, including a
calculated checksum.  My analysis of what I was seeing did not indicate
that we were generating packets with bad checksums, only that we were
discarding a very high percentage of incoming packets for failing the
checksum tests.  And, in fact, the known bug is that the BCM hardware
doesn't do the pseudo-header calculations on outbound packets.  There's a
note from 2003 in our driver's history log which says that an assumption
was made that if the chip couldn't do the proper calculations on outbound
packets, then we shouldn't trust it to do the proper calculation on inbound
packets.  A fine assumption, except that over the course of time, no one
else has used this mode of operation, and, apparently, no one has tested it
either.
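
        For anyone following along who hasn't looked at the pseudo-header
calculation lately, here is a minimal sketch in C of how the work can be
split between the chip and the stack on receive.  All of the function names
are made up for illustration; none of this is taken from if_bge.c or from
the NetBSD in_cksum code, and it assumes IPv4 only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add a buffer to a running ones-complement sum as big-endian 16-bit words. */
static uint32_t
cksum_add(uint32_t acc, const uint8_t *p, size_t len)
{
    assert(len <= 0xffff);  /* keeps the 32-bit accumulator safe in this sketch */
    for (; len > 1; p += 2, len -= 2)
        acc += ((uint32_t)p[0] << 8) | p[1];
    if (len == 1)
        acc += (uint32_t)p[0] << 8;  /* odd trailing byte is padded with zero */
    return acc;
}

/* Fold the carries back in until the sum fits in 16 bits. */
static uint16_t
cksum_fold(uint32_t acc)
{
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/* IPv4 pseudo-header: source, destination, zero byte + protocol, L4 length. */
static uint32_t
phdr_sum(const uint8_t src[4], const uint8_t dst[4], uint8_t proto,
    uint16_t l4len)
{
    uint32_t acc = 0;

    acc = cksum_add(acc, src, 4);
    acc = cksum_add(acc, dst, 4);
    acc += proto;
    acc += l4len;
    return acc;
}

/* Sender: checksum over pseudo-header plus segment, checksum field zeroed. */
static uint16_t
full_cksum(const uint8_t *seg, size_t len, const uint8_t src[4],
    const uint8_t dst[4], uint8_t proto)
{
    uint32_t acc = phdr_sum(src, dst, proto, (uint16_t)len);

    return (uint16_t)~cksum_fold(acc + cksum_add(0, seg, len));
}

/* A chip that skips the pseudo-header reports the sum of the segment only. */
static uint16_t
hw_partial_sum(const uint8_t *seg, size_t len)
{
    return cksum_fold(cksum_add(0, seg, len));
}

/* The stack adds the pseudo-header; a good packet folds to all ones. */
static int
sw_verify(uint16_t hw_sum, const uint8_t src[4], const uint8_t dst[4],
    uint8_t proto, uint16_t l4len)
{
    return cksum_fold(hw_sum + phdr_sum(src, dst, proto, l4len)) == 0xffff;
}
```

        The point of the sketch is only that the two halves are additive:
whichever side leaves the pseudo-header out, the other side has to know
about it and add it in, otherwise good packets fail the test.
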
        So far, the machines running this patch look good.  I'm seeing a few
checksum failures, but given that these are name and mail servers which see
traffic from all over the Net, I'm not surprised to see some checksum
issues.  Before the patch, checksum error counters were incrementing
between 1 and 10 times a second.  Now, checksum errors are down in the noise
and match what I see on my machines with different drivers and/or with
their hardware checksum flags turned off.

-Brian

        Here is the note I referenced above.  The last paragraph is relevant
to this discussion.  The chips apparently do do the pseudo header
calculation on packet reception, even if instructed not to.  I can't tell
if the Linux driver ever had to resort to not getting pseudo data on packet
reception, but neither the Linux nor FreeBSD drivers indicate that this was
a problem on any chip revisions.


revision 1.46
date: 2003/08/22 03:32:35;  author: jonathan;  state: Exp;  lines: +93 -14
Check in hooks to fix checksum offload on bge devices. Empirical
observation is that some 570x devices can get themselves into a state
where they miscompute off-loaded TCP or UDP checksums on packets so
small that Ethernet padding is required.  Further observation suggests
that the bge checksum-offload hardware is adding those padding bytes
into its TCP checksum computation. (Once a 5700 gets in this state,
even a warm boot won't fix it: it needs a hard powerdown.)

Work around the problem by padding such runts with zeros: even if the
checksum-offload adds in extra zeros, the resulting sum will be correct.

Also, don't trust the checksum-offload on received packets smaller than
the minimum ethernet frame, in case the Rx-side has a similar bug.

Finally, on packets where we do trust the outboard Rx-side TCP or UDP
checksum, the bge did not include the pseudo-header. Set the
M_CSUM_NO_PSEUDOHDR bit as well as M_CSUM_DATA, and rely on
udp_input() or tcp_input() adding in the sum via in_cksum_phdr().
----------------------------
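        As an aside on the runt workaround in the log above, here is a
small C sketch of why zero padding is harmless to a ones-complement sum.
The names ocsum() and pad_runt() and the MIN_FRAME_LEN constant are
invented for this example and are not the routines in if_bge.c; the 60-byte
figure assumes the usual 64-byte Ethernet minimum less the 4-byte FCS, and
the sum is taken over the whole buffer purely for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_FRAME_LEN 60  /* assumed: 64-byte Ethernet minimum minus FCS */

/* Folded ones-complement sum over big-endian 16-bit words. */
static uint16_t
ocsum(const uint8_t *p, size_t len)
{
    uint32_t acc = 0;

    assert(len <= 0xffff);  /* keeps the 32-bit accumulator safe */
    for (; len > 1; p += 2, len -= 2)
        acc += ((uint32_t)p[0] << 8) | p[1];
    if (len == 1)
        acc += (uint32_t)p[0] << 8;  /* odd trailing byte, zero-extended */
    while (acc >> 16)
        acc = (acc & 0xffff) + (acc >> 16);
    return (uint16_t)acc;
}

/*
 * Zero-fill a short frame up to MIN_FRAME_LEN and return the new length.
 * cap is the size of the buffer that frame points into.
 */
static size_t
pad_runt(uint8_t *frame, size_t len, size_t cap)
{
    if (len >= MIN_FRAME_LEN)
        return len;
    assert(cap >= MIN_FRAME_LEN);
    memset(frame + len, 0, MIN_FRAME_LEN - len);
    return MIN_FRAME_LEN;
}
```

        Since every added word is zero, ocsum() over the padded buffer equals
ocsum() over the original bytes, so an offload engine that mistakenly sums
the padding still arrives at the right answer.  With stale non-zero bytes
in the padding, the two sums would generally differ.
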
On Sep 15,  8:55am, Chris Ross wrote:
} Subject: Re: Hardware checksums on bge(4) interfaces
} 
} On Sep 15, 2010, at 3:47 AM, Brian Buhrow wrote:
} >     This patch changes the driver to instruct the hardware to perform the
} > checksum over the entire packet, just as the FreeBSD and Linux drivers do,
} > and to notify the upper layers appropriately.
} >
} > Let me know what you find.
} > -Brian
} 
}   The patch, left below, looks like it removes the
} BGE_MODECTL_RX_NO_PHDR_CSUM bit, but leaves the
} BGE_MODECTL_TX_NO_PHDR_CSUM bit in place.  By reading your description,
} this may solve only half of the problem.
} 
}   I don't know, of course, but it looked worth asking about...  Did you
} mean to leave the TX side without calculating the checksum of the
} pseudo-header?
} 
}            - Chris
} 
} 
} >
} >
} > Index: if_bge.c
} > ===================================================================
} > RCS file: /cvsroot/src/sys/dev/pci/if_bge.c,v
} > retrieving revision 1.152.4.2
} > diff -u -r1.152.4.2 if_bge.c
} > --- if_bge.c        2 Feb 2009 20:44:16 -0000       1.152.4.2
} > +++ if_bge.c        15 Sep 2010 07:12:43 -0000
} > @@ -1444,7 +1444,7 @@
} >      */
} >     CSR_WRITE_4(sc, BGE_MODE_CTL, BGE_DMA_SWAP_OPTIONS|
} >                 BGE_MODECTL_MAC_ATTN_INTR|BGE_MODECTL_HOST_SEND_BDS|
} > -               BGE_MODECTL_TX_NO_PHDR_CSUM|BGE_MODECTL_RX_NO_PHDR_CSUM);
} > +               BGE_MODECTL_TX_NO_PHDR_CSUM);
} >
} >     /* Get cache line size. */
} >     cachesize = pci_conf_read(sc->sc_pc, sc->sc_pcitag, BGE_PCI_CACHESZ);
} > @@ -3276,7 +3276,7 @@
} >                         cur_rx->bge_tcp_udp_csum;
} >                     m->m_pkthdr.csum_flags |=
} >                         (M_CSUM_TCPv4|M_CSUM_UDPv4|
} > -                        M_CSUM_DATA|M_CSUM_NO_PSEUDOHDR);
} > +                        M_CSUM_DATA);
} >             }
} >
} >             /*
} 
>-- End of excerpt from Chris Ross



