tech-net archive


Re: Dealing with M_HASFCS for protocols that do not do ethernet crc



    Date:        Wed, 10 Aug 2022 08:48:02 +0900
    From:        Chris <sekiya%netbsd.org@localhost>
    Message-ID:  <20220809234802.blogguklszwytk4m@reliant>

First, it is good that the real problem is now understood and fixed.

  | "Inside AppleTalk" lists the AppleTalk-Ethernet AARP packet format on page
  | 3-12.
  | It *does* show the data-link header and 802.2 header at the beginning of the
  | packet, but does not show FCS at the end.  The final four bytes are,
  | instead, four bytes of destination address.

That's because the FCS is not properly considered a part of the packet
(at the link layer interface).

It is not created by the sending host's software - the hardware appends it
(weird super switch exceptions apply) - and it is often (and in the past,
essentially always) removed by the receiving hardware when packets are
received.   When using 802.3 format (with a length field, rather than a
packet type) the length does not include the FCS.   On the other hand, the
rest of the frame (the header, with the addresses, etc) is part of the frame,
and is built, and consumed, by host software rather than by the hardware.

  | Martin,  would this pass muster for a pullup to the 9.x branch?

I'm not Martin, and he has already replied, though I'm not sure that his
reply explicitly addressed that question (it says "Yes" but follows that
with more words that confuse things just slightly).

(I have nothing to do with RelEng so this isn't my call) but yes,
that is exactly the kind of thing that should be pulled up (to -8 as well
assuming that the driver exists there, and has the same issue).   This is
(was) a simple driver bug (caused by unfortunately typically lousy hardware
doc), and is the very thing that pullups are designed to handle.

One more thing, to go back to one of your earlier messages, still slightly
relevant because of a comment Martin made in his reply:

martin%duskware.de@localhost said:
  |  - a driver does not have to care about ethernet packet types (unless
  |    there are driver or hardware bugs, see above) 

Earlier, when justifying the methodology of the original change (which,
had it been the right thing to do, would incidentally have been fine) you said:

sekiya%netbsd.org@localhost said:
  | Existing examples of mucking about with the packet data were:
  | * in ixp425_if_npe.c (in npe_rxdone(), line 1050, which indicated that
  |   it was okay to modify the packets in-driver

Yes, it is: the driver has the option of stripping the FCS (if the
hardware has left it there) or of setting M_HASFCS instead.   Either
way works.
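To make those two options concrete, here is a minimal stand-alone sketch -
not real NetBSD driver code; "fake_pkt" is a made-up stand-in for an mbuf,
and the comments name the real kernel operations each branch corresponds to:

```c
#include <stdbool.h>
#include <stddef.h>

#define ETHER_CRC_LEN 4

/* Hypothetical stand-in for an mbuf: just a length and a flag. */
struct fake_pkt {
	size_t len;
	bool   has_fcs;		/* stands in for M_HASFCS */
};

/* Option 1: strip the FCS in the driver before passing the packet up. */
static void
rx_strip_fcs(struct fake_pkt *p)
{
	if (p->len >= ETHER_CRC_LEN)
		p->len -= ETHER_CRC_LEN;  /* like m_adj(m, -ETHER_CRC_LEN) */
	p->has_fcs = false;
}

/* Option 2: leave the FCS in place and flag it for the upper layers. */
static void
rx_flag_fcs(struct fake_pkt *p)
{
	p->has_fcs = true;		  /* like m->m_flags |= M_HASFCS */
}
```

Either path leaves the upper layers with a consistent view of where the
frame data ends; mixing the two (stripping and also setting the flag)
would not.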

  |   {and is done unilaterally, which further
  |   indicates that this driver also doesn't work with AppleTalk}),

It doesn't mean that, but that doesn't matter now.

The more relevant example, related to Martin's comment, is this one:

  | * in dev/ic/gem.c, gem_rint(), around line 2794, where the gem driver mucks
  |   about with the checksum based on ETHERTYPE_*, which indicates that drivers
  |   are allowed to modify packets based on packet type. 

This is a whole different issue.  The designers of much modern ethernet
hardware have realised that almost all the packets it sends & receives are
IP packets, and those have both a header checksum (for IPv4 anyway) and a
payload (normally TCP or UDP) checksum as well.    Calculating those
checksums when sending, and verifying them on reception, means (for the
payload checksums anyway) reading the entire packet (and doing arithmetic
on what is found there, but that's not generally significant).   That has
to be done in the kernel, because the packet contents can't be trusted
otherwise, and normally it is the only reason the kernel has to look at the
actual packet data - normally the data simply comes from the user, gets
headers prepended, and is sent; the receiver drops the headers, then hands
the packet data to the receiving application.

On the other hand, the ethernet adaptor cannot avoid processing every
single byte in the packet, both for incoming, and outgoing packets - it
has to fetch the packet from memory (often to an internal buffer, but that
doesn't matter) and transmit the bits as voltages on the appropriate pins
(and of course, the same thing in the opposite direction when receiving).

Someone (I have no idea where this idea originated) worked out that
since the ethernet adaptor is reading the packet, the whole packet, it can
easily calculate the IP/TCP/UDP checksums, setting them on transmit, and
verifying them on receive.
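The arithmetic involved is just the standard one's-complement Internet
checksum.  A minimal host-side sketch of it (the RFC 1071 algorithm, not
any particular chip's implementation):

```c
#include <stdint.h>
#include <stddef.h>

/*
 * One's-complement Internet checksum (RFC 1071) over a byte buffer.
 * This is the same sum the offloading hardware accumulates as the
 * bytes stream through it on transmit or receive.
 */
static uint16_t
in_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	while (len > 1) {
		sum += (uint32_t)buf[0] << 8 | buf[1];
		buf += 2;
		len -= 2;
	}
	if (len == 1)			/* pad an odd trailing byte */
		sum += (uint32_t)buf[0] << 8;
	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Summing a packet that already contains a correct checksum yields 0, which
is how verification works: the receiver just sums everything and checks
for zero.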

But only for IP packets - Appletalk (Ethertalk), ISO, X.25, ... which all
have defined framing over ethernet/802.3 networks, don't contribute enough
packets (to the world, whatever they may be on one particular network) to
make it worth bothering with any of those.   (The chip vendors aren't going
to get enough extra sales to justify the expense).

That's what you observed in gem.c - dealing with the effects of that 
processing.   I'm not sure why the driver needs to - normally I'd have
thought that it would simply advertise the capability to do this up the
stack, and then if enabled, cause the hardware to do it on send &/or receive,
and on receiving, simply set the "checksum verified" bit(s) to inform the
higher layers of the stack that they don't need to repeat this.   But I know
nothing about the gem driver, or the hardware it deals with, so I cannot
say that it is doing something it shouldn't.   Assuming that the code it
is running is required, it certainly can only apply to IP packets,
not to anything else.

kre

ps: while hardware TCP/UDP (and IP) checksum offload seems like a great
way to improve network/system performance, and is for end stations, it
is absolutely the wrong thing to do on a router (or bridge, or anything
else which receives then retransmits frames).   These hardware checksum
engines set and check - and then, often, recompute and replace - the TCP
(etc) checksum every time the packet is forwarded, meaning that any packet
corruption that happens between when the packet arrives from the cable
and when it is sent again (which includes the DMA to/from host memory,
and anything - either hardware or software issues - which might happen
while the packet is temporarily resident in RAM) will not be detected by
those checksums.   (This is also why packet processing code that needs to
adjust the packet should always adjust the checksum to compensate, never
simply recompute it - however much easier the latter might seem to the
code writer.   That is, we adjust the checksum rather than recomputing it
not because it is computationally cheaper, which it almost always is, but
because recomputing it is wrong.)
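For what "adjust, never recompute" looks like in practice, here is a sketch
of the standard incremental update (RFC 1624, eqn. 3); the function name
is mine, not an existing kernel API:

```c
#include <stdint.h>

/*
 * Incremental Internet-checksum update (RFC 1624, eqn. 3):
 * given the old checksum and one 16-bit word changing from old_w
 * to new_w, compute the new checksum without re-reading the rest
 * of the packet - so corruption anywhere else in the packet is
 * still covered by the end-to-end checksum.
 */
static uint16_t
cksum_adjust(uint16_t cksum, uint16_t old_w, uint16_t new_w)
{
	uint32_t sum;

	sum  = (uint16_t)~cksum;	/* HC' = ~(~HC + ~m + m') */
	sum += (uint16_t)~old_w;
	sum += new_w;
	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

This is the kind of update a router applies when, say, decrementing the
IPv4 TTL: only the changed word enters the computation, and everything
the packet accumulated in RAM stays protected.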



