Subject: Re: Data Alignment in mbufs
To: Curt Sampson <cjs@portal.ca>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-net
Date: 05/24/1997 00:43:02
Curt Sampson <cjs@portal.ca> writes:
[alignment restrictions on Alpha, do we want alignment guarantees on mbufs?]
Executive summary:
But what *kind* of alignment guarantees do you want? From whom to
whom? For whose benefit, the driver or the networking stack?
Or if you just want to avoid odd-byte-aligned mbufs, then find
the code that's generating them and fix it.
Gory detail:
Mips CPUs have similar alignment constraints. DEC's workstation
engineers designed DMA hardware which uses, uh, quaint and picturesque
DMA padding between 32- or 64-bit host memory and 16-bit devices like
the AMD LANCE.
It's actually pretty hard to engineer alignment guarantees for MI
networking code. What's your performance metric? What do you want to
optimize? What's the net win, including the cost of meeting the
alignment constraint?
Look at IP over Ethernet. On input, the mbuf chain for a packet has to
hold the MAC header. For Ethernet, that's src address, dst address,
ethertype: 14 bytes. The IP and TCP (or UDP) code is engineered to
have headers that are a multiple of 32 bits, and the networking code
assumes that the input mbufs it gets are suitably aligned. We might
like to DMA to/from 32-bit or even 64-bit-aligned chunks; but on
input, that'll immediately cause IP and TCP to do a copy of their
headers; and on output, I think the 14-byte Ethernet header in front
of a 32-bit aligned IP/TCP or IP/UDP header is going to force another
copy.
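
To make the arithmetic concrete, here is a minimal C sketch. The helper name ip_hdr_misalign and the local ETHER_HDR_LEN definition are illustrative assumptions, not code taken from any driver; all it does is compute where the IP header lands relative to a 32-bit boundary for a given frame start offset.

```c
#include <stddef.h>

/* Defined locally for this sketch: dst (6) + src (6) + ethertype (2). */
#define ETHER_HDR_LEN 14

/*
 * Hypothetical helper: given the byte offset, within a 32-bit-aligned
 * buffer, at which the Ethernet frame starts, return how far the IP
 * header is from a 32-bit boundary. 0 means the IP header is aligned.
 */
static unsigned
ip_hdr_misalign(size_t frame_off)
{
	return (unsigned)((frame_off + ETHER_HDR_LEN) & 3);
}
```

With the frame at offset 0 (what a DMA engine that wants 32-bit alignment needs), the result is 2, so the IP header is misaligned. Starting the frame at offset 2 gives 0, but then the DMA target itself is no longer 32-bit aligned. One side or the other pays for a copy.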
I've instrumented (and tuned) the if_le_ioasic code, which has to copy
between mbuf chains and alternating 16-byte chunks of a DMA buffer. The
vast majority of the calls end up being for odd-word-aligned
data. On a mips, that nearly doubles the cost of copying to/from the
static LANCE buffer.
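
For readers who haven't seen this kind of buffer, a rough sketch of the copy follows. It assumes a layout of 16 data bytes followed by 16 bytes of padding, repeated; the name copy_to_gap16 and the CHUNK constant are made up for illustration and this is not the actual driver code. It uses memcpy for brevity, so it does not show the word-versus-halfword copy loops where the alignment cost described above actually shows up.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout: CHUNK bytes of data, then CHUNK bytes of padding. */
#define CHUNK 16

/*
 * Hypothetical sketch: copy len bytes from src into a gapped buffer,
 * starting at logical (gap-free) offset off.
 */
static void
copy_to_gap16(unsigned char *buf, size_t off, const unsigned char *src,
    size_t len)
{
	while (len > 0) {
		size_t in_chunk = off % CHUNK;	/* position inside data chunk */
		size_t n = CHUNK - in_chunk;	/* room left in this chunk */
		unsigned char *dst;

		if (n > len)
			n = len;
		/* Each 16 logical bytes occupy 32 bytes of the buffer. */
		dst = buf + (off / CHUNK) * (2 * CHUNK) + in_chunk;
		memcpy(dst, src, n);
		src += n;
		off += n;
		len -= n;
	}
}
```

Copying 20 bytes at logical offset 0 fills buffer bytes 0..15 and 32..35, leaving bytes 16..31 untouched. A driver would call something like this once per mbuf in the chain, advancing off by each mbuf's length.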
A vanishingly small fraction, less than 1 in 10000, were for
odd-byte-aligned data. You'd like to eliminate those; I'd prefer to
eliminate the odd-word alignment which is caused by the Ethernet
header. Given the 32-bit-alignment bug^H^H^H constraint on Tulip DMA,
and the extra copy that (apparently) costs the if_de driver, I'd guess
Matt Thomas might like to *relax* alignment constraints :)
There's no excuse for odd-byte-aligned mbuf data. If that's what
you're worried about, find out what's causing the odd-byte-aligned
mbufs, and fix it so it doesn't do that anymore. Add code to the
copytodev part of your driver that looks for odd-aligned mbuf data,
and dump the entire chain if you find
any. Then figure out what protocol is generating the odd-byte data,
and then look at the code for that protocol.
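
Something along these lines would do as the debugging check. This is a userland sketch under stated assumptions: struct mbuf_sketch is a hypothetical stand-in carrying only the three fields used here (named after the usual m_next, m_data, m_len), and chain_has_odd_data and dump_chain are invented names, not existing kernel functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct mbuf. */
struct mbuf_sketch {
	struct mbuf_sketch *m_next;	/* next mbuf in the chain */
	unsigned char *m_data;		/* start of this mbuf's data */
	int m_len;			/* bytes of data in this mbuf */
};

/* Return 1 if any non-empty mbuf's data starts on an odd address. */
static int
chain_has_odd_data(const struct mbuf_sketch *m0)
{
	const struct mbuf_sketch *m;

	for (m = m0; m != NULL; m = m->m_next)
		if (m->m_len > 0 && ((uintptr_t)m->m_data & 1) != 0)
			return 1;
	return 0;
}

/* Print every mbuf in the chain: address, data pointer, length, bytes. */
static void
dump_chain(const struct mbuf_sketch *m0)
{
	const struct mbuf_sketch *m;
	int i;

	for (m = m0; m != NULL; m = m->m_next) {
		printf("mbuf %p: data %p len %d:", (const void *)m,
		    (const void *)m->m_data, m->m_len);
		for (i = 0; i < m->m_len; i++)
			printf(" %02x", m->m_data[i]);
		printf("\n");
	}
}
```

In a driver, the check would sit at the top of the copy-to-device routine, calling the dump only when the check fires, so the normal path stays cheap. The dumped header bytes should be enough to identify the protocol that built the chain.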
My first guess is NFS xdr encoding. Most of the rest of the IP/TCP/UDP
code is careful to avoid such grossness.