Subject: Re: Intel i82547 performance problems in wm(4)
To: Jonathan Stone <>
From: Bill Studenmund <>
List: tech-net
Date: 07/16/2004 15:21:36

On Fri, Jul 16, 2004 at 11:57:53AM -0700, Jonathan Stone wrote:
> In message <>,
> Bill Studenmund writes:
> >My personal experience with an application that spends a lot of time
> >sending data from disk over a TCP socket is that checksum offload (IP and
> >TCP) do in fact make a big difference. It helps both with saved cycles
> >(CPU can do other things) and with reduced cache usage (as the data don't
> >have to be reloaded into the Dcache to get checksummed). If there is
> >something else the CPU can be doing, then the offload support lets it
> >effectively do two things at once.
> Bill, how much is "big"? 10%-15%, or much more than that?

More than that.

> Having spent many years measuring this and similar
> effects, with tools from ttcp throughput down to staring at PCI
> bus-analyzer traces:
> On modern machines, the *real* case when you get a significant win
> from TCP checksum offload, is when computing the TCP checksum is the
> only time the CPU actually touches the data.  In that case, moving the
> software TCP checksum to outboard hardware means the CPU *never* has
> to see or touch the TCP payload; you eliminate the off-chip activity
> necessary for the CPU to gain cache-line ownership of the I/O buffers
> holding the data.  That's usually a bigger win than the savings from
> checksumming data the CPU already touched recently (e.g., refetching a
> cache line from L2 cache to L1/registers).

Ahhh... Interesting analysis. My app is one of the ones where the only
time the CPU touches (most of) the data is when it's checksumming. :-)
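
To make the "only time the CPU touches the data" point concrete, here is
a minimal sketch of a software Internet checksum in the style of
RFC 1071. It is illustrative only, not the kernel's in_cksum(); the name
in_cksum_sketch is made up for this example. The thing to notice is that
the loop has to load every payload byte, so without offload the whole
buffer gets pulled through the cache even if nothing else on the CPU
ever looks at it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a NetBSD function: one's-complement
 * Internet checksum over a byte buffer, RFC 1071 style.
 */
uint16_t
in_cksum_sketch(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;	/* wide accumulator, folded at the end */
	size_t i;

	assert(buf != NULL || len == 0);

	/* Every payload byte is loaded by the CPU in this loop. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

	/* An odd trailing byte is padded with a zero low byte. */
	if (i < len)
		sum += (uint32_t)buf[i] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

With checksum offload the NIC does the equivalent of this loop while it
DMAs the buffer, so in a disk-to-socket path like mine the CPU never has
to load the payload at all.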

Take care,

