Subject: Re: New in_cksum/in4_cksum implementation
To: Steve Woodford <email@example.com>
From: Chris Gilbert <firstname.lastname@example.org>
Date: 09/11/2003 10:12:03
On Thu, 11 Sep 2003 09:17:27 +0100
Steve Woodford <email@example.com> wrote:
> Hi folks,
> I've been doing some Xscale optimisation work recently for Wasabi,
> part of which involved re-writing in_cksum/in4_cksum in assembly.
> While the resulting code is hand-crafted for Xscale, I've added the
> necessary tweaks to support vanilla ARM cpus too. Thanks to Chris
> Gilbert for useful feedback on that side of things.
> Benchmark tests with a gigabit ethernet card show between 7% and 29%
> improvement in throughput, depending on data size, compared to the
> old code (using pkgsrc/benchmarks/nttcp). I don't have a figure for
> regular ARM cpus, since I don't have an ARM board with fast enough
> ethernet. I'd still expect to see a bit of improvement, though.
> Wasabi would like to contribute this code back to NetBSD. If there are
> no objections, I'd like to commit the attached code to the NetBSD
> tree asap. I'd also like to see some figures from non-xscale machines
> with decent ethernet. :)
This has reminded me of something I'd been thinking of trying: adding a
define along the lines of:
basically so that you can tell if ldrh's will work as expected.
Is this a sensible thing to do?
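
To make the idea concrete, here is a rough sketch in C of how code might
key off such a define. The macro name EXAMPLE_HAVE_HALFWORD_LOADS and both
function names are made up for illustration (they are not the define
proposed above, nor anything in the NetBSD tree), and this is a plain
reference checksum, not Steve's assembly:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical switch, named only for this example: set to 1 when
 * 16-bit (ldrh-style) loads are known to behave on the target.
 */
#ifndef EXAMPLE_HAVE_HALFWORD_LOADS
#define EXAMPLE_HAVE_HALFWORD_LOADS 0
#endif

/* Read the two bytes at p as one big-endian 16-bit value. */
static uint16_t
example_load_be16(const uint8_t *p)
{
#if EXAMPLE_HAVE_HALFWORD_LOADS
	uint16_t v;
	const uint16_t probe = 1;

	memcpy(&v, p, sizeof(v));		/* single 16-bit access */
	if (*(const uint8_t *)&probe == 1)	/* little-endian host: swap */
		v = (uint16_t)((v << 8) | (v >> 8));
	return v;
#else
	/* Fallback: two byte loads, no halfword access at all. */
	return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
#endif
}

/* One's-complement sum of 16-bit words, folded and inverted. */
static uint16_t
example_cksum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;

	while (len >= 2) {
		sum += example_load_be16(buf);
		buf += 2;
		len -= 2;
	}
	if (len == 1)
		sum += (uint64_t)buf[0] << 8;	/* pad odd byte with zero */
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

With the macro left at 0 the sketch only ever issues byte loads. Either
path should give 0x220d for the bytes 00 01 f2 03 f4 f5 f6 f7.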