Subject: Re: New in_cksum/in4_cksum implementation
To: Steve Woodford <scw@wasabisystems.com>
From: Chris Gilbert <chris@dokein.co.uk>
List: port-arm
Date: 09/11/2003 10:12:03
On Thu, 11 Sep 2003 09:17:27 +0100
Steve Woodford <scw@wasabisystems.com> wrote:

> Hi folks,
> 
> I've been doing some Xscale optimisation work recently for Wasabi, 
> part of which involved re-writing in_cksum/in4_cksum in assembly. 
> While the resulting code is hand-crafted for Xscale, I've added the 
> necessary tweaks to support vanilla ARM cpus too. Thanks to Chris 
> Gilbert for useful feedback on that side of things.
> 
> Benchmark tests with a gigabit ethernet card show between 7% and 29% 
> improvement in throughput, depending on data size, compared to the 
> old code (using pkgsrc/benchmarks/nttcp). I don't have a figure for 
> regular ARM cpus, since I don't have an ARM board with fast enough 
> ethernet. I'd still expect to see a bit of improvement, though.
> 
> Wasabi would like to contribute this code back to NetBSD. If there are
> no objections, I'd like to commit the attached code to the NetBSD 
> tree asap. I'd also like to see some figures from non-xscale machines 
> with decent ethernet. :)
> 
> Comments?

This has reminded me of something I was thinking of trying: adding a
define along the lines of:
CAN_DO_HALF_LOADS
so that code can tell whether ldrh instructions will work as expected.
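
As a rough illustration of how such a define might be used, here is a
minimal sketch in C rather than assembly. Only the name
CAN_DO_HALF_LOADS comes from the suggestion above; the helper
load_half(), the default value of 0, and the idea of setting the define
per platform are assumptions made for the example, not existing NetBSD
code.

```c
#include <stdint.h>

/*
 * Hypothetical feature macro: a platform header would set this to 1
 * where halfword loads (ldrh) are known to work as expected.
 * Defaulting to 0 picks the conservative byte-wise path.
 */
#ifndef CAN_DO_HALF_LOADS
#define CAN_DO_HALF_LOADS 0
#endif

/*
 * Load a 16-bit value in host byte order from p.
 * Hypothetical helper; p is assumed to be 2-byte aligned and to point
 * at 16-bit data.
 */
static inline uint16_t
load_half(const uint8_t *p)
{
#if CAN_DO_HALF_LOADS
	/* Single halfword access; the compiler may emit ldrh here. */
	return *(const uint16_t *)(const void *)p;
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	/* Two byte loads, most significant byte first in memory. */
	return (uint16_t)((p[0] << 8) | p[1]);
#else
	/* Two byte loads, least significant byte first in memory. */
	return (uint16_t)(p[0] | (p[1] << 8));
#endif
}
```

Both paths return the same value, so callers such as a checksum loop
would not need to know which one was compiled in. The byte-wise path
costs an extra load and an orr per halfword.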

Is this a sensible thing to do?

Chris