Subject: Re: 80Mbps routing with Micrel KS8695
To: Jesse Off <joff@embeddedarm.com>
From: Steve Woodford <scw@netbsd.org>
List: port-arm
Date: 01/17/2005 11:02:25
On Monday 17 January 2005 02:27, Jesse Off wrote:

> I notice the involuntary context switches are quite high at 265 / 20
> seconds.  IIRC, context switches are very expensive on arm due to the
> virtually-indexed L1 cache flushing.  Is the pagedaemon kernel thread
> the one needing to run this often and therefore competing with ttcp?

The cache/TLB are not flushed when context switching between a userland 
process and a kernel thread. The cache will be flushed only if switching 
to another userland process. The TLB will be flushed if switching to a 
different L1 page table.
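
As a rough illustration of the rules above, here is a deliberately
simplified toy model in C. It is not the NetBSD pmap or cpu_switch code:
`task_model`, `cpu_model`, `switch_cost` and `switch_to()` are all
hypothetical names, and the model assumes a kernel thread simply runs on
whatever userland state is already loaded.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a schedulable entity; not a NetBSD structure. */
struct task_model {
    bool is_kernel_thread;   /* true: never touches userland mappings */
    const void *l1_table;    /* L1 page table used by a userland process */
};

/* Hypothetical per-CPU bookkeeping for the model. */
struct cpu_model {
    const struct task_model *cache_owner; /* last userland process that ran */
    const void *active_l1;                /* L1 page table currently loaded */
};

struct switch_cost {
    bool cache_flush;
    bool tlb_flush;
};

static struct switch_cost
switch_to(struct cpu_model *cpu, const struct task_model *next)
{
    struct switch_cost cost = { false, false };

    /* Assumption: kernel threads leave the loaded userland state alone. */
    if (next->is_kernel_thread)
        return cost;

    /* Switching to another userland process: flush the cache. */
    if (cpu->cache_owner != NULL && cpu->cache_owner != next)
        cost.cache_flush = true;
    cpu->cache_owner = next;

    /* Switching to a different L1 page table: flush the TLB. */
    if (cpu->active_l1 != next->l1_table) {
        cost.tlb_flush = true;
        cpu->active_l1 = next->l1_table;
    }
    return cost;
}
```

In this model, bouncing between ttcp and a kernel thread such as the
pagedaemon costs neither a cache flush nor a TLB flush; only a switch to
a different userland process does.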

> Changes I made:
>   * bypass most dmamap_sync() and use DMA_COHERENT mappings

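For descriptor memory, this is actually a good idea on non-cache-coherent 
platforms anyway, especially if individual descriptors are smaller than 
a cache line: syncing one descriptor then also writes back or invalidates 
whichever neighbouring descriptors share that line.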
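
To make the cache-line point concrete, here is a small sketch in C. The
numbers are assumptions for illustration only: a 32-byte cache line and a
hypothetical 16-byte descriptor (`struct dma_desc` is not the real
KS8695 layout), with the ring assumed to start on a cache-line boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 32u   /* assumed cache line size in bytes */

/* Hypothetical 16-byte DMA descriptor, for illustration only. */
struct dma_desc {
    uint32_t status;
    uint32_t ctrl;
    uint32_t buf_addr;
    uint32_t next;
};

/* Index of the cache line holding the first byte of descriptor i,
 * assuming the ring itself starts on a cache-line boundary. */
static size_t
cache_line_of(size_t i)
{
    return (i * sizeof(struct dma_desc)) / CACHE_LINE_SIZE;
}

/* True if descriptors i and j live in the same cache line. */
static bool
descs_share_line(size_t i, size_t j)
{
    return cache_line_of(i) == cache_line_of(j);
}
```

With these sizes, descriptors 0 and 1 share a line, so a writeback or
invalidate aimed at descriptor 0 also hits descriptor 1, which the
device may have updated in the meantime. An uncached (coherent) mapping
of the ring sidesteps that.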

> Needless to say, I was a little disappointed to only get 350KB/s
> extra,

If you examine the object code for routines such as ip_input() et al, 
you'll see why. Nearly all the important network data structures have 
__attribute__((__packed__)) qualifiers, which causes gcc to emit 
bytewise loads/stores for all such structure members on architectures, 
like ARM, which don't support misaligned accesses.

The result is that we execute many more instructions than necessary, Just 
In Case one of the structures happens to be misaligned...
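
A minimal sketch of the effect, using a made-up header rather than any
real network structure (`hdr_natural`, `hdr_packed`,
`hdr_packed_aligned` and the accessors are hypothetical names). Every
member is naturally aligned, so packing does not change the layout; it
only lowers the alignment gcc is allowed to assume. The third variant
shows one possible mitigation with gcc's `__aligned__` attribute; that
is an aside, not something proposed above.

```c
#include <stdint.h>

/* Hypothetical header: all members already sit on natural boundaries. */
struct hdr_natural {
    uint16_t len;
    uint16_t id;
    uint32_t src;
};

/* Same layout, but gcc may now only assume 1-byte alignment. */
struct hdr_packed {
    uint16_t len;
    uint16_t id;
    uint32_t src;
} __attribute__((__packed__));

/* Packed layout, with the 4-byte alignment promise restored. */
struct hdr_packed_aligned {
    uint16_t len;
    uint16_t id;
    uint32_t src;
} __attribute__((__packed__, __aligned__(4)));

/* Can be compiled to a single word load. */
static uint32_t
src_natural(const struct hdr_natural *h)
{
    return h->src;
}

/* On a strict-alignment target this may become byte loads plus shifts. */
static uint32_t
src_packed(const struct hdr_packed *h)
{
    return h->src;
}
```

Comparing the output of `gcc -S` for the two accessors on an ARM target
is a quick way to see the difference in instruction count.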

Cheers, Steve