tech-net archive

Re: Revamping optimised in_cksum/in4_cksum/in6_cksum support



Joerg Sonnenberger wrote:

> On Thu, Jan 10, 2008 at 04:12:08PM +0100, Joerg Sonnenberger wrote:
> > I would like to simplify this and implement in4_cksum and in6_cksum in C
> > and have a single MD backend function (md_in_cksum).
> 
> With a small modification (cpu_in_cksum as MD backend), the patch
> http://www.netbsd.org/~joerg/cpu_in_cksum.diff implements this. The
> portable version was tested on Sparc and Alpha. i386 and amd64 are
> converted, other platforms will need some changes.
> 
> A regression test can be found in src/regress/sys/net/in_cksum.
> 
> The existing in_cksum implementations need to be adjusted to do the
> equivalent of the first for loop in the portable code.
> 
> I would like to continue with this without compat code; the MD work is
> relatively straightforward. The MD versions should be timed. At least on
> Alpha the portable version is faster than the current MD code, and I
> wouldn't be surprised if other platforms have similar issues.

One part of this concerns me:

                while (mlen >= 32) {
                        __builtin_prefetch(data + 32);

On at least some MIPS and PowerPC implementations (and possibly others)
a prefetch of an address that isn't mapped can cause a bus error or
similar exception.  If you have a buffer that ends on a page boundary
and the next page isn't mapped, you may trip over this.

I'm not sure how to best deal with this.  Adding extra tests before
calling __builtin_prefetch may negate the benefits of that prefetch.
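
One option that might avoid a per-iteration test is to split the loop so
that the prefetch is only issued while at least 64 bytes remain; data + 32
then always points inside the buffer. The following is only a sketch under
my own assumptions: sum_blocks_guarded is a hypothetical name, the loop
body is a simplified 16-bit sum in host byte order rather than the code
from the patch, and it relies on GCC/Clang for __builtin_prefetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration, not the code from cpu_in_cksum.diff.
 * Sums a buffer as 16-bit words in host byte order, 32 bytes per
 * iteration, and never prefetches an address outside the buffer.
 */
static uint64_t
sum_blocks_guarded(const uint8_t *data, size_t mlen)
{
	uint64_t sum = 0;
	uint16_t w;
	size_t i;

	assert(data != NULL || mlen == 0);

	/* At least 64 bytes remain, so data + 32 lies inside the buffer. */
	while (mlen >= 64) {
		__builtin_prefetch(data + 32);
		for (i = 0; i < 32; i += 2) {
			memcpy(&w, data + i, sizeof(w));
			sum += w;
		}
		data += 32;
		mlen -= 32;
	}

	/* Last full block: no prefetch, data + 32 may be past the end. */
	while (mlen >= 32) {
		for (i = 0; i < 32; i += 2) {
			memcpy(&w, data + i, sizeof(w));
			sum += w;
		}
		data += 32;
		mlen -= 32;
	}

	/* Remaining 16-bit words. */
	while (mlen >= 2) {
		memcpy(&w, data, sizeof(w));
		sum += w;
		data += 2;
		mlen -= 2;
	}

	/* Trailing odd byte, treated as a word padded with a zero byte. */
	if (mlen == 1) {
		w = 0;
		memcpy(&w, data, 1);
		sum += w;
	}

	return sum;
}
```

The cost is one duplicated loop body and the loss of the prefetch for the
final block of each buffer. Whether that is cheaper than a conditional
inside the loop would have to be timed on each platform.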

Simon.


