Subject: Re: But why?
From: Chris Torek <torek@BSDI.COM>
List: tech-kern
Date: 10/23/1996 19:10:22

This message is something of a hodge-podge of thoughts.

Benchmarks are useful because they give you a consistent measure.
Benchmarks are harmful, however, when the measure they give you is
not a measure of `real' performance on `real' applications.
Unfortunately, `real' applications (a) vary from one person to
the next and (b) rarely work well as benchmarks.

One problem with optimizing system calls in general is that only
benchmarks spend a large fraction of time making repeated getpid()
calls, and speeding up such a benchmark is not useful.  On the
other hand, applications that are important to someone *do* spend
a lot of time making, say, read() or write() calls -- and making
getpid() faster also makes those faster.  The question (for which
I do not have the answer) is, how *much* faster, and should the
effort be put into the syscall stub, or into the path within the
file system read() call?  The time for a read() may turn out to be
dominated by byte copies that could be eliminated entirely via
page-mapping (e.g., replace the user's buffer pages with COW pages
that alias the buffer cache).
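
One rough way to start on the "how much faster" question is to time
a bare getpid() loop against a read() loop and compare the per-call
averages. The sketch below is only that, a sketch: the helper names
(now_ns, getpid_ns, read_ns, report) are made up for illustration, it
assumes a POSIX system with /dev/zero and clock_gettime(), and it says
nothing about where inside read() the time goes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Monotonic wall-clock time in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e9 + ts.tv_nsec;
}

/* Average cost of one getpid() call, as a stand-in for bare syscall
 * overhead.  Caveat: some C libraries have cached the pid in user
 * space, in which case this times a function call, not a kernel entry.
 * Returns -1.0 if iters is not positive. */
double getpid_ns(long iters)
{
    volatile pid_t sink;
    if (iters <= 0)
        return -1.0;
    double t0 = now_ns();
    for (long i = 0; i < iters; i++)
        sink = getpid();
    (void)sink;
    return (now_ns() - t0) / iters;
}

/* Average cost of one read() of len bytes from /dev/zero.
 * Returns -1.0 on bad arguments or on any open/read failure. */
double read_ns(long iters, size_t len)
{
    static char buf[65536];
    if (iters <= 0 || len == 0 || len > sizeof buf)
        return -1.0;
    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0)
        return -1.0;
    double t0 = now_ns();
    for (long i = 0; i < iters; i++) {
        if (read(fd, buf, len) != (ssize_t)len) {
            close(fd);
            return -1.0;
        }
    }
    double elapsed = now_ns() - t0;
    close(fd);
    return elapsed / iters;
}

/* Print both averages and the ratio between them. */
void report(long iters, size_t len)
{
    double g = getpid_ns(iters), r = read_ns(iters, len);
    if (g < 0.0 || r < 0.0) {
        puts("benchmark unavailable");
        return;
    }
    printf("getpid: %.0f ns/call  read(%zu): %.0f ns/call  ratio: %.2f\n",
           g, len, r, g / r);
}
```

Calling report(1000000, 4096) from main() prints both averages.  The
numbers depend on the machine, kernel, and C library, and /dev/zero
involves no file system at all, so this is exactly the kind of
benchmark the first paragraph warns about: consistent, but not `real'.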

Van and I objected to the RPC-ish VFS interface that sits inside
Lite2-derived systems, but we lost that particular battle.  If
someone out there could measure the actual time cost of that
interface vs. a normal call interface, on some `realistic' benchmark
(about which one can also argue all day), that might be helpful.
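
For anyone who wants to try that measurement, the two call styles
being compared look roughly like the following.  This is a deliberately
simplified model, not the Lite2 code: every name in it (struct vnode's
fields, read_args, fs_read, fs_read_packed, vop_read, OP_READ) is
hypothetical.  The marshaled style packs the arguments into a struct
and calls through a table of function pointers; the direct style is an
ordinary call.

```c
#include <stddef.h>

struct vnode;

/* Marshaled style: all arguments travel in one struct. */
struct read_args {
    struct vnode *vp;
    char *buf;
    size_t len;
};

typedef long (*vop_fn)(void *args);

enum { OP_READ = 0, OP_COUNT };

struct vnode {
    vop_fn *ops;        /* operations vector, indexed by OP_* */
    const char *data;   /* stand-in for file contents */
    size_t size;
};

/* Direct style: an ordinary function with ordinary arguments.
 * Copies up to len bytes and returns the number copied. */
long fs_read(struct vnode *vp, char *buf, size_t len)
{
    size_t n = len < vp->size ? len : vp->size;
    for (size_t i = 0; i < n; i++)
        buf[i] = vp->data[i];
    return (long)n;
}

/* Unpack the argument struct and forward to fs_read(). */
long fs_read_packed(void *v)
{
    struct read_args *a = v;
    return fs_read(a->vp, a->buf, a->len);
}

/* Wrapper: pack the arguments, index the vector, call indirectly. */
long vop_read(struct vnode *vp, char *buf, size_t len)
{
    struct read_args a = { vp, buf, len };
    return vp->ops[OP_READ](&a);
}
```

Both paths return the same result; the difference is the packing,
the indirect call, and the unpacking on every operation.  Timing the
two in a tight loop is easy to get wrong (the compiler may inline or
fold either path), which is one more reason a `realistic' benchmark
is what is wanted here.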

>Alan Cox just devised a way for Linux/SPARC to avoid packet copying on
>our networking stack ...

This is not a micro-optimization.  (Neither, for that matter, is
the `system calls via normal subroutine calls' trick, although that
trick is probably not the place to *start* optimizing.)  In particular,
for applications that spend all their time sending bulk network
data, eliminating these copies eliminates the place they spend most
of their time -- a network send is, or should be, dominated by the
time spent copying those bytes.
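
The arithmetic behind "not a micro-optimization" is the usual one: if
a fraction f of the run time goes to copying and the copies are
eliminated entirely, the overall speedup is 1 / (1 - f).  A minimal
sketch, with a hypothetical function name and no measured fractions
implied:

```c
/* Overall speedup when a fraction f of run time (0 <= f < 1) is
 * eliminated entirely.  Returns 0.0 for out-of-range input. */
double speedup_if_eliminated(double f)
{
    if (f < 0.0 || f >= 1.0)
        return 0.0;
    return 1.0 / (1.0 - f);
}
```

If copying were half the run time, removing it would double
throughput; at three quarters, it would quadruple it.  Removing
something that is a percent or two of the run time, by contrast,
buys a percent or two.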