tech-kern: Re: But why?

Subject: Re: But why?
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Larry McVoy <lm@neteng.engr.sgi.com>
List: tech-kern
Date: 10/24/1996 13:53:35
:  > Besides, we do not measure getpid(), we measure read()ing 1 byte from
:  > a null device.
: 
: 	- When you take a measurement of this test on different
: 	  types of OSes, you're not necessarily measuring the
: 	  same thing.  Not only are you not measuring simple
: 	  system-call overhead (which is what this whole
: 	  thing is about anyhow, right?), but different systems
: 	  may have different layering techniques which produce
: 	  `poor' results.

I'll handle this one, since I made that decision.  Many things that happen
in Unix happen in the context of a file system.  The intent of the benchmark
was to measure how long it takes to get to the point that you could start
doing something useful in the OS, i.e., you are in nfs_vnodeops.c for
example.  I should have called it the null I/O benchmark.

I didn't use getpid() because its result can be cached in user space.
Someone suggested getppid(), and I agree that is a good null system call.
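
A rough sketch of what a null system call timing loop could look like,
using getppid() as the null call.  This is not the lmbench source; the
function name null_syscall_ns, the iteration count, and the choice of
clock_gettime(CLOCK_MONOTONIC) as the timer are all assumptions made for
illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: average cost, in nanoseconds, of one getppid()
 * call measured over `iters` back-to-back calls.  The loop overhead is
 * included in the result.
 */
double null_syscall_ns(long iters)
{
    struct timespec start, end;
    long i;

    assert(iters > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iters; i++)
        (void)getppid();        /* result deliberately ignored */
    clock_gettime(CLOCK_MONOTONIC, &end);

    return ((double)(end.tv_sec - start.tv_sec) * 1e9 +
            (double)(end.tv_nsec - start.tv_nsec)) / (double)iters;
}
```

Calling null_syscall_ns(1000000) from main() and printing the result gives
a per-call figure.  If the C library ever answered getppid() without
entering the kernel, the number would say nothing about system call
entry, which is the same objection raised against getpid() above.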

: 	- Even if the results are `poor' (which, in this crowd,
: 	  is defined as `slower than Linux', I guess), your
: 	  measurement may not be an indication of detrimental
: 	  impact on the system overall.

That benchmark is remarkably similar to what stat, readdir, a small read
from the buffer/page cache, etc., do.  Systems that do well on that benchmark 
will also do well on things like repeated stats (find, NFS client, etc).
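
To make the comparison concrete, here is a hedged sketch of a repeated
stat() loop of the kind find or an NFS client generates.  The helper name
stat_ns and the clock_gettime() timer are assumptions of mine, not part
of lmbench.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: average cost, in nanoseconds, of one stat() on
 * `path` over `iters` calls.  Returns -1.0 if any stat() fails.
 */
double stat_ns(const char *path, long iters)
{
    struct stat sb;
    struct timespec start, end;
    long i;

    assert(path != NULL && iters > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iters; i++) {
        if (stat(path, &sb) != 0)
            return -1.0;        /* e.g. the path does not exist */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return ((double)(end.tv_sec - start.tv_sec) * 1e9 +
            (double)(end.tv_nsec - start.tv_nsec)) / (double)iters;
}
```

After the first call the lookup is normally served from the kernel's
caches, so the loop mostly measures the path from system call entry into
the file system code and back, not disk I/O.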

: 	- Just who reads one $#^% byte, anyhow?  System calls like
: 	  that are typically `optimized' for larger transactions,
: 	  like MAXPHYS (64k under BSD).

Yeah, that's true.  It's also a bummer.  It's cool to go fast with large
requests; I know that, SGI knows that, and it's what my part of SGI does
for a living.

It's much harder to make small transactions go fast.  Unfortunately,
while benchmarks typically show off the first case, they rarely
measure the second.  My intent with lmbench was to get latency some
much-needed respect.

Also, it wasn't one byte, it was 4 :-)

: So, you're not testing the more common case.  Now, if you were to
: test the common case, using a `NULL device' (I'm assuming /dev/null)
: isn't fair, because to implement it, all you have to do is set resid
: to 0 and return.  Indeed, to actually measure read() you have to copy
: something to the user's address space.

Yeah, if /dev/zero was everywhere, the benchmark would be 

read(devzero, buf, 4);
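
A minimal sketch of a timing loop around that 4-byte read, assuming a
POSIX system.  The helper name small_read_ns, the path argument, and the
use of clock_gettime() are illustrative assumptions, not the lmbench
implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: average cost, in nanoseconds, of one 4-byte
 * read() from `path` over `iters` calls.  Returns -1.0 if the open or
 * any read fails.
 */
double small_read_ns(const char *path, long iters)
{
    char buf[4];
    struct timespec start, end;
    long i;
    int fd;

    assert(path != NULL && iters > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iters; i++) {
        /* /dev/zero returns 4 bytes; /dev/null returns 0 (EOF). */
        if (read(fd, buf, sizeof buf) < 0) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    return ((double)(end.tv_sec - start.tv_sec) * 1e9 +
            (double)(end.tv_nsec - start.tv_nsec)) / (double)iters;
}
```

Running it once with "/dev/zero" and once with "/dev/null" shows how much
of the per-call cost is the copy into the user's buffer and how much is
just getting into and out of the kernel.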