tech-kern: Re: Real vfork() (was: third results)

Subject: Re: Real vfork() (was: third results)
To: None <tech-kern@NetBSD.ORG>
From: Greg A. Woods <woods@most.weird.com>
List: tech-kern
Date: 04/09/1998 20:00:05
[ On Thu, April 9, 1998 at 13:38:06 (-0700), Jason Thorpe wrote: ]
> Subject: Re: Real vfork() (was: third results) 
>
> Lo and behold, it was still MUCH faster.  There are a couple of reasons
> for this:
> 
> 	(1) Copying the VM map and pmap entries is overhead.
> 
> 	(2) Most (all?!) ports don't actually implement pmap_copy(), so
> 	    they still have the page fault overhead when the child runs
> 	    again.
> 
> Basically, it's STILL wasteful if you're going to just fork/exec (the
> exec simply unmaps the address space so it can map in the new program).

But what's the ratio of overhead between these two?  Any time I see two
costs hidden under a common total, esp. when one of them could be shrunk
or even eliminated by a better or more complete implementation, I think
it's very important to separate them instead of just sweeping it all
under the rug of past practice and, esp. in this case, leaving the
burden of "doing it right" on the user-level code.

In addition, would implementing pmap_copy() help any with overall system
performance even with vfork()?  (I.e. would it have any positive impact
on an ordinary fork() that's not followed by an exec(), such as when the
shell spawns to execute a built-in?)

> So, we figured that re-enabling the old vfork(2) semantics would be
> a win.  Sure, it's a speed hack, but it's a speed hack that's been around
> for a fairly long time, and, if one knows the constraints of the interface,
> and how to use it correctly, it can be quite effective.

I've always wondered why something akin to the infamous spawn() wasn't
implemented in light of the dangers of vfork().  If we're going to give
up on making COW work nearly as efficiently for fork()+exec() as for
vfork()+exec(), then why don't we give in and follow the lead of half
the other operating systems on the planet and implement one system call
that combines both operations?  We don't have to give up fork() and
exec() [as some other OSes did]....

The behaviour of a spawn() interface is much clearer, and infinitely
safer, than the current use of vfork()+exec(), and we'd save at least a
kernel-to-user and a user-to-kernel switch as well!  Of course spawn()
would require a fairly complex API to be truly useful, but that
shouldn't be very difficult to come up with, esp. since we already have
a fair variety of programs that successfully use vfork() which we can
analyze for requirements and ideas.  One could also write an
implementation of spawn() at the user level on top of either or both of
vfork()+exec() and fork()+exec(), which would ensure better portability
for applications that used it and hide the dangers of vfork() from them.
Such an implementation may even already exist.
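
To make the user-level idea concrete, here is a minimal sketch of such a
wrapper built on vfork()+exec().  The names my_spawn() and
my_spawn_wait() are hypothetical, invented for illustration only; they
are not an existing interface, and a truly useful API would also need to
cover file descriptors, signals, the environment and so on.

```c
#define _DEFAULT_SOURCE	/* glibc only: exposes vfork() under strict -std modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * my_spawn() is a hypothetical name, not an existing system interface.
 * Starts `file` (looked up via PATH) with the given argv.
 * Returns the child's pid, or -1 if vfork() failed.
 */
pid_t
my_spawn(const char *file, char *const argv[])
{
	pid_t pid = vfork();

	if (pid == 0) {
		/*
		 * Child: may be borrowing the parent's address space,
		 * so do nothing here except exec or _exit.
		 */
		execvp(file, argv);
		_exit(127);	/* exec failed; 127 follows the shell's convention */
	}
	return pid;		/* parent: child's pid, or -1 on error */
}

/*
 * Hypothetical helper: wait for a child started by my_spawn().
 * Returns its exit status, or -1 on error or abnormal termination.
 */
int
my_spawn_wait(pid_t pid)
{
	int status;

	if (pid < 0 || waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point of the wrapper is that the delicate part (the child must not
touch anything before it execs or exits) lives in exactly one place, so
callers never see vfork() at all.  Replacing vfork() with fork() inside
my_spawn() should not change what callers observe, only the cost.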

It may not be as elegant as the separate lower primitives of fork() and
exec(), but if we're going to talk performance then we can give up at
least a wee bit of elegance.  We shouldn't have to put up with a
potentially dangerous interface at the same time.

We could even implement fork() as a special case of (or macro wrapper
around?) a Plan 9-like rfork() and thus have the full gamut of process
creation primitives from the most primitive to the most sophisticated.
Didn't someone already implement rfork() for NetBSD (though I guess it
would need to be re-done for UVM)?

-- 
							Greg A. Woods

+1 416 443-1734      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>