Subject: Re: Real vfork() (was: third results)
To: None <tech-kern@NetBSD.ORG>
From: Eduardo E. Horvath <eeh@one-o.com>
List: tech-kern
Date: 04/07/1998 08:39:06
On Tue, 7 Apr 1998, Greg A. Woods wrote:

> [ On Mon, April 6, 1998 at 19:19:00 (-0000), jiho@postal.c-zone.net wrote: ]
> > Subject: Re: Real vfork()  (was: third results)
> >
> > > I remember doing exactly the opposite and tracing through the 0.9 and
> > > 1.0 sources to find out why vfork() wasn't doing as it was documented to
> > > do (and as I'd always observed it to do on 4.3BSD and even SunOS-4)!  ;-)
> > 
> > This is something I'm scratching my head about now.
> > 
> > The CSRG 4.4BSD book has murky grumbling about programs exploiting the vfork()
> > shared vmspace showing "bad programming practice", or some such.  That's the
> > only clue I've seen.
> 
> You mean the bit in section 5.6 where they discuss that even though
> vfork() is likely to remain more efficient than any copy-on-write scheme
> the CSRG folks took it out and simply implemented it as fork() because
> it is bad programming practice to allow a child process to modify the
> parent's address space?  Yes of course some programs did do this and
> naturally they broke when compiled on original 4.4BSD.
> 
> So, first people grumbled when vfork() was implemented (though not too
> loudly though since before copy-on-write it was a tremendous performance
> gain!), then they grumbled when they made vfork() behave ala fork()
> because they'd written in dependencies on vfork() semantics, and now
> some even grumble again when vfork() no longer behaves exactly as fork()
> and we're forced to have a magic wrapper so only newly compiled programs
> get the new/old vfork() semantics.
> 
> Unfortunately the following bug entry was removed from vfork(2) when the
> "proper" semantics were implemented.
> 
>   BUGS
>      This system call will be eliminated when proper system sharing mechanisms
>      are implemented.  Users should not depend on the memory sharing semantics
>      of vfork as it will, in that case, be made synonymous to fork.
> 
> I suppose whomever removed that paragraph will argue that there never
> will be proper system sharing mechanics in NetBSD, but I wouldn't be so
> quick to say that.  I will agree that it'll be unlikely for there ever
> to be something faster than vfork().
> 
> Even worse the new manual page tries to tell history stories and fails
> entirely to define the exact semantics of vfork().  (It also contains
> what seems to be a typo: s/paged/swapping/)
> 
> The fact that someone thought that the copy-on-write semantics were
> sufficiently "proper" in 4.4BSD to eliminate vfork() is quite telling.

I don't think that was the issue at all.  If you poke around through some
of the murky bits of the CSRG 4.4BSD book you will notice they claim to be
ready for threads (lightweight processes) and would have implemented them
if libc supported them.  They broke up the proc and u structures
precisely to support kernel threads.  They just didn't have time to
complete the process.  (No, I can't cite page numbers 'cause I don't
have my book with me.)

<wild hypothesis mode>

I would presume the plan was to follow in the footsteps of Plan 9 and
provide a multi-grained fork() or separate klwp_fork() and va_fork() system
calls.  vfork() can be trivially implemented using some sort of lwp_fork()
and some annoying user-level lock/semaphore semantics, in theory.  In
practice you have the problem of an atomic unlock/exec operation.  

If you forget about locking and use plain klwp_fork() semantics you might
get an even larger performance gain over vfork().  In general, after a
fork()/vfork() the parent process will immediately wait().  With the old
fork() semantics you spend all that time generating a new process and fire
it up, but it may be preempted before it does the exec(), so you switch
back to the parent process, which does a wait, then you switch to the
child once again.  4 context switches.  With vfork() you block the parent
and switch to the child, the child exec()s, eventually you switch to the
parent, which wait()s and you switch back to the child.  1 lightweight
context switch (no address space change) and 3 normal context switches.
With klwp_fork() (depending on the semantics) the child is created but
the parent continues executing until it wait()s, then the child starts up
and exec()s, resulting in only one lightweight context switch (and the
exec() itself which doesn't count).  
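
For reference, the vfork()/exec()/wait() sequence counted above looks
roughly like this in C.  This is only a sketch: spawn_and_wait() is a
made-up helper name, not anything from the NetBSD tree or the CSRG book,
and it assumes the child does nothing except exec or _exit while it is
borrowing the parent's address space.

```c
/* glibc only: expose vfork() under strict -std modes; harmless elsewhere. */
#define _DEFAULT_SOURCE

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * spawn_and_wait() is a hypothetical helper, used here for illustration.
 * It runs `path` with `argv` and returns the child's exit status,
 * 127 if the exec failed, or -1 on any other failure.
 */
int
spawn_and_wait(const char *path, char *const argv[])
{
	int status;
	pid_t pid;

	pid = vfork();
	if (pid == -1)
		return -1;

	if (pid == 0) {
		/*
		 * Child: with real vfork() semantics this runs in the
		 * parent's address space while the parent is blocked,
		 * so it must only exec or _exit.
		 */
		execv(path, argv);
		_exit(127);	/* exec failed */
	}

	/* Parent: resumes once the child has exec'd or exited. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with "/bin/sh" and an argv of { "sh", "-c", "exit 3", NULL }, this
should hand back 3.  Swapping vfork() for fork() leaves the result the
same and only changes how the switches between parent and child play out.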

It would be interesting to finish the CSRG work and implement true kernel
threads to see how they work.  It would make real thread libraries easier
to implement and could result in a truly threaded kernel.

</wild hypothesis mode>

But that's all water under the bridge now that CSRG has been shut down.

=========================================================================
Eduardo Horvath				eeh@one-o.com
	"I need to find a pithy new quote." -- me