Subject: Re: new vfork(2) implementation
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Computo Ergo Checksum <greywolf@starwolf.starwolf.com>
List: current-users
Date: 01/03/1998 20:07:19
Jason Thorpe sez:
/*
 * Hi folks,
 * 
 * Well, with NetBSD 1.3 almost out the door (the announcement should be
 * any time now..), it's time to start thinking about running -current again
 * (well, for some of us :-)
 * 
 * ...  This new implementation will have the
 * original 3BSD vfork semantics: a completely shared address space as well
 * as parent blocks for child exit (contrast that to the current NetBSD vfork
 * that came from 4.4BSD, which simply blocks the parent).
 * 
 * Originally, vfork was implemented because the address space was copied at
 * fork time.  This was a time consuming process, and a real waste if the
 * program were to immediately exec (which would then unmap the address space
 * so painstakingly copied just moments ago).  When copy-on-write was added
 * to BSD (when it got the Mach VM system)...

...which appears to be, in itself, something of a travesty...

 * ...the address space was no longer
 * copied.  Instead, the VM map entries were copied, and the pages set for
 * COW (by calling into the pmap layer to write-protect the pages).  This
 * greatly lessened the need for vfork(2), and vfork(2) was changed to merely
 * preserve the previous synchronization semantics.
 * 
 * However, copying the VM map entries, and invoking the pmap to set the
 * pages for COW is still somewhat expensive, especially for large processes
 * (which have many VM map entries, and physical pages to protect).  Also,
 * the previous shared-memory semantics were lost (which were less important,
 * since not very many programs relied on this feature).

[statistics deleted]

 * 
 * I will be allocating a new system call number and using function versioning
 * to add this call, so that old binaries that don't follow the vfork rules
 * will continue to work.  Those rules, for those who don't know, are:
 * 
 * 	(1) Must be very careful with local variables, since the child
 * 	    and the parent may end up sharing them.
 * 
 * 	(2) In the child, never call "return" from the context where the
 * 	    vfork occurred.  It will trash the parent's stack.
 * 
 * 	(3) In the child, never call exit(3) (or anything that calls exit(3)),
 * 	    because it will run the parent's exit-time cleanup functions,
 * 	    modifying the parent's address space.  Use _exit(2) instead.
 * 

For those who aren't too familiar with the semantics of vfork(2) --
or, rather, how vfork() was *supposed* to work in the first place --
the first place any problems from the version differences show up
will likely be the shells.

 */
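
For anyone who hasn't had to live by the three rules quoted above, here
is a minimal sketch of a caller that follows them.  spawn_and_wait() is
a name I made up for illustration, it is not anything from the NetBSD
tree, and the 127 exit code for a failed exec is just the usual shell
convention.  The child touches no locals, never returns from the
function that called vfork(), and bails out with _exit(2) rather than
exit(3):

```c
#define _DEFAULT_SOURCE 1	/* lets glibc declare vfork(); harmless elsewhere */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run `path' with `argv' in a vfork()ed child.
 * Returns the child's exit status, or -1 if the child could not be
 * created, could not be waited for, or did not exit normally.
 */
int
spawn_and_wait(const char *path, char *const argv[])
{
	pid_t pid;
	int status;

	pid = vfork();
	if (pid == -1)
		return -1;

	if (pid == 0) {
		/*
		 * Child.  Under shared-address-space semantics we are
		 * running on the parent's stack, so:
		 *   (1) modify no local variables,
		 *   (2) do not return from this function,
		 *   (3) use _exit(2), never exit(3), if the exec fails.
		 */
		execve(path, argv, environ);
		_exit(127);
	}

	/* Parent: resumes only after the child has exec'd or exited. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called as spawn_and_wait("/bin/sh", argv) with argv set to
{ "sh", "-c", "exit 3", NULL }, this should hand back 3.  Code that
already looks like this ought to behave the same under either vfork.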

I would like to extend my congratulations and a raucous, jovial, jesting
"It's about bloody _time_!" to Jason for undertaking vfork().  It's not
easy, but it's certainly long overdue.

Hey, Jason, could you possibly do a fork() vs. vfork() stat comparison
to see how the times stack up?  The difference appears to be something
in the area of 20-30% for old vs. new vfork(), and I'd be interested in
seeing the savings versus a real fork()...
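
If anybody wants rough numbers from their own box in the meantime,
something along these lines might do as a starting point.  It is only a
sketch: time_children() is a hypothetical name of mine, the child does
nothing but _exit(0), and wall-clock time from gettimeofday(2) is a
crude yardstick.  A tiny test program also has very few VM map entries,
so it will understate the gap that Jason describes for large processes.

```c
#define _DEFAULT_SOURCE 1	/* lets glibc declare vfork(); harmless elsewhere */
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: create, reap, and time `n' children that exit
 * immediately.  Uses vfork() if use_vfork is nonzero, fork() otherwise.
 * Returns elapsed wall-clock microseconds, or -1 on error.
 */
long
time_children(int use_vfork, int n)
{
	struct timeval start, end;
	pid_t pid;
	int i;

	if (gettimeofday(&start, NULL) == -1)
		return -1;

	for (i = 0; i < n; i++) {
		pid = use_vfork ? vfork() : fork();
		if (pid == -1)
			return -1;
		if (pid == 0)
			_exit(0);	/* child: _exit(2), per the vfork rules */
		if (waitpid(pid, NULL, 0) == -1)
			return -1;
	}

	if (gettimeofday(&end, NULL) == -1)
		return -1;

	return (long)(end.tv_sec - start.tv_sec) * 1000000L +
	    (long)(end.tv_usec - start.tv_usec);
}
```

Comparing time_children(0, 1000) against time_children(1, 1000), ideally
from a process that has first allocated and touched a good chunk of
memory, would give a first approximation of fork() vs. vfork().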



				--*greywolf;
--
				-=*=-
	"Did her eyes at the Turn of the Century tell me plainly
	 How we'll meet, how we'll love?  Oh, let life so transform me..."
		-- Anderson/Howe/White
 *YES*RUSH*police*genesis*peter_gabriel*jon_anderson*sting*jefferson_airplane*
  -> James Graham, Songwriter, Musician, Programmer and Hopeless Romantic <-
  greywolf@starwolf.com