Subject: Re: popen reentrant (was Re: SA/pthread and vfork)
To: Matthias Buelow <mkb@mukappabeta.de>
From: Greywolf <greywolf@starwolf.com>
List: tech-kern
Date: 09/14/2003 13:01:46

Thus spake Matthias Buelow ("MB> ") sometime Today...

MB> What are these applications?  I can't right now think of any which
MB> produce a steady stream of fork/exec, enough to satisfy the CPU with
MB> only these operations...  Even huge, make-directed build jobs spend
MB> almost all of their time in compilers, or on disk i/o, etc.

What happens if you have a HUGE process which needs to run a fork()/exec()
of a program with a (probably) much smaller footprint in RAM?

[parroting what I've just learned about vfork():]
If you fork(), even if you don't touch any of the pages before calling
exec(), you still have to copy all the parent's pages into the child.
This could potentially run you out of (virtual and physical) memory.

If you vfork() (if I am understanding correctly), the child basically
borrows the parent's pages instead of getting copies of them, and has very
little state of its own -- enough to close, open and ioctl some descriptors
before calling exec() (not all of that is desirable, necessarily...).

[Side note:  This is _why_ you don't muck around with too much of the
state inside a vfork():  You can inadvertently free up some of the
resources claimed by the parent.]
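
[A minimal sketch of that discipline, as I understand it.  run_child() is
a hypothetical helper name, not anything from the tree; the only point is
that the child does nothing between vfork() and execv() except bail out
with _exit() if the exec fails.]

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with `argv` via vfork()/execv() and
 * return the child's exit status, or -1 on failure.
 */
int
run_child(const char *path, char *const argv[])
{
	int status;
	pid_t pid = vfork();

	if (pid == -1)
		return -1;		/* vfork() itself failed */

	if (pid == 0) {
		/*
		 * Child: running on borrowed memory.  Do nothing here but
		 * exec, or _exit() if the exec fails.  No return from this
		 * function, no exit(), no stdio, no malloc()/free().
		 */
		execv(path, argv);
		_exit(127);
	}

	/* Parent: resumes once the child has exec'd or exited. */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with "/bin/sh" and an argv of { "sh", "-c", "exit 3", NULL }, this
should hand back 3.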

Say you have a 400MB process.  You fork().  Congratulations.  You're now
up to 800MB (And Constantly Swapping, most likely).  You've just spent time
COPYING a 400MB footprint into memory.  Do you have that memory free?
Don't think it can't happen!

Now you've just wasted time on the copy to do an execv("/bin/csh", argv) in
the child.  /bin/csh is 428*K*, less than half a MB of memory.

By comparison, you vfork() and you end up with a significantly smaller
footprint in memory, which means less paging of stuff.  You've also
just saved a significant amount of copy time (PLUS you didn't have to
fault in all the pages of the parent to copy them).

MB> Is vfork mandated by any standards?  If not, it's my belief that it
MB> should go; maybe kept in a compatibility library for legacy
MB> applications, using ordinary fork/exec.  IMHO a small performance
MB> improvement doesn't warrant adding elements to the system API.

vfork() has been a part of the system API since 3.0BSD.  If you can
come up with a better way to do a fork()/exec() which does The Right
Thing with regard to memory usage, please do.  I'm sure there must
be one.

MB>  I don't
MB> know the original intentions but I'd guess that, when vfork() was
MB> introduced, it was meant to be a temporary hack, a workaround around
MB> the CPU bug, and not to stay forever.  It is ugly, it is inelegant, it
MB> doesn't fit at all in the API.

1.  temporary, probably;
2.  CPU bug workaround, granted;
3.  ugly, how?  It creates a small space for a specific purpose.
4.  inelegant?  Hmmm...
5.  It's been part of the API since 3.0BSD.  That's well over 20 years!

MB> And it is creating problems, as we can
MB> see in this thread and people have to spend time to support this crude
MB> hack, which doesn't add any functionality by itself, and fix problems
MB> related to it.  NetBSD aims to be consistent and clean, how does
MB> something like vfork fit in?

vfork() problems only appear to have resurfaced wrt threading of late.
It doesn't add any functionality, per se, but it does attempt to optimise
the use of resources (such as avoiding a pointless copy of something
that isn't going to be used).

vfork() has its own caveats, to be sure, and anyone doing a vfork() in
their program is (or needs to be) aware of what those problems are.

MB> At the very least it should be marked as deprecated, and it certainly
MB> should not be used for new applications (it's unportable anyways).

I disagree.

In fact, I'm wondering if there could be some way at some point to
pass an advisory flag to fork() [I think there should have been in
the first place, but I can see that the fork() semantics were needed
fairly immediately, so there was really no room for embellishment on
the basic functionality at the time, and now we're stuck with that].
Either that, or just as there is madvise(), perhaps fadvise or padvise
or something to say, "Hey, I'm not planning on using a fork() for anything
except a soon-to-follow exec()-or-_exit()."

Just a thought.
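
[For what it's worth, POSIX does define one interface that states this
intent outright: posix_spawn(), which creates the child and runs the new
program in a single call, leaving the implementation free to skip the
copy.  Nobody in this thread has proposed it and I'm assuming it is
available on your system, so take this as a sketch only.
spawn_and_wait() is a hypothetical helper name.]

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: start `path` with `argv` via posix_spawn(), wait
 * for it, and return its exit status, or -1 on failure.
 */
int
spawn_and_wait(const char *path, char *const argv[])
{
	pid_t pid;
	int status;

	/*
	 * NULL file actions and NULL attributes: the child inherits
	 * descriptors and signal state as they are.  posix_spawn()
	 * returns 0 on success and an error number otherwise.
	 */
	if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
		return -1;
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The descriptor fiddling you would otherwise do between vfork() and exec()
goes into a posix_spawn_file_actions_t instead of running as code in the
child.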

But I don't think that shared-reference fork() is at ALL out of the
question.  As far as violation of the API, to which API do you refer?

If you refer to general UNIX, I will point out that there are no two alike.
If you refer to POSIX, I'll blink and ask you if POSIX actually has an API.
If you refer to BSD, I'll tell you ... well, see above :-)

[I could be completely clueless, but I'm always learning, so be nice.]

				--*greywolf;
--
NetBSD: a devil of an operating system.