tech-userlevel archive
How Unix manages processes in userland
So, daemons - watched over or not?
There's been a philosophical split (sort of) between the BSD/v7 school of
thought and the System III/V school of thought on how the Unix system should
start up and manage userland processes where (background) daemons are concerned,
with the structure and function of process 1 (whether you call that
/{etc,sbin}/init or something else) at the center of it.
Process 1 has always been responsible for reaping processes whose parents did
not do so (e.g. processes not started by shells), but what it does with those
events beyond simple status collection has varied.
In BSD-land, derived from Version 7 Unix (or 7th Edition, if you prefer),
daemons are expected to fork into independent processes whose parents then
exit, leaving them with the default parent of process 1. Monitoring independent
daemons has been somewhat ad-hoc and messy. We've left process 1 more or less
alone in this regard: it merely collects status (with one exception).
In USG/System III/V-land, they opted to add a more general process monitoring
facility into process 1, in what's known as "inittab" (or by other names in
other variants). I'd argue: good idea, pretty terrible implementation - when
System III shipped, it was clear to me what needed to be done: the
"daemonization" routines of all daemons run by inittab needed to be removed (no
more fork/exit) so that /etc/init could properly manage those processes, and
restart them when they died (and terminate them when the system is being shut
down). I did that in the early 1980s at Dual Systems, a small mc68k Unix-box
manufacturer that was my employer, and it worked well. Had to redo all the work
for System V, alas (it always annoyed me that USG/AT&T added new facilities
like inittab and then didn't perform the necessary code rototill for the system
to properly use them), and I've never liked inittab's "run levels".
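To make the "no more fork/exit" point concrete, here is a minimal sketch (in
Python for brevity; it is not the Dual Systems code and not how any real init
is written) of what a parent gains when the daemon stays in the foreground as
its direct child: it gets the exit status and can restart the process. The
function name supervise and its parameters max_restarts and backoff are
hypothetical, chosen only for this illustration.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, backoff=0.1):
    """Run argv as a direct child and restart it each time it exits non-zero.

    Returns the list of exit statuses observed. This only works if the
    program does NOT daemonize itself (no fork/exit), because otherwise
    the status collected here is that of the short-lived parent.
    """
    statuses = []
    while True:
        child = subprocess.Popen(argv)  # child stays our direct child
        status = child.wait()           # we are told when, and how, it died
        statuses.append(status)
        # Stop on a clean exit, or once the restart budget is used up.
        if status == 0 or len(statuses) > max_restarts:
            return statuses
        time.sleep(backoff)             # crude guard against a respawn storm


if __name__ == "__main__":
    # Hypothetical stand-in for a daemon that keeps dying with status 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing, max_restarts=2, backoff=0.01))
```

A daemon that forks and lets its parent exit would make this loop see an
immediate clean exit and learn nothing about the real process, which is why the
daemonization code had to come out.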
It is important to note right here that there's one area in which both schools
of thought agreed: user login sessions initiated by getty & login on ttys
needed to be explicitly managed by process 1. One can argue that USG/System III
people merely extended that model to daemons, too.
One can also argue that BSD simply didn't change what had been inherited from
v7 Unix, and then went and did its own thing when TCP/IP (network) sessions
over telnet, rlogin, et alia, showed up. You don't have to hang getty off a pty
(unlike a tty) to accept a network user session. A good thing, too, but that's
where we get inetd(8) from.
Why isn't that inetd stuff in process 1, too? Reasonable fear of code bloat &
bugs, I suppose, and a philosophy that process 1 needs to be as simple as
possible so that it can be reasonably expected to work properly (after all, if
process 1 dies unexpectedly, all kinds of bad bad things happen).
One more important aside that we should consider: "user sessions" now come in
more flavors than a person pounding on a tty (pty) and a shell (or three):
there are FTP logins, IMAP/POP logins, and so on. There have been some attempts
at reflecting those in utmp(5) but I don't think anyone has been consistent
about it. I think we ought to tie that stuff into the basic authentication
libraries, i.e. when a user authenticates for something, if it's going to last
more than a second or three (i.e. a user is asking for a "session"), it ought
to get an entry in utmp(5) and wtmp(5) so that you can see with who(1) or w(1)
the users of the system and what sort of session they're in.
We're NetBSD - it should be easy to see what the Network users are doing in our
systems (never mind http or NFS, for now …).
End of "user session" digression - back to daemon management.
NetBSD's rc.d(8) system is great - proper dependency management, and it's easy
to manually start, stop, or restart a given daemon or service. But we totally
fall down on daemon monitoring - daemons are expected to "just work" (perfect
code!) and if they're important enough, someone will notice and manually
restart them when they die. Or not.
I've had some problems with that - named(8) likes to die on some of my systems
because it's a big, complicated beast, and the Internet now encompasses enough
of the world that just about every code path through named is being exercised
fairly regularly. Bugs are discovered quite rapidly in deployment, but not
fixed anywhere near fast enough. So, I wrote a little shell script for
cron(8) the other day to keep those daemon processes that are polite enough to
leave a PID file in /var/run alive, and after testing in my own environment, I
posted it to tech-userlevel for those who might also be having the same
problems. It's a simple, somewhat hacky patch to a design deficiency in NetBSD.
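The idea behind that kind of watchdog can be sketched as follows. This is not
the shell script I posted; it is a Python illustration of the same technique,
and the names read_pid, pid_is_alive and check_and_restart are hypothetical.
It assumes the PID file holds the daemon's PID as its first word, and it
probes the process with signal 0, which checks for existence without
delivering anything.

```python
import os
import subprocess
from pathlib import Path


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if missing or malformed."""
    try:
        pid = int(Path(pidfile).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None
    return pid if pid > 0 else None


def pid_is_alive(pid):
    """Probe a process with signal 0: nothing is delivered, only checked."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # it exists, but belongs to someone else
    return True


def check_and_restart(pidfile, restart_cmd, run=subprocess.call):
    """Run restart_cmd if the daemon named by pidfile is not running.

    Returns True if a restart was attempted, False if the daemon was alive.
    """
    pid = read_pid(pidfile)
    if pid is not None and pid_is_alive(pid):
        return False
    run(restart_cmd)
    return True
```

From cron one might call something like
check_and_restart("/var/run/named.pid", ["/etc/rc.d/named", "restart"]), where
both the PID file path and the restart command are assumptions about the local
setup. Note the weakness inherent in the approach: a stale PID file whose
number has been reused by an unrelated process looks "alive", which is one more
reason this is a patch and not a design.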
The right place to deal with all of this is in process 1. It is deemed
responsible for startup & shutdown of the system, which mode (single user,
multi-user) to run in, the secure levels (ugh), and the ultimate reaping of all
processes, so it "knows" a priori whether a daemon should be running or not,
and it can know whether the daemon actually is running, provided the
relationship between a daemon (service) and its PID is known. The trick is in
expressing what we want in some kind of configuration system, in a simple but
hopefully sufficiently rich syntax.
However, I don't like either of the two schemes I've seen to date for dealing
with the issue. I've already expressed my distaste for inittab(5) as I've seen
it (has Linux done something more sensible with it in the last many years?),
and I had a look at Apple's OS X "launchd" and I don't like it either - it
really wants to be talked to through a control program interface (launchctl,
with yet another control language to learn) rather than allowing one to simply
edit configuration files.
Worse, neither system has proper dependency management as we have in rc.d(8),
and I really, really don't want to lose that.
So, clear statement of the problem: daemons should be started and managed by
process 1 because it is in a position to monitor for their death and restart
them as necessary, and log those deaths (kern.logsigexit is OK but not really
the right thing, and I was the one who ported it from FreeBSD), but we need a
configuration system for process 1 that not merely names all the daemons
(services) to be started/stopped, but also expresses dependency for both
startup and shutdown.
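As a rough illustration of what "expresses dependency for both startup and
shutdown" could mean, here is a small Python sketch. The service table, its
names, and the functions start_order and stop_order are all hypothetical; this
is not the rc.d mechanism and not a proposal for a configuration syntax. It
only shows that one table of "requires" relationships is enough to derive both
orders, with shutdown being startup reversed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical service table: each service maps to the services it requires.
SERVICES = {
    "network": [],
    "syslogd": ["network"],
    "named": ["network", "syslogd"],
    "sshd": ["network", "named"],
}


def start_order(services):
    """Order services so that every requirement starts before its dependents.

    Raises graphlib.CycleError if the requirements are circular.
    """
    return list(TopologicalSorter(services).static_order())


def stop_order(services):
    """Shut down in reverse: dependents go away before what they rely on."""
    return list(reversed(start_order(services)))


if __name__ == "__main__":
    print("start:", start_order(SERVICES))
    print("stop: ", stop_order(SERVICES))
```

A process 1 that held such a table could also use it on restart: when a
service dies, everything that requires it is a candidate for being stopped and
brought back up in order.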
Your comments and thoughts are solicited,
Erik <fair%netbsd.org@localhost>