tech-userlevel: Re: A report on implementing runlevels in NetBSD

Subject: Re: A report on implementing runlevels in NetBSD
To: None <tech-userlevel@netbsd.org>
From: Greg A. Woods <woods@most.weird.com>
List: tech-userlevel
Date: 12/03/1999 23:17:42
[ On Saturday, December 4, 1999 at 11:23:15 (+1100), Giles Lean wrote: ]
> Subject: A report on implementing runlevels in NetBSD
>
> Fancier setups could be done with configurable timeouts, respawn
> limits, up/down states, logging, whatever.  I can readily imagine a
> 'start' script setting this going, and a 'stop' script shutting it
> down.  There is some prior art for this sort of thing outside of SysV
> init.

You really do need the rate limiter -- I think almost every init on more
recent systems has had such a feature, even if they only control getty
processes.
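
A minimal sketch of such a rate limiter, assuming a sliding window: allow
at most `burst` respawns in any `window` seconds.  The class name and the
default numbers are illustrative assumptions, not taken from any
particular init.

```python
import time
from collections import deque


class RespawnLimiter:
    """Allow at most `burst` respawns within any `window` seconds."""

    def __init__(self, burst=5, window=60.0, clock=time.monotonic):
        self.burst = burst      # assumed default, pick to taste
        self.window = window    # assumed default, pick to taste
        self.clock = clock      # injectable so the logic can be tested
        self.starts = deque()   # timestamps of recent respawns

    def may_respawn(self):
        now = self.clock()
        # Forget respawns that have aged out of the window.
        while self.starts and now - self.starts[0] >= self.window:
            self.starts.popleft()
        if len(self.starts) >= self.burst:
            return False        # respawning too rapidly, back off
        self.starts.append(now)
        return True
```

The caller asks may_respawn() each time a child dies and, on a refusal,
logs the fact and waits before trying again.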

> This functionality can be put into init, but it doesn't have to be.

It's a very logical extension to init's current job of re-spawning getty
processes (something that's much less important these days, I suppose).

> The second point, runlevels, is the more contentious.  I endeavour
> not to be religious about the SysV/BSD/Linux thing.  (I'm paid to
> support HP-UX. I can't afford to be religious; it's the oldest
> BSD/SysV hybrid there is, and has diverged from both as well.)

Looking at HP-UX (at least earlier releases) as an example of SysV is
like looking at Linux as an example of *BSD!  Almost the same goes for
SCO.  SunOS-5 has at least a direct lineage with SysVr4 development, but
even they didn't follow the true porting base.

> When I added runlevels to NetBSD's init I found:
> 
> (a) I had to add a kernel variable to store the runlevel
> 
>     SysV uses utmp, but in single user mode we don't have a filesystem
>     mounted read/write so that's not appropriate for us.  Adding a new
>     kernel variable isn't a real hassle (we already have securelevel,
>     for example) but it does mean that the kernel knows about
>     runlevels at least minimally.  Numerous people felt that this was
>     "unclean".  Maybe so.

SysV stores the runlevel only because they didn't bother to try and pass
the state transition information on the command-line or in an
environment variable.  There's really no need to store it someplace
semi-static if there's a well defined way of communicating it in some
other manner.

It's also stored in wtmp for logging purposes of course.
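
As a rough sketch of communicating the transition instead of storing it,
init could hand the old and new states to whatever script it runs, both
as arguments and in the environment.  The script path and the variable
names PREVLEVEL and RUNLEVEL below are assumptions made for illustration.

```python
import os


def transition_command(script, old_state, new_state, base_env=None):
    """Build argv and environment for a state-change script.

    The argument order and the variable names are illustrative choices
    for this sketch, not an existing interface.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PREVLEVEL"] = old_state
    env["RUNLEVEL"] = new_state
    argv = [script, old_state, new_state]
    return argv, env


# Hypothetical script path; a caller would then do something like
# subprocess.run(argv, env=env).
argv, env = transition_command("/etc/rc.state", "3", "2",
                               base_env={"PATH": "/bin"})
```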

> (b) there are gross discrepancies in the current implementations
> 
>     At the time I did this work (1996) I investigated Solaris 2.5 and
>     HP-UX 10.x.  I had some memories of SCO as well, but they're just
>     about gone now.  (Both SCO and my memories. :-)
> 
>     (i)   the AT&T documentation is used by both Sun and HP, and a
> 	  naive reading suggests that runlevels are in fact levels,
> 	  with a defined ordering
> 
>     (ii)  an alternative school of thought says that the runlevels
>           should be independent states, with no implied ordering
> 	  [more discussion later]

They're definitely "states", not "levels", and only one of them has a
true pre-defined meaning.

>     (iii) nobody agrees about _which_ runlevels should be available,
>           and several are taken and are not available
> 
> 	  0           shutdown
> 	  1/S         single user mode	      
> 	  2	      former default multiuser (HP-UX 9.x, SCO?)
> 	  3	      current HP-UX default runlevel
> 	  4
> 	  5	      ?? reserved by Solaris for something (or is this 4?)
> 	  6	      reboot (some OSes only, definitely SCO)
> 
>           This suggests runlevels are a really scarce resource: you get
>           the default, a couple of others, and then you're into OS
>           dependent territory again.

uh, that's a really slanted view of reality, I think.

The SysVr4 init(1m) manual page is quite explicit about what everything
means, and what state transitions are to be expected.  The SunOS-5.5
init(1m) page is a little less explicit, but not confusingly or
conflictingly so.

Even way back in 1986 or slightly before when AT&T SysVr2.2 was first
released the "modern" SysV init was pretty well fully defined and it
hasn't changed very much.  What has changed is the refinement of the
scripts called by init to control system startup and shutdown.   Even
back then run-level "2" was "usually defined by the user to contain all
of the terminal processes and daemons that are spawned in the multi-user
environment" and of course the 's' state was completely defined.

I'll re-type the guts of the init(1m) manual page (from 4.2) so that
those without access to one can see it as verbatim as my typing skills
permit.  Folks can also compare it to the SunOS-5 manual page if they
happen to have access to the latter (my editorial comments in []):

	0	Shut the machine down so it is safe to remove power.
		Have the machine remove power if it can.  [sort of like
		BSD "shutdown -p"]

	1	Put the system in system administrator mode.  All file
		systems are mounted.  Only a small set of essential
		kernel processes run.  This mode is for administrative
		tasks such as installing optional utilities packages.
		All files are accessible and no users are logged in on
		the system.  [the level-1 script ends by doing a
		"telinit s" to enter state 's']

	2	Put the system in multi-user state.  All multi-user
		environment terminal processes and daemons are spawned.

	3	Start the remote file sharing processes and daemons.
		Mount and advertise remote resources.  Run level 3
		extends multi-user mode and is known as the remote-file-
		sharing state.

	4	Define a configuration for an alternative multi-user
		environment.  This state is not necessary for normal
		system operations; it's usually not used.

	5	Stop the UNIX system and enter firmware mode.
		[i.e. equivalent to BSD "shutdown -h"]

	6	Stop the Unix system and reboot to the state defined by
		the "initdefault" entry in "inittab".  [i.e. equivalent
		to BSD "shutdown -r"]

	a,b,c	Process only those "inittab" entries for which the run
		level is set to 'a', 'b', or 'c'.  These are
		pseudo-states, which may be defined to run certain
		commands, but which do not cause the current run level
		to change.

	S,s	Enter single-user mode.  When the system changes to this
		state as the result of a command, the terminal from
		which the command was executed becomes the system
		console.  (The terminal device is linked to /dev/syscon.)

		This is the only run level that doesn't require the
		existence of a properly formatted "inittab" file.  If
		this file does not exist, then by default the only legal
		run level that "init" can enter is the single-user mode.

		The set of file systems mounted and the list of
		processes killed when a system enters the system state
		's' are not always the same; which filesystems are
		mounted and which processes are killed depends on the
		method used for putting the system into state 's' and
		the rules in force at your computer site.  The following
		paragraphs describe state 's' in three circumstances: [1]
		when the system is brought up to 's' with "init"; [2]
		when the system is brought down (from another state) to
		's' with "init"; and [3] when the system is brought down
		to 's' with "shutdown".

		[1] When the system is brought up to 's' with "init",
		the only filesystems mounted are / (root), /var, and
		/stand.  (Two filesystem types, /proc and /dev/fd, are
		also mounted.)  File systems for user's files are not
		mounted.  With the commands available on the mounted
		file systems, you can manipulate the file systems or
		transition to other system states.  Only essential
		kernel processes are kept [sic -- started!] running.

		[2] When the system is brought down to 's' with "init",
		all [currently] mounted file systems remain mounted and
		all processes started by "init" that should be running
		only in multi-user mode [i.e. other states] are killed.
		Because all login related processes are killed, users
		cannot access the system while it's in this state.  In
		addition, any processes for which the "utmp" file has an
		entry will be killed.  This last condition ensures that
		all port monitors started by the Service Access
		Controller (SAC) will be killed and all services started
		by these port monitors, including "ttymon" login
		services, will be killed.  (The SAC is a daemon that
		maintains the port monitors on a server machine in the
		state specified by the system administrator.)  Other
		processes not started directly by "init" (such as
		"cron") will remain running.

		[3] When you change to 's' with "shutdown", the system
		is restored to the state which it was running when you
		first booted the machine and came up in single-user
		state, as described above.

[[ wow, I'd forgotten how bad the USL technical writers had gotten in the
latter days! ]]

There are also "boot" and "bootwait" entries for running early critical
system initialisation procedures (I've seen them used to load firmware
into controllers, etc.), and "powerfail" and "powerwait" entries which
can be used to perform an orderly emergency shutdown.
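
For those who haven't seen one, inittab entries have the form
id:rstate:action:process.  Here is a rough sketch of reading such a file
and picking the entries that apply to a given state.  It treats an empty
rstate field as "every state", skips lines starting with '#', and ignores
the action-specific rules (boot, powerfail, and so on); the sample entries
and the /etc/loadfw path are made up for illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    ident: str
    rstate: str
    action: str
    process: str


def parse_inittab(text):
    """Split each non-blank, non-comment line into its four fields."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at most three times so the command may contain colons.
        ident, rstate, action, process = line.split(":", 3)
        entries.append(Entry(ident, rstate, action, process))
    return entries


def entries_for(entries, state):
    # An empty rstate field is treated as "every state" in this sketch.
    return [e for e in entries if not e.rstate or state in e.rstate]


# Made-up sample file, for illustration only.
SAMPLE = """\
is:3:initdefault:
fw::bootwait:/etc/loadfw
r2:23:wait:/etc/rc2
co:234:respawn:/etc/getty console console
"""
```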

Now there's a whole lot more interesting stuff about SysVr4 init that
one could discuss, but the above should hopefully demonstrate that any
SysV vendor who stuck strictly to the porting base (eg. Commodore Amiga,
NEC, Fujitsu, Motorola, Pyramid, ICL, etc.) did not have any confusion
whatsoever about the definition of runlevels.  Some vendors,
particularly those like HP, SCO, NCR, IBM, and so on had gotten
themselves all confused by the early SysVr2 attempts at making use of
this new inittab facility, and to pay any attention to them is
counter-productive.

Tonnes and tons of confusion about the detailed meaning of "remote
services" came about with the initial attempts by AT&T to satisfy US
Govmt. requirements to support OSI while at the same time continuing to
fit into the then academic world with TCP/IP.  I still have a soft spot
for TLI's attempt to hide network transport differences, but I think we
can all agree that there's no need to get mixed up in this side-topic at
the moment!  ;-)

Certainly there are also things that could be simplified somewhat, such
as the multi-faceted meaning of 's'!  :-)

>     (iv)  there is no standard numbering scheme used by the vendors
> 	  for startup and shutdown scripts, and no place for application
> 	  vendors to register to get numbers, so vendor names for
> 	  startup and shutdown scripts are probably OS specific

I don't see how this is relevant.  The numbering is merely there to
control the ordering of execution.  There were very standard and highly
portable scripts included in the porting base for SysVr4, and for the
most part these scripts survive intact in vendor systems.
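
To illustrate that the numbers do nothing but order execution, here is a
small sketch that sorts a directory listing into the kill scripts and then
the start scripts, each in sequence.  It assumes names of the form K or S
followed by two digits and a name; the function name and the script names
in the example are hypothetical.

```python
import re

# "K" or "S", a two-digit sequence number, then the service name.
_NAME = re.compile(r"^([SK])(\d\d)(.+)$")


def plan(names):
    """Return (kill, start) lists, each in execution order."""
    kill, start = [], []
    # With fixed-width numbers a plain lexical sort is the numeric order.
    for name in sorted(names):
        m = _NAME.match(name)
        if not m:
            continue  # not a sequenced script; ignore it
        (kill if m.group(1) == "K" else start).append(name)
    return kill, start


# Hypothetical directory contents.
kill, start = plan(["S80lp", "K20nfs", "S20sysetup", "README",
                    "K10cron", "S75cron"])
```

Two vendors can pick different numbers for the same script and still get
the same relative order, which is all that matters.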

>     Minor nit (just thought of this one, brand new today :-) is that
>     inetd has no knowledge of runlevels.  Altering /etc/inetd.conf
>     under inetd and signaling it on a change of runlevels is clunky.
>     (Sure, we can do atomic file updates to survive across crashes,
>     but this is getting uglier ...)

In the SysV world inetd is a wart.  The SysV way of doing things was a
transport independent tool called "listen", which unfortunately was
never fully developed to properly host TCP/IP services (though I did use
it exclusively on 3B1's for telnet access).

If you use "inetd" instead of "listen", then you simply start it in
run-level 2 and/or 3 and/or 4, and kill it when leaving those levels.
You'd probably also put it directly in /etc/inittab and take the
daemon(3) call out instead of starting it in a script.  Scripts are
really primarily for doing things, not starting and stopping things.

This is actually a pretty important point, now that I think of it.
BSD-style daemons that run on their own, backgrounding themselves, etc.,
are very rare critters in the true SysV world.  There, daemons are
normally started and watched by init (or SAC).  Cron, SAC, and init itself, are
about the only stand-alone daemons (at least up until people started
adding things from the BSD world, like inetd).  Everything else is
started by init, SAC, or cron and normally runs in the "foreground" just
as any other program, as of course they must if their parent is to watch
over them and restart them when they die, etc.
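
A toy sketch of why the child has to stay in the foreground: the parent
can only notice a death by waiting on the process it started.  The
function name and the max_starts cap are assumptions to keep the example
finite; a real init would loop indefinitely and apply a rate limit.

```python
import subprocess
import sys


def supervise(argv, max_starts=3):
    """Run argv in the foreground, restarting it each time it exits.

    Returns the list of exit statuses seen.
    """
    statuses = []
    for _ in range(max_starts):
        child = subprocess.Popen(argv)
        # wait() only tracks the process we started.  A program that
        # forks and lets its parent exit (as daemon(3) does) would make
        # wait() return at once and escape supervision.
        statuses.append(child.wait())
    return statuses


if __name__ == "__main__":
    # Example child that exits immediately with status 7.
    print(supervise([sys.executable, "-c", "import sys; sys.exit(7)"]))
```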

> Solaris is closer to run _states_ and in a transition from 4 to 2 will
> only run the kill scripts in level 2, then the start scripts in level
> 2.  This turns out to be really awkward: I can't simply put the kill
> scripts for a particular level one level down, since they won't always
> be run.  Either I have to enforce an administrative policy that says
> "never jump runlevels" or I have to put *all* the kill scripts for
> every service in every lower level.

Of course you can't simply put the kill scripts one level down -- you
have to really think about the state transitions and what they mean.
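
One way to think about it, sketched under the assumption that each state
is described by the set of services that should be running in it.  This is
not how the SysV rc scripts work (they list kill and start scripts per
state); it is only meant to show that a transition is a difference between
two states, whichever two they are.  The table of services is made up.

```python
def transition(services_by_state, old, new):
    """Return (to_stop, to_start) for a move from state `old` to `new`."""
    before = services_by_state.get(old, set())
    after = services_by_state.get(new, set())
    # Stop what the new state doesn't want; start what it lacks.
    return sorted(before - after), sorted(after - before)


# Hypothetical description of three multi-user states.
STATES = {
    "2": {"cron", "lp", "inetd"},
    "3": {"cron", "lp", "inetd", "nfsd"},
    "4": {"cron"},
}

stop, start = transition(STATES, "4", "2")
```

With a table like this a jump from 4 to 2 needs no special handling, since
nothing depends on passing through 3 on the way.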

> 3. Runlevels in SysV [...] provide too few runlevels for very flexible usage.

I've never found any good use for init states 'a-c', and indeed it's
hard to find a generic use for even three 'multi-user' states.  Even the
"traditional" remote-file-sharing state only makes sense on big general
purpose computing servers that primarily have local users.  I've used
level '4' for special tasks, such as backups (shut down just the
database and run the backup scripts), etc. but that's about it.  I don't
have any problem with adding more run levels, but I don't really see the
point -- too much rope....

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>