tech-kern: Re: Scheduler hints

Subject: Re: Scheduler hints
To: None <elric@imrryr.org>
From: M. Warner Losh <imp@bsdimp.com>
List: tech-kern
Date: 12/09/2002 06:16:59
In message: <20021209094448.47D29174D2@arioch.imrryr.org>
            Roland Dowdeswell <elric@imrryr.org> writes:
: On 1039422164 seconds since the Beginning of the UNIX epoch
: "M. Warner Losh" wrote:
: >
: 
: >There are a number of flaws in the arguments put forth against the
: >patch:
: >
: >	1) A system on the edge will be tottered over: This is true
: ....
: >	2) Rampant forkers.  This will hurt them.  These beasts should
: ....
: 
: I think that it would be unfair to characterise my arguments as
: either of these.  If you examine the patch, you will find that the
: .5s pause is invoked if a process reaches its user process limit.
: A single program forking off 160 children/grandchildren/etc is by
: no means a system on the edge of being tottered over.  It could
: be, e.g. a web server responding to an uncharacteristically large
: number of requests if, e.g. the web server has been /.ed.  With
: the change, as I pointed out, certain programs, such as thttpd,
: are very likely to completely fail where before they would just
: happily churn along giving a few failures here and there.

I don't see how they will completely fail.  At most they will fail
1/2s later than they would have otherwise when processes have run
out.  If thttpd is forking lots of programs, it is a rampant forker
and deserves to be throttled back, imho.  What is causing it to
rampantly fork?  Lots of requests.  When you run out of process slots,
for whatever reason, you have reached your ability to service them.  I
will admit that once you've exceeded your limit, it will slow down for
the remaining processes.  However, I consider thttpd's behavior to be
abusive of system resources, and worthy of punishment if it is too
abusive.  Others see this as a reasonable thing.
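
To make the "few failures here and there" case concrete, here is a
minimal sketch of a forking server treating EAGAIN from fork as a
dropped request rather than a fatal error.  The helper name try_fork
and its injectable fork argument are hypothetical, used only for
illustration; this is not taken from thttpd or from the patch.

```python
import errno
import os


def try_fork(fork=None):
    """Return fork()'s result, or None if the process limit was hit.

    `fork` is a hypothetical hook so the behaviour can be exercised
    without actually creating processes; it defaults to os.fork.
    """
    if fork is None:
        fork = os.fork
    try:
        return fork()
    except OSError as exc:
        if exc.errno == errno.EAGAIN:
            # Out of process slots: give up on this request only,
            # and let the caller keep serving the others.
            return None
        raise  # any other error is unexpected here
```

With the pause in place, the same call would simply return its
failure later than before; the caller's handling does not change.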

I don't see how thttpd will completely fail.  Consider the following.

Let us say that it is forking something that takes .1s to run.  Let us
say that we have 100 slots free on a system in question.  This means
it can still service 1000 requests per second before it hits the
process limit.  What happens when it gets the 1001st request?  It
stalls for 1/2 second.  The connection requests queue up for 1/2
second, and then are handled in quick succession.  So 100 will be
happy and it will pause 1/2 second, so the rate it can service will
drop to 200/s.  This clearly isn't ideal in this case, but it is still
functioning and not completely failing.  At most, this argues for a
tuning parameter for the sleep time (since setting it to 0 restores
the old behavior).
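
The arithmetic above can be sketched as a small model.  It assumes
(this is a simplification, not something the patch states) that each
batch of children finishes while the parent sleeps, so one batch costs
whichever is longer, the child run time or the pause.  The function
name service_rate is made up for the example.

```python
def service_rate(slots, runtime, pause):
    """Requests per second for a forker that keeps hitting its limit.

    Assumption: a batch of `slots` children takes max(runtime, pause)
    seconds, because the children run while the parent is sleeping.
    """
    cycle = max(runtime, pause)
    return slots / cycle


if __name__ == "__main__":
    # 100 free slots, each child runs for 0.1s; vary the sleep time.
    for pause in (0.0, 0.1, 0.25, 0.5):
        rate = service_rate(100, 0.1, pause)
        print(f"pause={pause:.2f}s  rate={rate:.0f}/s")
```

Under that assumption a pause of 0 gives the old 1000/s and a pause of
0.5s gives 200/s.  A stricter model that adds the pause to the run
time (0.6s per batch) would give roughly 167/s instead, which still
supports the same point.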

In a system that has reached its limits and isn't returning EAGAIN to
the fork, similar rate limiting happens because there's a pause on
each fork.

: >	3) But with the one slot, a sysadmin could regain the system.
: 
: I don't think that there is any general consensus here that upping
: the number of root-only slots is a bad idea.

True.  I guess there was only one voice saying this.

Warner