current-users: Re: dhclient using 95% of CPU - found problem! Need to fix.

Subject: Re: dhclient using 95% of CPU - found problem! Need to fix.
To: Peter Seebach <seebs@plethora.net>
From: Laine Stump <lainestump@rcn.com>
List: current-users
Date: 10/22/2000 13:02:27
seebs@plethora.net (Peter Seebach) writes:

> If someone who knows anything about DHCP would be willing to help,
> I'm interested in debugging this.  Once my sup is done, I plan to
> build a debugging version from source, and start using the naive,
> obvious, approach; run it a bit, then start tracing and watch where
> the loop is.

Well, I don't know much about dhcp, but I used the "naive, obvious
approach", and found the source of the problem.

The loop is in dhcp/common/dispatch.c:dispatch(), which is calling
dhcp/omapip/dispatch.c:omapi_one_dispatch(). This *should* be
happening, but omapi_one_dispatch() is returning from all its selects
immediately, so it never pauses.

A closer look shows that the select on line 287 is returning count =
-1, with errno set to EINVAL. When it is called, the value of the
timeval sent is:

(gdb) print to
$17 = {tv_sec = 1175253196, tv_usec = 847872}
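
A quick way to check whether a given kernel rejects a timeout like this
one is a small probe along these lines. This is only a sketch:
probe_select_timeout() is a made-up helper, not anything from dhclient,
and it selects on a pipe that already has data in it so that a kernel
which accepts the timeout returns right away instead of sleeping.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical probe (not from the dhclient source): call select() with
 * a timeout of `sec` seconds on a descriptor that is already readable.
 * Returns select()'s result and stores its errno in *err on failure.
 * Returns -2 if the probe itself could not be set up.
 */
int probe_select_timeout(long sec, int *err)
{
    int fds[2];
    fd_set readable;
    struct timeval to;
    int count;

    *err = 0;
    if (pipe(fds) == -1) {
        *err = errno;
        return -2;
    }
    /* Make the read end ready so a valid timeout never has to expire. */
    if (write(fds[1], "x", 1) != 1) {
        *err = errno;
        close(fds[0]);
        close(fds[1]);
        return -2;
    }

    FD_ZERO(&readable);
    FD_SET(fds[0], &readable);
    to.tv_sec = sec;
    to.tv_usec = 847872;    /* same tv_usec as in the gdb output */

    count = select(fds[0] + 1, &readable, NULL, NULL, &to);
    if (count == -1)
        *err = errno;       /* save errno before close() can change it */

    close(fds[0]);
    close(fds[1]);
    return count;
}
```

Calling probe_select_timeout(1175253196L, &err) should give -1 with err
set to EINVAL on a kernel that has the limit described below, and 1 on
one that accepts the timeout.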

If you look at sys/kern/sys_generic.c:sys_select() to see what can
cause it to return EINVAL, you'll notice that can happen if the call
to itimerfix() returns non-0. When you look at kern_time.c:itimerfix(),
you'll see that it returns EINVAL if tv_sec of the timeval is larger
than 100,000,000. BZZZZZZTTTTTTTT!!!
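
To make the check concrete, here is a userland model of just the part
described above. It is not the kernel source: model_itimerfix() and
MODEL_MAX_TV_SEC are names invented for this sketch, and whatever other
checks the real itimerfix() performs are left out.

```c
#include <errno.h>
#include <sys/time.h>

/* Upper bound on tv_sec as described in this message. */
#define MODEL_MAX_TV_SEC 100000000L

/*
 * Hypothetical stand-in for the one itimerfix() check discussed here:
 * reject any timeval whose tv_sec exceeds the limit.
 */
int model_itimerfix(const struct timeval *tv)
{
    if (tv->tv_sec > MODEL_MAX_TV_SEC)
        return EINVAL;
    return 0;
}
```

Fed the timeval from the gdb session (tv_sec = 1175253196), this model
returns EINVAL, which matches what select is handing back.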

The reason Peter and I are seeing this error (and other people aren't)
is because we happen to be getting incredibly long dhcp leases.

So, the question of the day is - why are we limiting timevals to a
maximum of 100,000,000 seconds? Sure, it seems like a good idea to
prevent some idiot who happens to be using an interval timer in the
kernel from locking everything out for 100,000,001 seconds, but that
doesn't seem reasonable for applications, right? (after all, an app
can set the select timeout to *infinity* if it wants!) (and anyway,
why allow 100,000,000, but not 100,000,001? It seems rather arbitrary,
and way beyond the bounds of "I don't want anything locked up for
*unreasonably* long periods").

What is the proper fix for this?

1) modify select to not call itimerfix(), and to do its own limit
   setting (in case there are other places that call itimerfix and
   really do need this behavior)?

2) modify itimerfix() to silently reset tv_sec if it's too large
   (sounds like a very bad idea to me).

3) modify itimerfix() to not return EINVAL for extremely large
   settings for tv_sec (this seems most logical to me, but there may
   be some other use of itimerfix() that I'm not aware of which
   depends on this behavior).

4) document in the select manpage (and for any other function that
   sends timevals through itimerfix()) that the maximum value for
   tv_sec is 100,000,000 (yeah, right).
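
For comparison, the "silently reset" behavior from option 2 would look
roughly like the sketch below. The names sketch_clamp_timeout() and
SKETCH_MAX_TV_SEC are invented for illustration, and this is a userland
model rather than a kernel patch. The catch is visible in the code: the
caller asked for one timeout and quietly gets a shorter one, so it has
to be prepared to wake up early and go back to sleep.

```c
#include <sys/time.h>

/* Upper bound on tv_sec as described in this message. */
#define SKETCH_MAX_TV_SEC 100000000L

/*
 * Hypothetical model of option 2: instead of failing with EINVAL,
 * shorten an oversized timeout to the limit.
 * Returns 1 if the value was changed, 0 if it was left alone.
 */
int sketch_clamp_timeout(struct timeval *tv)
{
    if (tv->tv_sec > SKETCH_MAX_TV_SEC) {
        tv->tv_sec = SKETCH_MAX_TV_SEC;
        tv->tv_usec = 0;
        return 1;
    }
    return 0;
}
```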

Whatever is done, I think it should be done in the 1.5 branch prior to
release. I just filed pr kern/11287 for this.