Subject: kern/11287: select() fails when timeout is > 100,000,000 sec.
To: None <gnats-bugs@gnats.netbsd.org>
From: None <lainestump@rcn.com>
List: netbsd-bugs
Date: 10/22/2000 09:57:12
>Number:         11287
>Category:       kern
>Synopsis:       select() fails when timeout is > 100,000,000 sec.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 22 09:57:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Laine Stump
>Release:        NetBSD 1.5_ALPHA2-i386 as of Oct 18, 2000
>Organization:
>Environment:
	
System: NetBSD idris.laine.org 1.5_ALPHA2 NetBSD 1.5_ALPHA2 (GENERIC-lrs) #0: Thu Oct 19 01:49:50 EDT 2000 laine@idris.laine.org:/drive2/src/src/sys/arch/i386/compile/GENERIC-lrs i386


>Description:

If select() is called with a timeout whose tv_sec is greater than
100000000, it fails immediately with EINVAL.

This is especially bad in the case of dhclient, which does a select
with timeout = the lease expire time. If the lease is extremely long
(as it is in the case of RCN cable modem service), select will instead
return immediately with EINVAL, causing dhclient to chew up all
available CPU time.

The problem is that sys_select() calls kern_time.c:itimerfix(), which
returns EINVAL if tv_sec is > 100000000. This seems rather arbitrary -
maybe I *want* to set a timeout of 1000000001 seconds!

Note that I've set this at a high priority because it is the cause of
a serious problem in at least two running systems (mine, and Peter
Seebach's) - the system is really unusable for me without a patch, and
the same will probably happen to others (anybody else with RCN cable
modem service, at least).

>How-To-Repeat:

1) write a short program that calls select with a timeout of 100,000,001 sec.

2) run dhclient and get a lease with (for example) a 30 year
expiration, watch dhclient eat all your CPU time.


>Fix:

Several possibilities, and I'm not qualified to decide which is best
(although I have an opinion ;-):

1) modify select to not call itimerfix(), and to do its own limit
   checking/setting (in case there are other places that call
   itimerfix and really do need this behavior)?

2) modify itimerfix() to silently reset tv_sec if it's too large
   (sounds like a very bad idea to me).

3) modify itimerfix() to not modify anything, but also not return
   EINVAL for extremely large settings for tv_sec (this seems most
   logical to me, but there may be some other use of itimerfix() that
   I'm not aware of which depends on this behavior).

4) document in the select manpage (and for any other function that
   sends timevals through itimerfix()) that the maximum value for
   tv_sec is 100,000,000 (yeah, right).

>Release-Note:
>Audit-Trail:
>Unformatted: