Subject: Re: momentary freezes in -current
To: Martijn van Buul <martijnb@atlas.ipv6.stack.nl>
From: Andreas Gustafsson <gson@gson.org>
List: current-users
Date: 08/06/2006 21:52:33
Martijn van Buul wrote:
> Hmm. On my system I do get those stalls as well, but they're infrequent (Once
> every 2 or 3 weeks, maybe), I do _not_ get the watchdog message, and I've
> found several peculiarities: 
> 
> * Restarting dhclient fixes the problem - at least temporarily.

This sounds like a problem I had recently.  In my case, dhclient was
taking too long to renew leases, apparently sometimes long enough that
the lease expired and the equipment at the other end of my DSL line
stopped routing traffic to my machine's (now former) IP address.  The
following log messages demonstrate an instance of this; here it's not
actually taking long enough for the lease to expire, but still far
longer than it should:

   Aug  2 21:14:20 guava dhclient: DHCPREQUEST on fxp0 to 255.255.255.255 port 67
   Aug  2 21:14:20 guava dhclient: DHCPACK from 88.112.240.1
   Aug  2 21:14:20 guava dhclient: bound to 88.112.240.109 -- renewal in 3442 seconds.
   [...no other dhclient messages...]
   Aug  2 22:28:15 guava dhclient: DHCPREQUEST on fxp0 to 193.229.28.26 port 67
   Aug  2 22:28:15 guava dhclient: DHCPACK from 193.229.28.26
   Aug  2 22:28:15 guava dhclient: bound to 88.112.240.109 -- renewal in 3474 seconds.

Notice how dhclient prints "renewal in 3442 seconds", but the renewal
actually happens only after 4435 seconds.
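
The 4435-second figure is simply the difference between the two
DHCPREQUEST timestamps in the log above.  A minimal Python sketch of
that arithmetic follows; the helper name elapsed_seconds is
hypothetical, and it assumes both timestamps fall on the same day:

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two syslog-style HH:MM:SS times on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

promised = 3442                                   # "renewal in 3442 seconds"
actual = elapsed_seconds("21:14:20", "22:28:15")  # the two DHCPREQUEST lines
print(actual, actual - promised)                  # prints "4435 993"
```

That is, the renewal came roughly 993 seconds later than announced.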

I added some debug printfs to dhclient and found that although it was
calling select() with the correct timeout, the select() call didn't
return when scheduled.  I also ran the following test:

   $ time perl -e 'select(undef,undef,undef,3442)'
   real    84m27.443s
   user    0m0.005s
   sys     0m0.005s

In other words, a select() with a timeout of 3442 seconds was actually
taking 5067 seconds to complete.
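
The same kind of check can be scripted so that the overshoot is
reported directly.  The following is only a sketch, not what was run
above: the function name measure_select and the short 0.2-second
timeout are illustrative assumptions, and it relies on a Unix-like
system where select() accepts empty descriptor sets.

```python
import select
import time

def measure_select(timeout: float) -> float:
    """Return how many seconds a select() with no descriptors really slept."""
    start = time.monotonic()
    select.select([], [], [], timeout)   # no fds: acts as a pure timed sleep
    return time.monotonic() - start

if __name__ == "__main__":
    requested = 0.2   # short illustrative value; the test above used 3442
    actual = measure_select(requested)
    print(f"requested {requested:.3f}s, slept {actual:.3f}s, "
          f"ratio {actual / requested:.3f}")
```

A ratio well above 1 for long timeouts would point at the same late
wakeup.  Note that this compares the timeout against the system's own
clock, so it can only reveal the problem if that clock is itself
keeping correct time.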

This was with a -current from July 4.  I upgraded to -current from
Aug 4, and the problem seems to have disappeared.

This is NetBSD/i386 running on a single (sic) AMD Athlon MP 1800+ CPU.
-- 
Andreas Gustafsson, gson@gson.org