Subject: Re: Recent nbftp panic analysis
To: None <tech-net@netbsd.org>
From: Charles M. Hannum <abuse@spamalicious.com>
List: tech-net
Date: 02/14/2005 09:29:12
On Monday 14 February 2005 09:16, Charles M. Hannum wrote:
> On Monday 14 February 2005 09:01, Charles M. Hannum wrote:
> >  #define        TCP_TIMER_ISARMED(tp, timer)\
> > -       callout_pending(&(tp)->t_timer[(timer)])
> > +       (callout_pending(&(tp)->t_timer[(timer)]) ||                    \
> > +        callout_invoking(&(tp)->t_timer[(timer)]))
>
> This has to use callout_expired() rather than callout_invoking(), because
> callout_ack() clears the INVOKING bit and therefore the race could still
> happen (but with a yet shorter window).

Okay, last thread-of-consciousness posting here...

It actually doesn't matter, because the whole body of the callout handler is 
wrapped in splsoftnet(), so the whole chunk including the callout_ack() and 
tcp_setpersist() is atomic WRT tcp_input().

This also means that the original race is limited to *just* the window between 
the CALLOUT_UNLOCK() in softclock() and the splsoftnet() in 
tcp_persist_timer() -- a very short window, coupled with the requirement that 
we happen to receive a packet for the "right" connection within that window.  
This is probably why it's rarely observed.

soda noted that FreeBSD uses a separate CALLOUT_ACTIVE flag, which is not 
cleared when a callout expires.  I believe this is equivalent to my modified 
test.