Subject: Re: Fw: Re: tcp connections lost on interface down
To: None <tech-net@netbsd.org>
From: Michael van Elst <mlelstv@serpens.de>
List: tech-net
Date: 08/21/2003 22:31:49
david@crlf.net (David Maxwell) writes:

David,

>So, firstly, either the server _or_ the client, can use application
>level implemented connection timeouts, if desired, or set SO_KEEPALIVE
>and do it the 'lazy' way...

SO_KEEPALIVE is not useful for handling broken connections. Did you look
at the timeouts involved? I also do not want an idle connection to
break because of a problem somewhere in the network (as opposed to
a local problem). SO_KEEPALIVE doesn't distinguish between the two; it is
used to get some (arbitrary) idea of what "waiting infinitely" should mean.
Originally SO_KEEPALIVE waited hours to even _start_ the keepalive
timer and days to abort a connection.
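
For reference, a minimal sketch of what the 'lazy' way looks like from an
application. It assumes a platform whose socket API exposes the per-socket
timer options TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT; these are not
portable, so the code sets only the ones it finds. The helper name
enable_keepalive and the timer values are illustrative, not taken from
this thread.

```python
import socket

def enable_keepalive(sock, idle_s, interval_s, probes):
    """Turn on SO_KEEPALIVE and, where available, set its timers.

    Returns the worst-case time in seconds before an idle, dead
    connection is aborted: the idle time plus every unanswered probe.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timer options are platform dependent (assumption):
    # set only those this Python build knows about.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            try:
                sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            except OSError:
                pass  # known to Python but refused by this kernel
    return idle_s + interval_s * probes
```

Note that nothing in these options says _why_ the probes went unanswered;
a local interface change and a remote network failure look the same.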


>Since the API doesn't provide a way to express 'This IP is going away
>forever' as opposed to 'This interface's IP is changing (saying nothing
>about a possible alias to the old IP to follow, for example)', it would
>be wrong to infer that interpretation and kill innocent connections.

If the system runs out of mbufs the connection will also fail;
it doesn't wait indefinitely for free memory to appear. So there
are conditions, completely outside the TCP protocol, that will
"kill innocent connections". Why? Because it is reasonable to do so.


>> That's what I said. If TCP had been defined to recognize dropped
>> connections on an otherwise idle connection you couldn't change (up/down,
>>  change address, plug in and out) an interface and keep a connection
>> running and you wouldn't assume that this behaviour is normal.

>Please solve the following contradiction for me - how can you "recognize
>dropped connections" when the underlying network is connectionless?

Whatever protocol defines a "connection" will also define a "dropped
connection". That has little to do with the underlying network.


>I don't think it does - it only assumes that packets it sends will go to
>the destination address, and that that host will be able to send packets
>back. That will be temporarily untrue if an interface IP is changed, and
>might even be permanently untrue, but there's not enough information to
>decide between those two cases.

While that's true, it is also true for other cases like a write timeout.
Apparently there is no hard information that tells you whether an ACK is
missing temporarily or permanently. That's why you (i.e. the inventors of the
TCP protocol) _defined_ what it means. You set a standard for a timeout,
and no ACK coming in later will change your perception of a dropped connection.
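
To make "defined by a timeout" concrete, here is a small sketch of how long
a sender waits for a missing ACK before it gives up, assuming the usual
scheme of doubling the retransmission timeout up to a cap. The function
name give_up_after and all parameter values are illustrative assumptions,
not numbers from any particular stack.

```python
def give_up_after(initial_rto_s, max_rto_s, max_retransmits):
    """Total seconds a sender waits for an ACK before dropping the connection.

    Each unanswered transmission doubles the timeout, capped at max_rto_s.
    """
    total = 0.0
    rto = initial_rto_s
    # One wait for the original segment plus one per retransmission.
    for _ in range(max_retransmits + 1):
        total += rto
        rto = min(rto * 2, max_rto_s)
    return total
```

With a 1 s initial timeout, a 64 s cap and 12 retransmissions, this gives
511 seconds. Once that time has passed the connection is dropped by
definition, whether or not the ACK would have arrived a second later.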

The same could be true for an interface reconfiguration where you don't
even have to agree with your peer about timeouts. A reasonable timeout
to drop connections with invalid local addresses would be the time the
peer would wait for an ACK until _it_ considers the connection dropped
(assuming it had something to send). It is reasonable because it allows
busy connections to survive a configuration change that finally ends
with a valid configuration (== one that makes the socket endpoint valid
again).
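
The policy described above could be sketched roughly as follows. This is a
toy model, not kernel code: the class GraceTracker, its method names and
the idea of passing the current time in as a number are all assumptions
made for illustration. grace_s stands for the time the peer would wait for
an ACK before it gives up.

```python
class GraceTracker:
    """Tracks connections whose local address has become invalid."""

    def __init__(self, grace_s):
        self.grace_s = grace_s
        self.conns = {}     # connection id -> local address
        self.deadline = {}  # connection id -> time at which it is dropped

    def add(self, conn, local_addr):
        self.conns[conn] = local_addr

    def address_removed(self, addr, now):
        # Start the grace period; keep an earlier deadline if one is running.
        for conn, local in self.conns.items():
            if local == addr:
                self.deadline.setdefault(conn, now + self.grace_s)

    def address_added(self, addr):
        # The endpoint is valid again, so the connection survives.
        for conn, local in self.conns.items():
            if local == addr:
                self.deadline.pop(conn, None)

    def expire(self, now):
        """Drop and return every connection whose grace period has run out."""
        dropped = [c for c, d in self.deadline.items() if now >= d]
        for c in dropped:
            del self.deadline[c]
            del self.conns[c]
        return dropped
```

A connection whose address comes back before the deadline is never
touched; one whose address stays gone is dropped at about the time the
peer would have given up anyway.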

-- 
                                Michael van Elst
Internet: mlelstv@serpens.de
                                "A potential Snark may lurk in every tree."