tech-net archive


Re: something is randomly closing ssh-tunnels (was: ipfilter randomly dropping..)



On 23/06/2014 8:24 PM, Petar Bogdanovic wrote:
> During the past few weeks the ssh-tunnels to a remote machine started
> failing randomly.  In a previous mail to tech-net I prematurely blamed
> ipfilter because disabling it yielded some immediate success.
>
> Unfortunately, subsequent testing showed that having npf enabled instead
> eventually led to the same issues.
>
> What I know:
>
>       * the server suddenly FINs the connection
>       * the server ignores everything after that and sends about 20-30
>         RSTs for lots of late ACKs sent by the client
>       * ipmon is able to track the connection but misses the FIN
>       * yet ipfilter manages to update its state table and reduces the
>         TTL of the connection from 24h to 30s
>       * a server-tcpdump captures the FIN
>       * a client-tcpdump captures the same FIN
>       * according to wireshark, the FINs in both pcaps have sequence
>         numbers that indicate lost segments (which at least in one
>         case makes little sense since it was captured directly at the
>         source)
>       * ssh and sshd both never try to tear down the connection
>       * ssh reports that the remote end has closed the connection
>       * sshd bails on a failed write() with ENETUNREACH

So the problem is this:
* sshd tries to write to the socket and gets ENETUNREACH

It then exits, and the FIN packets are transmitted as the socket is closed
down in the normal course of things. By the time sshd is exiting, the
network path has been restored.

For ICMP packets to cause this, you would need to see many of them.

You've got public IP addresses in your capture file and you've made no
mention of using NAT, so I'm going to assume that the boxes with sshd/ssh
on them are connected to the Internet directly with some kind of cable modem
or similar.

Are you able to cross check the events from sshd with log data from those
devices?

For example, if the NIC facing outwards drops then you will get ENETUNREACH
because the default route used to reach the destination has disappeared. Or
if your DHCP-assigned IP address disappears briefly then again the route
will disappear and you will get ENETUNREACH.
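
One way to cross check this is to log the state of the default route with
timestamps and line those up against the sshd failures. What follows is only
a rough sketch: the function name, the ROUTE_CHECK variable and the log path
are made up for illustration, and "route -n get default" is assumed to be
the right probe for your system. A one second poll can also miss a very
short outage.

```shell
#!/bin/sh
# Command used to probe the default route. "route -n get default" is an
# assumption that fits NetBSD; substitute whatever your system offers.
ROUTE_CHECK=${ROUTE_CHECK:-"route -n get default"}

# Print one timestamped line saying whether the probe succeeded.
# (Hypothetical helper, not part of any existing tool.)
log_route_state() {
    if $ROUTE_CHECK >/dev/null 2>&1; then
        state="present"
    else
        state="MISSING"
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') default route $state"
}

# Usage (log path is a placeholder):
#   while :; do log_route_state; sleep 1; done >> /tmp/route.log
```

Any "MISSING" line whose timestamp matches an sshd write() failure would
point at the route going away rather than at the packet filter.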

Could you run these two for me:
netstat -s | grep -i unreach
netstat -s | grep -i route

And of course the other important thing to do in an experiment is to save
the output of "netstat -s" at the start of a run and compare that with its
output when the problem has been seen again.
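
A minimal sketch of that before/after comparison, assuming a plain diff of
the two saved outputs is good enough to spot the counters that moved. The
function names and file names here are placeholders I have invented, not
anything that ships with the system.

```shell
#!/bin/sh
# Save the current "netstat -s" output into the given file.
snapshot() {
    netstat -s > "$1"
}

# Print only the lines that differ between two snapshots:
# "<" lines come from the first file, ">" lines from the second.
# The "|| true" guards keep the function from returning a failure
# status just because diff found differences or grep found none.
changed_counters() {
    { diff "$1" "$2" || true; } | grep '^[<>]' || true
}

# Usage (file names are placeholders):
#   snapshot /tmp/netstat.before
#   ... wait until a tunnel has been closed again ...
#   snapshot /tmp/netstat.after
#   changed_counters /tmp/netstat.before /tmp/netstat.after
```

Counters mentioning "unreach" or "route" that show up in that output are the
ones of most interest here.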

Kind Regards,
Darren


