NetBSD-Users archive


Re: something is randomly closing ssh-tunnels (was: ipfilter randomly dropping..)



On Tue, Jun 24, 2014 at 10:39:46PM +1000, Darren Reed wrote:
> 
> So the problem is this:
> * sshd tries to write to the socket, gets ENETUNREACH
> 
> and then exits leading to the FIN packets being transmitted as the socket
> is closed down in the normal course of things but by the time it is doing
> the exit the network path has restored.

Right. Here is the (slightly redacted) output of ktruss:

        http://smokva.net/pcap/crane-trace-sshd.txt


> You've got public IP addresses in your capture file and you've made no
> mention of using NAT, so I'm going to assume that the box with sshd/ssh
> on it are connected to the Internet directly with some kind of cable modem
> or similar.

The client connects through a FreeBSD router on which an older version
of pf does the filtering, NAT and ALTQ (priq).

The server has a public IP.


> Are you able to cross check the events from sshd with log data from those
> devices?

The only cross-checking I've done so far is comparing the two pcaps.
There is not much else to cross-check, since nothing really complains
about anything.

Also, the ssh tunnels seem to work fine with ipf disabled.  I'm running a
third >3GB job today with ipf disabled.  With ipf enabled, the job
wouldn't survive longer than 10-30 seconds.


> How about these two for me:
> netstat -s | grep -i unreach
> netstat -s | grep -i route

Sure:

# netstat -s | head -n20
icmp:
        413 calls to icmp_error
        0 errors not generated because old message was icmp
        Output histogram:
                echoreply: 100
                unreach: 413
                photuris: 6
        4 messages with bad code fields
        0 messages < minimum length
        0 bad checksums
        0 messages with bad length
        0 multicast echo requests ignored
        0 multicast timestamp requests ignored
        Input histogram:
                echoreply: 6
                unreach: 731
                echo: 100
                timxceed: 102
        100 message responses generated
        0 path MTU changes

# netstat -s | grep -i unreach
                unreach: 413
                unreach: 731
                0 dropped due to ICMP unreachable
                0 address unreachable
                0 port unreachable
                0 dropped due to ICMP unreachable
        0 packets rcvd for unreachable dest 

# netstat -s | grep -i route
        0 SYNs dropped (no route or no space)
        0 output packets discarded due to no route
        0 no route available (output)
        17313 output packets discarded due to no route
                0 no route
        0 bad router solicitation messages
        0 bad router advertisement messages
        0 router advertisement routes dropped
        0 SYNs dropped (no route or no space)
        0 no route available (output)
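
A side note on the grep output above: several lines come out twice
("unreach:", "output packets discarded due to no route") and grep drops
the protocol header each one belongs to.  A minimal sketch that keeps the
header follows.  It assumes section headers are the unindented lines
ending in ":" (as "icmp:" is in the head output above); the function name
netstat_grep is made up for this example.

```sh
#!/bin/sh
# netstat_grep PATTERN
# Reads "netstat -s" text on stdin and prints every line matching PATTERN
# (case-insensitively), prefixed with the section header it appeared under.
# Assumption: a section header is an unindented line ending in ":".
netstat_grep() {
    awk -v pat="$1" '
        # Remember the most recent unindented "name:" line as the section.
        /^[^ \t].*:$/ { section = $0; next }
        # Print matching counter lines with their section in front.
        tolower($0) ~ tolower(pat) {
            sub(/^[ \t]+/, "")
            print section " " $0
        }
    '
}
```

It would be used as "netstat -s | netstat_grep route" or
"netstat -s | netstat_grep unreach".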


> And of course the other important thing to do in an experiment is to save
> the output of "netstat -s" at the start of a run and compare that with its
> output when the problem has been seen again.

Here is the diff of two files I created before and after a failed job:

# ls -la netstat.*
-rw-r--r--  1 root  wheel  16332 Jun 25 09:58 netstat.a
-rw-r--r--  1 root  wheel  16332 Jun 25 09:59 netstat.b

# diff -u netstat.a  netstat.b 
--- netstat.a   2014-06-25 09:58:29.000000000 +0200
+++ netstat.b   2014-06-25 09:59:16.000000000 +0200
@@ -29,61 +29,61 @@
        0 membership reports received for groups to which we belong
        0 membership reports sent
 tcp:
-       36143366 packets sent
-               6650618 data packets (13929288006 bytes)
+       36178462 packets sent
+               6684489 data packets (14022704288 bytes)
                4682 data packets (4775110 bytes) retransmitted
-               17455512 ack-only packets (27272960 delayed)
+               17456129 ack-only packets (27274542 delayed)
                0 URG only packets
                0 window probe packets
-               12029336 window update packets
-               8939 control packets
+               12029943 window update packets
+               8947 control packets
                0 send attempts resulted in self-quench
-       48058930 packets received
-               3572420 acks (for 13901480535 bytes)
-               50912 duplicate acks
+       48083472 packets received
+               3589267 acks (for 13994738366 bytes)
+               51243 duplicate acks
                5 acks for unsent data
-               42343900 packets (64839052960 bytes) received in-sequence
-               13840 completely duplicate packets (852212 bytes)
+               42346063 packets (64885718854 bytes) received in-sequence
+               13866 completely duplicate packets (852212 bytes)
                1651 old duplicate packets
                781 packets with some dup. data (277997 bytes duped)
-               955822 out-of-order packets (1113010976 bytes)
+               955823 out-of-order packets (1113010976 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
-               496160 window update packets
-               9207 packets received after close
+               499885 window update packets
+               9215 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
-       776 connection requests
-       17510 connection accepts
-       18155 connections established (including accepts)
-       18553 connections closed (including 1589 drops)
+       780 connection requests
+       17520 connection accepts
+       18167 connections established (including accepts)
+       18570 connections closed (including 1591 drops)
        7 embryonic connections dropped
        0 delayed frees of tcpcb
-       3522932 segments updated rtt (of 1688436 attempts)
+       3539781 segments updated rtt (of 1692544 attempts)
        2935 retransmit timeouts
                146 connections dropped by rexmit timeout
        1 persist timeout (resulting in 0 dropped connections)
        176 keepalive timeouts
                0 keepalive probes sent
                0 connections dropped by keepalive
-       86368 correct ACK header predictions
-       41338655 correct data packet header predictions
-       48297 PCB hash misses
-       249 dropped due to no socket
+       86473 correct ACK header predictions
+       41340354 correct data packet header predictions
+       48346 PCB hash misses
+       255 dropped due to no socket
        3 connections drained due to memory shortage
        204 PMTUD blackholes detected
-       859 bad connection attempts
-       17900 SYN cache entries added
+       871 bad connection attempts
+       17910 SYN cache entries added
                0 hash collisions
-               17510 completed
+               17520 completed
                0 aborted (no space to build PCB)
                45 timed out
                0 dropped due to overflow
                0 dropped due to bucket overflow
                345 dropped due to RST
                0 dropped due to ICMP unreachable
-               17855 delayed free of SYN cache entries
+               17865 delayed free of SYN cache entries
        307 SYN,ACKs retransmitted
        129 duplicate SYNs received for entries already in the cache
        0 SYNs dropped (no route or no space)
@@ -93,18 +93,18 @@
        0 packets with ECN CE bit
        0 packets ECN ECT(0) bit
 udp:
-       111819 datagrams received
+       111837 datagrams received
        0 with incomplete header
        0 with bad data length field
        0 with bad checksum
        417 dropped due to no socket
        0 broadcast/multicast datagrams dropped due to no socket
        0 dropped due to full socket buffers
-       111402 delivered
-       70739 PCB hash misses
-       114114 datagrams output
+       111420 delivered
+       70752 PCB hash misses
+       114132 datagrams output
 ip:
-       47974139 total packets received
+       47998699 total packets received
        0 bad header checksums
        0 with size smaller than minimum
        0 with data size < data length
@@ -119,13 +119,13 @@
        0 malformed fragments dropped
        0 fragments dropped after timeout
        0 packets reassembled ok
-       47970480 packets for this host
+       47995040 packets for this host
        0 packets for unknown/unsupported protocol
        0 packets forwarded (0 packets fast forwarded)
        0 packets not forwardable
        0 redirects sent
        0 packets no matching gif found
-       36067735 packets sent from this host
+       36102877 packets sent from this host
        108 packets sent with fabricated ip header
        0 output packets dropped due to no bufs, etc.
        0 output packets discarded due to no route
@@ -139,8 +139,8 @@
        0 no route available (output)
        0 generic errors (output)
        0 bundled SA processed (output)
-       48291258 SPD cache lookups
-       48291258 SPD cache misses
+       48315827 SPD cache lookups
+       48315827 SPD cache misses
 
 IPsec ah:
        0 ah input packets processed
@@ -233,7 +233,7 @@
        0 fast forward flows
        0 packets not forwardable
        0 redirects sent
-       206472 packets sent from this host
+       206474 packets sent from this host
        0 packets sent with fabricated ip header
        0 output packets dropped due to no bufs, etc.
        17379 output packets discarded due to no route
@@ -292,61 +292,61 @@
        0 bad redirect messages
        0 path MTU changes
 tcp6:
-       36143366 packets sent
-               6650618 data packets (13929288006 bytes)
+       36178462 packets sent
+               6684489 data packets (14022704288 bytes)
                4682 data packets (4775110 bytes) retransmitted
-               17455512 ack-only packets (27272960 delayed)
+               17456129 ack-only packets (27274542 delayed)
                0 URG only packets
                0 window probe packets
-               12029336 window update packets
-               8939 control packets
+               12029943 window update packets
+               8947 control packets
                0 send attempts resulted in self-quench
-       48058930 packets received
-               3572420 acks (for 13901480535 bytes)
-               50912 duplicate acks
+       48083472 packets received
+               3589267 acks (for 13994738366 bytes)
+               51243 duplicate acks
                5 acks for unsent data
-               42343900 packets (64839052960 bytes) received in-sequence
-               13840 completely duplicate packets (852212 bytes)
+               42346063 packets (64885718854 bytes) received in-sequence
+               13866 completely duplicate packets (852212 bytes)
                1651 old duplicate packets
                781 packets with some dup. data (277997 bytes duped)
-               955822 out-of-order packets (1113010976 bytes)
+               955823 out-of-order packets (1113010976 bytes)
                0 packets (0 bytes) of data after window
                0 window probes
-               496160 window update packets
-               9207 packets received after close
+               499885 window update packets
+               9215 packets received after close
                0 discarded for bad checksums
                0 discarded for bad header offset fields
                0 discarded because packet too short
-       776 connection requests
-       17510 connection accepts
-       18155 connections established (including accepts)
-       18553 connections closed (including 1589 drops)
+       780 connection requests
+       17520 connection accepts
+       18167 connections established (including accepts)
+       18570 connections closed (including 1591 drops)
        7 embryonic connections dropped
        0 delayed frees of tcpcb
-       3522932 segments updated rtt (of 1688436 attempts)
+       3539781 segments updated rtt (of 1692544 attempts)
        2935 retransmit timeouts
                146 connections dropped by rexmit timeout
        1 persist timeout (resulting in 0 dropped connections)
        176 keepalive timeouts
                0 keepalive probes sent
                0 connections dropped by keepalive
-       86368 correct ACK header predictions
-       41338655 correct data packet header predictions
-       48297 PCB hash misses
-       249 dropped due to no socket
+       86473 correct ACK header predictions
+       41340354 correct data packet header predictions
+       48346 PCB hash misses
+       255 dropped due to no socket
        3 connections drained due to memory shortage
        204 PMTUD blackholes detected
-       859 bad connection attempts
-       17900 SYN cache entries added
+       871 bad connection attempts
+       17910 SYN cache entries added
                0 hash collisions
-               17510 completed
+               17520 completed
                0 aborted (no space to build PCB)
                45 timed out
                0 dropped due to overflow
                0 dropped due to bucket overflow
                345 dropped due to RST
                0 dropped due to ICMP unreachable
-               17855 delayed free of SYN cache entries
+               17865 delayed free of SYN cache entries
        307 SYN,ACKs retransmitted
        129 duplicate SYNs received for entries already in the cache
        0 SYNs dropped (no route or no space)
@@ -372,8 +372,8 @@
        0 no route available (output)
        0 generic errors (output)
        0 bundled SA processed (output)
-       48291258 SPD cache lookups
-       48291258 SPD cache misses
+       48315827 SPD cache lookups
+       48315827 SPD cache misses
 
 IPsec ah:
        0 ah input packets processed
@@ -467,11 +467,11 @@
        0 delivered
        0 datagrams output
 arp:
-       231 packets sent
+       232 packets sent
                0 reply packets
-               231 request packets
-       848 packets received
-               230 reply packets
+               232 request packets
+       849 packets received
+               231 reply packets
                618 valid request packets
                618 broadcast/multicast packets
                0 packets with unknown protocol type
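
For repeated runs, the unified diff gets long.  The before/after
comparison can be reduced to one line per changed counter with something
like the sketch below.  It assumes both snapshots have the same line
layout (an extra histogram line appearing in the second file would shift
everything after it), and it only looks at the number at the start of
each line, so values in parentheses such as "(including 1589 drops)" and
the "name: value" histogram lines are not tracked.  The function name
netstat_delta is made up for this example.

```sh
#!/bin/sh
# netstat_delta BEFORE AFTER
# Prints "<signed delta><TAB><counter text>" for every line whose leading
# counter differs between two saved "netstat -s" snapshots.
# Assumption: both files have identical line layout, so line N in BEFORE
# describes the same counter as line N in AFTER.
netstat_delta() {
    awk '
        # First file: remember the first field of every line by line number.
        NR == FNR { before[FNR] = $1; next }
        # Second file: report numeric leading counters that have changed.
        $1 ~ /^[0-9]+$/ && before[FNR] ~ /^[0-9]+$/ && $1 != before[FNR] {
            delta = $1 - before[FNR]
            $1 = ""
            sub(/^ +/, "")
            printf "%+d\t%s\n", delta, $0
        }
    ' "$1" "$2"
}
```

With the two files from above this would be run as
"netstat_delta netstat.a netstat.b".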

