tech-net archive


Re: TCP connections clogging up accf_http(9) (was: ESTABLISHED sockets with no fd open? (was: generating ECONNRESET with no syscall?))



    Date:        Thu, 19 May 2016 22:51:23 -0400
    From:        Thor Lancelot Simon <tls%panix.com@localhost>
    Message-ID:  <20160520025123.GA9344%panix.com@localhost>

Now that Thor's mail (quite a while ago, I deferred sending this)
has really arrived (don't ask!) ...

  | The accept filter isn't really "occupied".  It has no local state and
  | is just called from soisconnected() when an event happens on the socket
  | (e.g. data is received).  Sockets just sit on so->so_q0 for the listen
  | socket until the accept filter lets them go.

Since, thanks to Michael van Elst, we now know this is almost certainly the
issue, perhaps that is where a timeout needs to be in general.  Nothing should
live on that queue for more than a few minutes, ever - whether the cause
is just a buggy server that isn't bothering to accept() when it could,
or a filter preventing accept() from receiving the connection, nothing should
ever be left in limbo on the listen queue for very long.

  | If you see where it is, the fix should be to dequeue the oldest
  | filtered connection on the listen socket

I don't think I'd do it that way.   99% of applications that are filtering
incoming connections are not going to simply want to delay them to a point
where they then have to delay again.

That is, if a filter is waiting for data (accf_data) and no data has
arrived, then all the application can do is wonder why the filter failed,
and wait for data itself.   Obviously it can set a timeout and abort the
connection after a while - but it has no way of knowing that the connection
has already been queued in the kernel for hours (or for how long: it might
be just fractions of a second if the incoming connection request rate is very
high and the queue fills quickly).

Similarly for an incomplete but not invalid http request via accf_http.
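
To make the application-side option concrete, here is a minimal sketch of
"wait for data itself, with a timeout" on a connection that accept() has
already returned.  The function name wait_for_first_data() and its return
convention are made up for illustration; only poll() and close() are real
interfaces.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for the first bytes on an already-accepted
 * connection.  Returns 1 if data (or EOF) is readable, 0 if the timeout
 * expired and the descriptor was closed, -1 if poll() failed (the
 * descriptor is left open in that case).
 *
 * The timer starts here.  Whatever time the connection already spent
 * queued in the kernel is invisible at this point.
 */
static int
wait_for_first_data(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -1;
	if (n == 0) {
		/* Nothing arrived in time: give up on this connection. */
		close(fd);
		return 0;
	}
	return 1;
}
```

A server would call this right after accept(), which is exactly the second
delay described above: the filter has already waited once, and now the
application waits again without knowing how long the first wait was.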

In those cases (which is currently, I think, all cases) the better solution
is just to reset the connection inside the kernel, so that the application
never discovers it was ever attempted.   As I understand it, by this
stage it is too late to simply ignore it as we would do if the listen
queue is full when the SYN request arrives (encouraging the client to
try again in a few seconds, by which time we hope that the queue will
have drained.)

In the case of a server that is broken, and not accepting connections, that's
also clearly what is needed.

But if a new (presumed new) setsockopt() option were added that allows a
server to control the timeout for filtered connections, it should also allow
it to decide between rejecting and accepting old pending connections - in
case it wants to log them, or take more drastic countermeasures in the event
it appears as if it might be an actual attack attempt (like installing a bpf
filter to block packets from the source of repeated attempts).

I assume it ought to be possible to use the packet arrival time in the mbuf
header to work out how long a connection has been pending in the queue?
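
Roughly what I have in mind, as a userland model only and not kernel code:
each pending entry carries an arrival time, and a sweep applies a
per-listener policy to anything older than the limit.  Every name here
(stale_policy, pending_conn, sweep_stale) is hypothetical; no such
setsockopt() option or kernel structure exists as far as I know, and the
arrival time is assumed to be available as a plain time_t.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical policy for connections that outlive the filter timeout. */
enum stale_policy {
	STALE_RESET,	/* reset in the kernel; application never sees it */
	STALE_DELIVER	/* hand to accept() anyway, e.g. so it can be logged */
};

enum { CONN_PENDING, CONN_RESET, CONN_DELIVERED };

/* Hypothetical stand-in for one entry held back by an accept filter. */
struct pending_conn {
	time_t	arrival;	/* when the connection was queued */
	int	state;		/* CONN_PENDING, CONN_RESET or CONN_DELIVERED */
};

/*
 * Apply the policy to every still-pending entry that has been queued
 * for at least max_age seconds.  Returns how many entries were aged out.
 */
static size_t
sweep_stale(struct pending_conn *q, size_t n, time_t now, time_t max_age,
    enum stale_policy policy)
{
	size_t i, aged = 0;

	for (i = 0; i < n; i++) {
		if (q[i].state != CONN_PENDING)
			continue;
		if (now - q[i].arrival < max_age)
			continue;
		q[i].state = (policy == STALE_RESET) ?
		    CONN_RESET : CONN_DELIVERED;
		aged++;
	}
	return aged;
}
```

With STALE_RESET the application never learns the connection existed; with
STALE_DELIVER it gets the stale connection and can log or block the source.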

kre




