Subject: Re: postfix broken by AF_LOCAL semantics change
To: <>
From: David Laight <david@l8s.co.uk>
List: tech-kern
Date: 11/29/2003 21:47:42

On Sat, Nov 29, 2003 at 04:11:13PM -0500, der Mouse wrote:
> >> Everyone agrees that connect() can block, right?
> > It doesn't work that way on virtually any Unix --
> 
> Whether it works that way or not is semi-irrelevant.  I'm with Jaromir
> on this: if that's not how it's documented as working, code that
> assumes it works that way is broken and needs fixing.  (Or, at the very
> minimum, the documentation needs fixing.)
> 
> Is there a standard specifying how AF_LOCAL sockets work (POSIX maybe)?
> Does it say anything about this?

For TCP, POSIX defines a 'backlog' count: the number of connections
that can be completed (i.e. a SYN/ACK sent) without the server application
having called accept().  I thought the number was set by setsockopt(),
but can't find any references.  However, one of the X/Open test suites
used to (try to) test the limit [1].
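
For reference, in the sockets API the backlog is passed as the second
argument to listen(), not set through a socket option.  The following
is a minimal sketch of that in Python, assuming a loopback TCP listener
on an ephemeral port.  The helper name queue_then_accept and the values
(backlog 5, 3 clients) are made up for illustration.  It queues several
connections before the server calls accept() at all:

```python
import socket

def queue_then_accept(backlog=5, clients=3):
    """Open `clients` TCP connections before the server ever calls accept()."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    srv.listen(backlog)             # the backlog hint is given here
    port = srv.getsockname()[1]

    # Each connect() is expected to return while the server is idle,
    # as long as the pending queue has room.
    pending = [socket.create_connection(("127.0.0.1", port), timeout=5)
               for _ in range(clients)]

    # Only now does the "server" drain the queue.
    srv.settimeout(5)
    accepted = 0
    for _ in range(clients):
        conn, _addr = srv.accept()
        conn.close()
        accepted += 1

    for c in pending:
        c.close()
    srv.close()
    return accepted
```

How many connections a given kernel really queues for a given backlog
value is implementation specific, so the sketch keeps the client count
well below the requested backlog.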

At least one company (well known to many of us, and for whom I've
never worked) was failing that test because they were using more
complicated algorithms to avoid problems on large servers.

Such things as:
- discarding SYN packets when the queue is full.
- deferring any processing of the SYN packet when the queue is full
  until a predetermined timeout.
spring to mind.
Both have the effect of reducing the amount of traffic and the amount
of data that must be held for each partial connection.

The size of the queue (so_backlog) might be dynamically changed depending
on the average rate at which connections are accepted.

Under 'no load' conditions, connect() should complete before the
server side calls accept().  However, under load, I believe it is allowed
to block waiting on the server process.
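
The 'no load' case can be checked from user space.  What follows is
only a sketch in Python, assuming a system where AF_LOCAL (AF_UNIX)
stream sockets queue pending connections.  The function name and the
temporary socket path are hypothetical.  The client connects and writes
before the server calls accept(); on a system where connect() blocks
until accept(), the 5 second timeout would fire instead:

```python
import os
import socket
import tempfile

def connect_before_accept():
    """Connect and send on an AF_UNIX socket before the server accepts."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "sock")   # hypothetical socket path
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        srv.bind(path)
        srv.listen(1)

        cli.settimeout(5)      # fail rather than hang if connect() blocks
        cli.connect(path)      # server has not called accept() yet
        cli.sendall(b"ping")   # data is buffered for the pending connection

        conn, _ = srv.accept() # the server catches up afterwards
        data = conn.recv(4)
        conn.close()
        return data
    finally:
        cli.close()
        srv.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)
```

This says nothing about the loaded case, where the pending queue is
full and the behaviour depends on the implementation.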

	David

[1] I was at the XNET meeting when this came up....

-- 
David Laight: david@l8s.co.uk