Subject: Re: Proposal: socketfrom()
To: None <tls@rek.tjls.com>
From: Darren Reed <darrenr@netbsd.org>
List: tech-net
Date: 07/05/2007 20:51:06
Thor Lancelot Simon wrote:
> I have an application that makes outbound TCP connections at a very high
> rate, so high that the overhead of additional system calls to set socket
> options considerably impacts performance.
>
> I could partially address this by adding a system call that sets multiple
> socket options at once (which, I think, would be a better API than
> setsockopt() anyway) but that gets rid of _all but one_ system call to
> set up the socket before connect(); I want to get rid of them all.
>
> I'd like to make it possible to set options on one "template" or "master"
> socket and then have them inherited by children, as listen()/accept() make
> possible for the other direction.  I'm thinking of something along the lines
> of this:
>
> int socketfrom(int template, int domain, int type, int protocol);
>
> Which would return a new socket using the socket options already set on
> socket "template".  If domain, type, and protocol don't match, this is
> an error (or perhaps it would be best to omit them entirely and just
> have one argument, the template socket).
>
> Opinions?
>   

Thor, if someone came to you and said they wanted to add this
system call to NetBSD, what would your reflex reaction be?

Wind back the clock n years, or whatever it takes to be at a point
in time where you didn't have this problem.  Would your instinctive
reaction be "yes, add a new system call"?  And if so, would you
accept the proposal given so far?

To put your proposal in proper context:
- you have a specialised app and haven't given us any details about it
- you have a problem that nobody else does (that we're aware of)
AND
- you want to add a new system call to make *it* faster.

Forgive me for being cynical, but I think if some ordinary/unknown
user came along and presented the same case you are, the response
would be a bit different.

That said, it's stupid to ignore the idea that there is a problem
here that needs attention.

The proposal thus far has been to create a system call that clones
a socket - well, almost.  It clones *only* the socket options.  What
would happen if it is called after bind()?  Or even after connect()?
Are addresses copied over too?  If they aren't, is that an intuitive
leap from the API presented?  Or are there different failure modes
introduced if it is called after bind/connect/listen?

How different is the behaviour of one of these calls to using dup(2)
on an unbound/unconnected socket?  Is there some reason that
dup(2) shouldn't work as desired here?  If it did, would that break
applications that use dup(2) today with sockets?

In the case of dup, the usual problem is that it just creates a new
reference (fd) to the same file as the original fd.  By this line of
thinking, creating a socket_dup() or even dup_socket() seems
like a confusing path to take.
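
To make the dup(2) point concrete, here is a minimal sketch (the
function name dup_shares_options() is made up for illustration).  It
sets SO_KEEPALIVE through a dup'd descriptor and then reads the option
back through the original one.  Because both descriptors refer to the
same socket, the change is expected to be visible on both, which is
why dup(2) as it stands cannot serve as a "template" mechanism.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if an option set through a dup(2)'d descriptor is visible
 * through the original descriptor, 0 if it is not, -1 on error.
 */
int
dup_shares_options(void)
{
	int orig, copy, on = 1, val = 0, result = -1;
	socklen_t len = sizeof(val);

	orig = socket(AF_INET, SOCK_STREAM, 0);
	if (orig == -1)
		return -1;

	/* dup(2) gives a second fd for the same underlying socket. */
	copy = dup(orig);
	if (copy == -1) {
		close(orig);
		return -1;
	}

	/* Set the option via the copy, read it back via the original. */
	if (setsockopt(copy, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0 &&
	    getsockopt(orig, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == 0)
		result = (val != 0);

	close(copy);
	close(orig);
	return result;
}
```

A caller would expect dup_shares_options() to return 1: there is one
socket with two names, not two sockets with the same options.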

The idea with most merit that I've seen is to be able to save and
restore socket option state.  Save it into a binary blob using a call
to getsockopt() and apply it to another socket with setsockopt().  I
think there's much more programmability and usefulness with this model
than any other - it lets me apply the socket options to fds that get
passed in from other processes, amongst other things.
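
As a rough illustration of the save/restore model, the sketch below
approximates it in userland with today's API.  No "all options as one
blob" socket option is assumed to exist; the names copy_sockopts(),
copied_opts and copy_sockopts_demo() are hypothetical, and the list of
options is an arbitrary example restricted to int-valued options.
Note that this costs two system calls per option, so it shows only the
shape of the interface, not the performance benefit a kernel-side blob
would give.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/* Hypothetical table of int-valued options to carry over. */
struct opt {
	int level;
	int name;
};

static const struct opt copied_opts[] = {
	{ SOL_SOCKET,  SO_KEEPALIVE },
	{ SOL_SOCKET,  SO_REUSEADDR },
	{ IPPROTO_TCP, TCP_NODELAY  },
};

/*
 * Read each listed option from "from" and apply it to "to".
 * Returns 0 on success, -1 on the first failure (errno is set by
 * getsockopt/setsockopt).  Works on any socket fd, including one
 * received from another process.
 */
int
copy_sockopts(int from, int to)
{
	size_t i;

	for (i = 0; i < sizeof(copied_opts) / sizeof(copied_opts[0]); i++) {
		int val;
		socklen_t len = sizeof(val);

		if (getsockopt(from, copied_opts[i].level,
		    copied_opts[i].name, &val, &len) == -1)
			return -1;
		if (setsockopt(to, copied_opts[i].level,
		    copied_opts[i].name, &val, len) == -1)
			return -1;
	}
	return 0;
}

/*
 * Usage example: set TCP_NODELAY on a template socket, copy the
 * options to a fresh socket, and check that the fresh socket has it.
 * Returns 1 if the option carried over, 0 if not, -1 on error.
 */
int
copy_sockopts_demo(void)
{
	int tmpl, fresh, on = 1, val = 0, ok = -1;
	socklen_t len = sizeof(val);

	tmpl = socket(AF_INET, SOCK_STREAM, 0);
	fresh = socket(AF_INET, SOCK_STREAM, 0);
	if (tmpl != -1 && fresh != -1 &&
	    setsockopt(tmpl, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0 &&
	    copy_sockopts(tmpl, fresh) == 0 &&
	    getsockopt(fresh, IPPROTO_TCP, TCP_NODELAY, &val, &len) == 0)
		ok = (val != 0);

	if (tmpl != -1)
		close(tmpl);
	if (fresh != -1)
		close(fresh);
	return ok;
}
```

Unlike the dup(2) case, the two sockets here stay independent after
the copy; changing an option on one later does not affect the other.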

The only downside from Thor's point of view is that this would still
be 2 calls, and not 1, to create the socket - but still fewer than the
4 or 5 (a guess) he is likely to be using now.  Yes, the socket options
may not be documented well, but if the properly architected solution
to this problem lies down that path, the state of documentation should
not be a barrier.

Comments?

Darren