Subject: Re: FIONWRITE proposal
To: NetBSD Kernel Technical Discussion List <tech-kern@NetBSD.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 10/20/2004 10:06:44

On Tue, Oct 19, 2004 at 08:46:08PM -0500, David Young wrote:
> On Mon, Oct 18, 2004 at 01:53:26PM -0700, Bill Studenmund wrote:
> > The problem is that the reservation would need some sort of ID. The
> > problem with your suggestion as-is is that other writes can get injected
> > between the FIONSPACE call and the write(2) which we wanted to get the
> > reserved space. So even though FIONSPACE reported room, the write that we
> > wanted to have work won't.
>
> I had assumed you would do the ioctl and write atomically.  I don't see
> how the other discussed ioctls avoid this problem.

They don't avoid this issue; however, they provide enough information that
my application can make a choice as to how to proceed.
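
To make the "check, then decide" pattern concrete, here is a minimal sketch
in C. It assumes FIONSPACE is exposed by <sys/ioctl.h> and takes a pointer
to int, the way FIONREAD does; where it is not defined, the helper just
reports ENOTSUP. The names writev_if_room() and iov_total() are hypothetical,
made up for illustration. Note that the sketch does nothing about the race
discussed above: another write can still land between the ioctl and the
writev(2).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Total number of bytes described by an iovec array. */
size_t
iov_total(const struct iovec *iov, int iovcnt)
{
	size_t total = 0;

	assert(iovcnt >= 0 && (iov != NULL || iovcnt == 0));
	for (int i = 0; i < iovcnt; i++)
		total += iov[i].iov_len;
	return total;
}

/*
 * Hypothetical helper: issue the writev(2) only if the send queue
 * currently reports enough free space for the whole request.
 *
 * Returns the writev(2) result on an attempt, or -1 with errno set to
 * EAGAIN (not enough room right now) or ENOTSUP (no FIONSPACE here).
 * The check and the write are NOT atomic.
 */
ssize_t
writev_if_room(int fd, const struct iovec *iov, int iovcnt)
{
#ifdef FIONSPACE
	int space = 0;	/* assumption: int argument, like FIONREAD */

	if (ioctl(fd, FIONSPACE, &space) == -1)
		return -1;
	if (space < 0 || (size_t)space < iov_total(iov, iovcnt)) {
		errno = EAGAIN;	/* caller decides how to proceed */
		return -1;
	}
	return writev(fd, iov, iovcnt);
#else
	(void)fd;
	(void)iov;
	(void)iovcnt;
	errno = ENOTSUP;
	return -1;
#endif
}
```

On EAGAIN the caller can queue the data, pick another connection, or fall
back to a blocking write; that choice is the information the ioctl buys.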

> I can see the virtue in an API that you can re-use with ttys, files,
> but a trivial modification to the socket API won't do?  I believe it was
> Charles Hannum who suggested on another venue that you could add a flag
> to sendto(2) that has the desired semantics---MSG_ATOMIC?

No.

Those aren't the semantics I desire. Those are the semantics that Charles
thinks I should desire and that should do what I want. I do not think they
will.

Problems:

1) It hasn't mattered until this point in the discussion, but I actually
use writev(2) for the write, so we would need a sendv(2) system call.
Thus this proposal has already gotten more complicated: it needs a new
system call.

2) There is a fair bit of connection state that would need unwinding in
case of a failure. It also would all be special-cased just for this
scenario. Remember that the common case wants to just block and wait, and
if there's an error, it means the TCP connection is going down (and thus
no unwinding is needed). Also, as I mentioned earlier, there is a mutex which
is locked while the writing thread is in the kernel. As blocking on that
mutex is as bad as blocking in the write, the only place where the
"ATOMIC" flag would help would be if the preceding write had exactly
filled the send queue. It would have returned, yet there would be no space.
Otherwise, I end up sleeping on the mutex, which is as bad as blocking on
the write.

Take care,

Bill
