tech-kern archive


Re: Writing to multiple descriptors with one system call

On Wed, 17 Mar 2010 16:22:44 +0000
Sad Clouds <> wrote:

> On Wed, 17 Mar 2010 16:01:28 +0000
> Quentin Garnier <> wrote:
> > Do you have a real world use for that?  For instance, I wouldn't call
> > a web server that sends the same data to all its clients *at the same
> > time* realistic.
> >
> Why? Because it never happens? I think it happens quite often. Another
> example is a server that is sending live data, i.e. audio playback,
> video stream, etc. If you can't use multicasting over a WAN, then you
> have a situation where you are streaming the same data to large number
> of clients.

In the past I wrote a custom httpd which read broadcast security camera
frames from the LAN and rebroadcast them to connected HTTP clients, and
since clients remain connected with keep-alive, it had to iterate
through the connections to send out new packets.

However, clients which cannot cope with the sending speed are
"throttled" so that some packets are skipped, which makes things a
little more complex than simply using a "send this message to all
connections" primitive.
kqueue(2)/kevent(2) were used for polling; in my case, however, the
available bandwidth was always the bottleneck.

I also have a question: did your test really use non-blocking sockets
for writing, along with an efficient polling mechanism like kqueue or
libevent, disabling write polling when the sendq is empty, re-enabling
it when there is data to send, and only writing when a poll event
indicates that the socket is writable?  Otherwise, I assume that the
LWP would block in write(2).

If a "broadcast" writev(2)-to-multiple-FDs variant existed, it would
probably have to present an interface similar to that of kevent, or be
tied in as a new kqueue filter, because of the FD-specific
errors/events...  libevent, for instance, also supports transfer buffer
queues and could possibly be adapted to support such a feature.
However, I'm also unsure whether this would really help, or just move
some userland and syscall overhead into the kernel for similar overall
performance.  A test implementation might indeed be needed, to really
know :(
