Re: Writing to multiple descriptors with one system call
On 03/17/10 17:53, Sad Clouds wrote:
> 1. Accepting connections
> On a busy web server it is likely that the listening socket will have
> multiple entries on the completed connection queue (i.e. established
> state). You call accept() in a loop until you get EWOULDBLOCK and since
> these are new connections, the chances are all these sockets contain
> HTTP GET requests for /index.html
> Instead of calling write() 50 times for all 50 new sockets, you could
> just call write2v() once.
So basically, between the first accept(2) and the last, all the clients
are waiting for input, which they will only get (sequentially) once you
perform your write2v() after the _last_ accept(2).
This means that the first accepted descriptor has to wait for the 50th
to be accepted. That seems inefficient to me.
Food for thought: how do you expect your write2v() call to handle
blocking vs. non-blocking I/O? If one of the descriptors would block,
should the write2v() call return anyway?
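
To make the question concrete, here is a rough user-space stand-in for such a call. It is only a sketch: write_each() and struct fd_result are invented for this illustration and are not the proposed write2v() interface. The point is that every descriptor can end up differently (full write, short write, or -1 with EAGAIN on a non-blocking socket whose buffer is full), so a single return value cannot describe the outcome.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-descriptor outcome; not part of any real API. */
struct fd_result {
	ssize_t	written;	/* bytes written, or -1 on failure */
	int	error;		/* errno for this descriptor, 0 if none */
};

/*
 * Write the same buffer to each descriptor in turn and record what
 * happened to each one. Returns how many descriptors took the whole
 * buffer. A blocking descriptor stalls the loop for everyone behind it.
 */
size_t
write_each(const int *fds, size_t nfds, const void *buf, size_t len,
    struct fd_result *res)
{
	size_t complete = 0;

	assert(fds != NULL && buf != NULL && res != NULL);
	for (size_t i = 0; i < nfds; i++) {
		ssize_t n = write(fds[i], buf, len);

		res[i].written = n;
		res[i].error = (n == -1) ? errno : 0;
		if (n >= 0 && (size_t)n == len)
			complete++;
	}
	return complete;
}
```

The caller still has to walk the result array and retry the descriptors that failed or were only partly written, which is the bookkeeping a write() loop already does.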
> 2. Sending similar requests
> When the server is handling a large number of connections, there is a
> pretty good chance that some of those connections will request the same
> data for the same popular web resources. You have a queue of active
> connections and every time you go in a loop servicing those
> connections, you check your cache for valid data. Whenever you have a
> cache hit, you mark it and aggregate multiple requests for the same
> file into a single reply queue. Again, instead of calling write()
> multiple times, you could issue a single system call to write a set of
> buffers to multiple sockets.
You make one client wait for the others; this relies on an unverifiable
assumption (predicting the future?) and delays things. Human beings are
sensitive to unpleasant lag.
> This might make a big difference to the overall system time and
> dramatically reduce load.
If you want to achieve such "parallelism", just play with mmap(), fork()
and threads. The kernel will do the caching for you (if your resource is
requested often enough, I can't see how the LRU behind it would discard
it...), and the only induced overhead would be the context switch for
the write(2) syscall (depends on the reentrancy of the OS you use).
Should the context switch overhead become unpleasant for you: roll your
own in-kernel server system, and syscalls will become function calls.
And pray that you don't have too many buffer overflows (or any other
exploitable bugs) flying around your code.
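
A minimal sketch of the mmap() plus write(2) idea, assuming a regular, non-empty file. send_mapped() is a name invented here, and error handling is kept to the bare minimum. The file is mapped read-only and written straight from the mapping, so the kernel's page cache is the only cache involved.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a file read-only and write(2) it to out_fd
 * directly from the mapping. Returns 0 once the whole file has been
 * written, -1 otherwise (an empty file counts as an error here).
 */
int
send_mapped(const char *path, int out_fd)
{
	struct stat st;
	int rc = -1;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	if (fstat(fd, &st) == 0 && st.st_size > 0) {
		size_t len = (size_t)st.st_size;
		char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

		if (p != MAP_FAILED) {
			size_t off = 0;

			/* Keep writing until done or write(2) gives up. */
			while (off < len) {
				ssize_t n = write(out_fd, p + off, len - off);

				if (n <= 0)
					break;
				off += (size_t)n;
			}
			if (off == len)
				rc = 0;
			munmap(p, len);
		}
	}
	close(fd);
	return rc;
}
```

Each worker (process or thread) can call this for its own client; repeated requests for the same popular file are then served from pages that are already in memory.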