On 03/17/10 17:53, Sad Clouds wrote:
1. Accepting connections

On a busy web server it is likely that the listening socket will have multiple entries on the completed connection queue (i.e. established state). You call accept() in a loop until you get EWOULDBLOCK and since these are new connections, the chances are all these sockets contain HTTP GET requests for /index.html. Instead of calling write() 50 times for all 50 new sockets, you could just call write2v() once.
So basically, between the 1st accept(2) and the last, all the clients are waiting for input (which they will get sequentially, when you perform your write2v() after the _last_ accept(2)).
Which means that the 1st accepted file descriptor has to wait until the 50th has been accepted. Seems inefficient to me.
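A minimal sketch of the alternative, where each connection is answered as soon as it is accepted rather than after the whole queue has been drained. The names set_nonblocking(), reply_and_close() and drain_and_serve() are made up for illustration, and the fixed reply stands in for a real HTTP handler:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical handler: send a canned reply, then close the connection. */
void reply_and_close(int fd)
{
    static const char reply[] = "HTTP/1.0 200 OK\r\n\r\n";

    if (write(fd, reply, sizeof(reply) - 1) == -1) {
        /* a real server would log or retry; ignored in this sketch */
    }
    close(fd);
}

/*
 * Drain the completed-connection queue of a non-blocking listening socket,
 * calling `serve` right after each accept(2) so that the first client does
 * not wait for the last one. Returns the number of connections accepted,
 * or -1 on an unexpected accept(2) error.
 */
int drain_and_serve(int listen_fd, void (*serve)(int fd))
{
    int count = 0;

    assert(serve != NULL);
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);

        if (fd == -1) {
            if (errno == EWOULDBLOCK || errno == EAGAIN)
                return count;   /* queue is empty */
            if (errno == EINTR || errno == ECONNABORTED)
                continue;       /* transient, try the next entry */
            return -1;
        }
        serve(fd);
        count++;
    }
}
```

A caller would typically run drain_and_serve(listen_fd, reply_and_close) each time poll(2) or kqueue(2) reports the listening socket as readable.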
Food for thought: how do you expect your write2v() call to handle blocking vs non-blocking I/O? In the blocking case, should the write2v() call return anyway?
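To make the question concrete, here is a userland stand-in for a "one buffer, many descriptors" call. It is only a sketch: struct multi_write and write_many() are hypothetical and say nothing about how the proposed write2v() would actually be specified. The point is that every descriptor needs its own result, because one may take the whole buffer, another only part of it, and a third may fail with EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-descriptor request/result pair; not an existing API. */
struct multi_write {
    int     fd;      /* destination descriptor */
    ssize_t result;  /* bytes written, or -1 on error */
    int     error;   /* errno value when result == -1, otherwise 0 */
};

/*
 * Attempt one write(2) of the same buffer on each descriptor and record
 * the outcome per descriptor. Returns how many descriptors accepted the
 * whole buffer; short writes and errors are left for the caller to handle.
 */
size_t write_many(struct multi_write *w, size_t n, const void *buf, size_t len)
{
    size_t complete = 0;

    assert(w != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        w[i].result = write(w[i].fd, buf, len);
        w[i].error = (w[i].result == -1) ? errno : 0;
        if (w[i].result == (ssize_t)len)
            complete++;
    }
    return complete;
}
```

With blocking descriptors, a single slow peer stalls the loop and therefore every descriptor after it; with non-blocking ones, the caller still has to walk the results and retry the short or failed writes one by one.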
2. Sending similar requests

When the server is handling a large number of connections, there is a pretty good chance that some of those connections will request the same data for the same popular web resources. You have a queue of active connections and every time you go in a loop servicing those connections, you check your cache for valid data. Whenever you have a cache hit, you mark it and aggregate multiple requests for the same file into a single reply queue. Again, instead of calling write() multiple times, you could issue a single system call to write a set of buffers to multiple sockets.
You make one client wait for the others; this relies on an unverifiable assumption (predicting the future?) and delays things. Human beings are sensitive to lag.
This might make a big difference to the overall system time and dramatically reduce load.
If you want to achieve such "parallelism", just play with mmap(), fork() and threads. The kernel will do the caching for you (if your resource is requested often enough, I can't see how the LRU behind it would discard it...), and the only induced overhead would be the context switch for the write(2) syscall (which depends on the reentrancy of the OS you use).
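A rough sketch of the mmap() route, assuming a blocking descriptor serviced by its own thread or forked child. send_mapped_file() is a made-up helper name; the file pages come from the kernel's cache, so a popular file is not read from disk again on every request.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send the file at `path` to `out_fd` through a read-only mapping.
 * Intended for a blocking descriptor. Returns the number of bytes sent,
 * or -1 on error.
 */
ssize_t send_mapped_file(const char *path, int out_fd)
{
    struct stat st;
    ssize_t sent = 0;
    char *p;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap(2) rejects a zero length */
        close(fd);
        return 0;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close(2) */
    if (p == MAP_FAILED)
        return -1;

    while (sent < st.st_size) {
        ssize_t n = write(out_fd, p + sent, (size_t)(st.st_size - sent));

        if (n == -1 && errno == EINTR)
            continue;           /* interrupted before writing anything */
        if (n <= 0) {
            sent = -1;
            break;
        }
        sent += n;              /* short write: continue from the offset */
    }
    munmap(p, (size_t)st.st_size);
    return sent;
}
```

Each worker pays one write(2) per chunk, which is exactly the overhead discussed above; nothing here requires a new system call.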
Should the context switch overhead become unpleasant for you: roll your own in-kernel server, and syscalls will become function calls.
And pray that you don't have too many buffer overflows flying around your code (or any other exploit).
Cheers :)

-- 
Jean-Yves Migeon
jeanyves.migeon%free.fr@localhost