tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: select/poll optimization

On Thu, Feb 28, 2008 at 03:00:44PM +0000, Mindaugas Rasiukevicius wrote:
> As discussed with Andrew, here is the per-thread approach with the array of
> descriptors to store state:

Another unreadble diff .....
It would be easier to show us the new function in its entirity.
Or at least order the functions so that diff doesn't compare two
different functions.

> My arguments for this way:
> - Locking is more fine-grained, and I think it would perform better in systems
>   with >= 8 CPUs. That is a long-term solution.
> - Only selector LWP is awoken, there are no case of collision like in current
>   implementation, when all selecting LWPs are awoken to re-check the state.
> - Our second goal is to avoid calling selscan/pollscan twice, for that we
>   need per-thread storage anyway.

I'm not sure the collision case is worth optimising for. I've always
liked the fact that our poll/select code manages not to have to link
a per-caller memory block onto each driver area.

Although you save the 2nd selscan/pollscan call, you have to unlink
the data blocks - which is arguably a more expensive operation - at
the end of the call.

You need to allow for more than one event being returned (for a single
fd) since it can be a while before the process returns.

Since the size of the array passed to poll() is unlimited, you are
allowing a user process to allocate an unbounded amount of kernel
memory. (There is a long-standing bug about the fact we bound the
array to RLIMIT_NOFILE, in fact we should probably process the
array in chunks.)

In the uncontested case the signalled event could be written into
the per-device area - saving the driver code some work.

Haven't you added another mutex?
The last version I (tried to) read relied on the driver mutex for
one of the structures.  This was acquired twice per fd.
I think you acquire the driver mutex once, and another mutex twice.


David Laight:

Home | Main Index | Thread Index | Old Index