tech-kern archive


Re: select/poll optimization



On Fri, Mar 21, 2008 at 07:56:53PM +0900, YAMAMOTO Takashi wrote:

> > On Thu, Feb 28, 2008 at 03:00:44PM +0000, Mindaugas Rasiukevicius wrote:
> > 
> > > As discussed with Andrew, here is the per-thread approach with the array
> > > of descriptors to store state:
> 
> is there any benchmark result to compare these approaches?

Updated percpu/combined patches:

        http://www.netbsd.org/~ad/percpu-select.diff
        http://www.netbsd.org/~ad/combined-select.diff

Results from an 8-core box. The perthread patch crashed because its 'selfd'
handling is incorrect (fixed in the combined patch). Note that the combined
patch also still has bugs.

percpu:         744993 / 60 = 12416.550000
HEAD:           502141 / 60 = 8369.016667
combined:       224090 / 60 = 3734.833333
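
To put the three figures side by side, the sketch below recomputes the
rate for each run and its ratio to HEAD. It assumes the divisor 60 is the
run length in seconds and that the counts are directly comparable; neither
is spelled out above. The function names are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* Rate for one run: total count divided by the run length.
 * The run length of 60 is taken from the figures above; treating it
 * as seconds is an assumption. */
double rate(unsigned long count, unsigned int seconds)
{
    assert(seconds > 0);
    return (double)count / (double)seconds;
}

/* Ratio of one run's count to the HEAD count.
 * A value above 1.0 means more work was completed than on HEAD. */
double relative_to_head(unsigned long count, unsigned long head)
{
    assert(head > 0);
    return (double)count / (double)head;
}

/* Print the three runs with their rate and ratio to HEAD. */
void print_comparison(void)
{
    const unsigned long head = 502141UL;
    const struct {
        const char *name;
        unsigned long count;
    } runs[] = {
        { "percpu",   744993UL },
        { "HEAD",     502141UL },
        { "combined", 224090UL },
    };
    size_t i;

    for (i = 0; i < sizeof(runs) / sizeof(runs[0]); i++) {
        printf("%-10s %12.2f  x%.2f\n", runs[i].name,
            rate(runs[i].count, 60),
            relative_to_head(runs[i].count, head));
    }
}
```

By this arithmetic percpu completes roughly 1.5 times the HEAD count and
combined a little under half of it.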

lockstat output for the above. I have changes that make descriptor access
lockless so a lot of the noise will disappear shortly.

        http://www.netbsd.org/~ad/percpu.txt
        http://www.netbsd.org/~ad/head.txt
        http://www.netbsd.org/~ad/combined.txt

> >     http://www.netbsd.org/~ad/combined-select.diff
> 
> >     for (sf = sl->sl_fd; sf < sl->sl_fd + sl->sl_fdcount; sl++) {
> 
> sf++ ?

Yes, found that one - thanks. After spending a lot of time on it I'm firmly
convinced that the two per-thread approaches are the wrong way to handle
this. They involve a lot more synchronization, they're really complicated
and there would be additional cache pressure within the polling code
because of all the extra data processing involved. The per-thread way
handles collisions better, but I don't see how that is a major incentive.
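
For reference, the loop bug acknowledged above (incrementing sl instead of
sf) can be illustrated with a minimal sketch. Only the names sl, sf, sl_fd
and sl_fdcount come from the quoted line; the struct layouts, the sf_ready
field and the count_ready() function are hypothetical stand-ins, not code
from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-descriptor entry; the real type is defined in the patch. */
struct selfd_sketch {
    int sf_fd;      /* descriptor number (assumed field) */
    int sf_ready;   /* nonzero when an event is pending (assumed field) */
};

/* Hypothetical per-thread list holding an array of entries. */
struct sellist_sketch {
    struct selfd_sketch *sl_fd;  /* first element of the array */
    size_t sl_fdcount;           /* number of valid elements */
};

/* Count the entries marked ready by walking the array. */
size_t count_ready(const struct sellist_sketch *sl)
{
    const struct selfd_sketch *sf;
    size_t n = 0;

    assert(sl != NULL);

    /* The cursor sf must advance. Incrementing sl instead leaves sf
     * fixed and moves the bounds expression to unrelated memory. */
    for (sf = sl->sl_fd; sf < sl->sl_fd + sl->sl_fdcount; sf++) {
        if (sf->sf_ready)
            n++;
    }
    return n;
}
```

With sl++ in place of sf++, the loop condition is re-evaluated against a
different sl on every pass while sf never moves, so termination depends on
whatever happens to lie beyond the list.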

Thanks,
Andrew

