Subject: Re: splx() optimization [was Re: SMP re-entrancy in "bottom half" drivers]
To: Jason Thorpe <thorpej@shagadelic.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 06/10/2005 16:06:14
In message <60690920-E90A-4F2F-B208-427E9D4E89B8@shagadelic.org>,
Jason Thorpe writes:

>
>On Jun 9, 2005, at 11:59 PM, Daniel Carosone wrote:
>
>> I rather like the way Xen's request-ring structure works, for solving
>> a not-dissimilar problem of abstracting hardware driver interfaces
>> across the VMM while minimising contention between producers and
>> consumers (two sets of ring-state pointers).  Perhaps something
>> similar would be suitable for feeding hardware interrupt results into
>> within-kernel_lock code.
>
>This is also not dissimilar to how host -> NIC communication rings  
>work.  Unfortunately, for them to work reliably, you still need inter- 
>processor synchronization ... just not locks; instead, you need to  
>make sure that memory barrier operations are performed in the right  
>places.

But it's far from clear whether this will help or hurt.

Look, here you're basically proposing trading a synchronization op (at
whatever hypothetical granularity we need synchronization) for a
memory barrier.  There are microarchitectures (CPU pipelines, if you
will) where an atomic memory op acts on a single cache line and doesn't
imply any memory barrier.  If your pipeline is also out-of-order (OOO),
memory barriers could be _way_ more expensive than the atomic op.
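
To make the trade concrete, here is a minimal sketch of the kind of
ring being discussed, assuming exactly one producer and one consumer
and written with C11 atomics.  Every name in it (struct ring,
ring_put, ring_get, RING_SIZE) is hypothetical and is not taken from
Xen, any NIC driver, or the NetBSD tree.  There is no lock, but each
side pays for an acquire load and a release store, and what those cost
depends entirely on the CPU.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u    /* hypothetical size; must be a power of two */

struct ring {
    _Atomic uint32_t head;      /* advanced only by the producer */
    _Atomic uint32_t tail;      /* advanced only by the consumer */
    uint32_t slot[RING_SIZE];
};

/* Producer side: returns false if the ring is full. */
static bool ring_put(struct ring *r, uint32_t v)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire: observe the consumer's progress before reusing a slot. */
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;
    r->slot[head % RING_SIZE] = v;
    /* Release: the slot write must be visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_get(struct ring *r, uint32_t *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire: pairs with the producer's release store to head. */
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *out = r->slot[tail % RING_SIZE];
    /* Release: the slot read must complete before the slot is freed. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Whether the acquire/release pairs above come out cheaper than one
atomic read-modify-write on a lock word is exactly the open question;
the sketch only shows where the barriers have to go, not what they
cost.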

I dunno about anyone else, but I'm getting the feeling we keep hashing
and rehashing what is, basically, college-level material, without
actually *getting* anywhere.  I injected some numbers partly to try
and lift the discussion out of that rut; but here, that doesn't seem
to have worked.  Sigh.