Subject: Re: splx() optimization [was Re: SMP re-entrancy in "bottom half" drivers]
To: Jason Thorpe <thorpej@shagadelic.org>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 06/10/2005 16:06:14
In message <60690920-E90A-4F2F-B208-427E9D4E89B8@shagadelic.org>,
Jason Thorpe writes:
>
>On Jun 9, 2005, at 11:59 PM, Daniel Carosone wrote:
>
>> I rather like the way Xen's request-ring structure works, for solving
>> a not-dissimilar problem of abstracting hardware driver interfaces
>> across the VMM while minimising contention between producers and
>> consumers (two sets of ring-state pointers). Perhaps something
>> similar would be suitable for feeding hardware interrupt results into
>> within-kernel_lock code.
>
>This is also not dissimilar to how host -> NIC communication rings
>work. Unfortunately, for them to work reliably, you still need inter-
>processor synchronization ... just not locks; instead, you need to
>make sure that memory barrier operations are performed in the right
>places.

But it's far from clear whether this will help or hurt.

Look, here you're basically proposing trading a synchronization op (at
whatever hypothetical granularity we need synchronization) for a
memory barrier. There are microarchitectures (CPU pipelines, if you
will) where an atomic memory op acts on a single cache line and doesn't
imply any memory barrier. If your pipeline is also out-of-order (OOO),
memory barriers could be _way_ more expensive.
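
To make the trade concrete, here is a minimal sketch of the kind of
single-producer/single-consumer ring being discussed, written with C11
atomics. Every name in it (struct ring, ring_put, ring_get, RING_SIZE) is
made up for illustration; none of it is taken from NetBSD, Xen, or any NIC
driver. The release/acquire pairs mark the places where the ordering has
to be enforced.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u    /* hypothetical size; must be a power of two */

struct ring {
    int slot[RING_SIZE];
    _Atomic unsigned head;  /* advanced only by the producer */
    _Atomic unsigned tail;  /* advanced only by the consumer */
};

/* Producer side: returns false if the ring is full. */
static bool
ring_put(struct ring *r, int v)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire: pairs with the consumer's release of 'tail'. */
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return false;
    r->slot[head & (RING_SIZE - 1)] = v;
    /* Release: the slot store must be visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool
ring_get(struct ring *r, int *vp)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire: pairs with the producer's release of 'head'. */
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *vp = r->slot[tail & (RING_SIZE - 1)];
    /* Release: the slot read must finish before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

No lock is taken anywhere, but each put and each get still pays for one
acquire and one release. What those cost relative to a lock depends
entirely on the CPU, which is exactly the open question above.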

I dunno about anyone else, but I'm getting the feeling we keep hashing
and rehashing what is, basically, college-level material, without
actually *getting* anywhere. I injected some numbers partly to try
and lift the discussion out of that rut; but here, that doesn't seem
to have worked. Sigh.