tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Making global variables of if.c MPSAFE



On Wed, Jul 30, 2014 at 11:47 AM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> 
wrote:
> On Wed, Jul 30, 2014 at 3:33 AM, Dennis Ferguson
> <dennis.c.ferguson%gmail.com@localhost> wrote:
>>
>> On 26 Jul, 2014, at 22:38 , Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
>>> The global variables are read-mostly, so I
>>> replace the mutex with a rwlock and use it
>>> for all. Unfortunately, ifnet_list may be
>>> accessed from interrupt context (only read
>>> though) so that I add a spin mutex for it;
>>> we hold the mutex when we modify ifnet_list
>>> as well as the rwlock.
>>
>> I don't think this is a good way to do this.  I think
>> it would be better to either eliminate the need to
>> access that list from the packet processing path or
>> replace the structure with one which can be read while
>> it is being modified so that readers don't need to take
>> a lock on every access to prevent something (a write)
>> which is hardly ever going to happen anyway.  For the
>> latter it looks to me that most (all?) readers treat
>> the thing as a singly linked list which is about
>> the most straight forward structure to arrange to
>> support lockless readers.  A rwlock might be justifiable
>> for a structure which needs to be frequently written
>> as well as read (though it would be even better not
>> to have the packet processing path access structures
>> like that), but for a read-mostly structure it is way
>> better to eliminate all unnecessary work from the readers
>> even at the cost of greater complexity for the writer.
>
> We may be able to use pserialize(9) instead of rwlock(9)
> for lockless read accesses. Can we simply replace rwlock
> with pserialize?

Self-answer: no

pserialize(9) requires that we don't block (sleep) in
a critical section while rwlock(9) allows to do so.
And there is another undocumented constraint of pserialize,
IIUC. It requires lockless operations on data shared
between readers and a writer. (Correct me if I'm wrong.)

In the case of ifnet_list, both requirements are not
satisfied as it is. Adapting ifnet_list is not so easy
task, and so I think using rwlock(9) is reasonable
at this point.

...or rdlock(9) may be better?
http://thread.gmane.org/gmane.os.netbsd.devel.kernel/46314/focus=46339

  ozaki-r

>
> OTOH, in either case, we need a spin lock for interrupt
> context because neither pserialize nor rwlock can be
> used in interrupt context, IIUC.
>
>>
>> There's a bigger issue here, however.  Code which thinks
>> it is reasonable to scan the entire list of interfaces
>> looking for something is making the assumption that this
>> list is quite short.  That code will fail if this isn't
>> true.  While it is often the case that the number of
>> interfaces is small sometimes it isn't; this is particularly
>> true since, at the protocol level, an "interface" is
>> a software construct that doesn't map one-to-one to hardware
>> so you don't need to be running on a monster piece of
>> hardware (though I've seen those too) to have a lot of
>> "interfaces".  I've seen boxes with a small number of
>> 10 Gbps ethernet ports with >100,000 interfaces
>> configured (think a PPPoE concentrator), and even a
>> single ethernet with a bunch of VLANs can get the "interface"
>> count up to where it really doesn't work to do linear
>> searches.  I don't think there's a good reason to keep
>> code which fails on machines like that so I think at
>> some point it is going to be necessary to rethink what
>> interface structures do, and what the packet processing
>> path needs them to do, so it is no longer necessary to
>> have code like that.
>
> Yes, we have to think of scalability and optimize the code
> at some point. Nevertheless, most codes using IFNET_FOREACH
> in question are not called from packet processing that
> is performance sensitive. So we can optimize them after
> we encounter a performance problem.
>
>   ozaki-r
>
>>
>> Dennis Ferguson


Home | Main Index | Thread Index | Old Index