tech-kern archive


Re: racy access in kern_runq.c



On 06/12/2019 at 10:00, Andrew Doran wrote:
> Hi,
> 
> On Fri, Dec 06, 2019 at 03:52:23PM +0900, Kengo NAKAHARA wrote:
> 
>> There are some racy accesses in kern_runq.c detected by KCSAN.  The
>> messages about these racy accesses are so frequent that they drown out
>> other messages, so I want to fix them.  They can be fixed by the
>> following patch.
> 
> Please don't commit this.  These accesses are racy by design.  There is no
> safety issue and we do not want to disturb the activity of other CPUs in
> this code path by locking them.  We don't want to use atomics either.
> This code path is extremely hot; using atomics would impose a severe
> performance penalty on the system under certain workloads.

With 'worker_ci', there is an actual safety issue, because the compiler could
split the accesses, and not all hardware makes plain accesses atomic by default
the way x86 does. This could cause random page faults, so the access needs to
be strictly atomic.

Apart from that, yes, there is no other safety issue and locking would be too
expensive. All we possibly care about is making sure the accesses aren't split,
for the sake of not basing the scheduling policy on systematic garbage, and
atomic_relaxed seems like the right thing to do (Kengo's 2nd patch),
considering that it costs ~nothing.
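
To illustrate what "not split" means for a pointer like 'worker_ci', here is a
minimal userland sketch using C11 <stdatomic.h> with memory_order_relaxed. It
is only an analogue of the kernel's relaxed accessors, not Kengo's patch: the
struct names, the field layout and the two helper functions are assumptions
made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real kern_runq.c definitions. */
struct cpu_info {
	int ci_index;
};

struct balancer {
	_Atomic(struct cpu_info *) worker;	/* read by other CPUs, no lock */
};

/* Publish a new worker CPU. Relaxed: no ordering, but no torn store. */
static inline void
balancer_set_worker(struct balancer *b, struct cpu_info *ci)
{
	assert(b != NULL);
	atomic_store_explicit(&b->worker, ci, memory_order_relaxed);
}

/* Read the pointer in one piece; the value may already be stale. */
static inline struct cpu_info *
balancer_peek_worker(struct balancer *b)
{
	assert(b != NULL);
	return atomic_load_explicit(&b->worker, memory_order_relaxed);
}
```

A reader gets either the old or the new pointer, never a mix of the two, and
on common architectures a relaxed access to an aligned pointer typically
compiles to an ordinary load or store.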

> Also if you change this to use strong ordering, you're quite likely to
> change the selection behaviour of LWPs.  This is something delicate that
> exhibits reasonable scheduling behaviour in large part through randomness
> i.e. by accident, so serialising it is likely to have strong effects on how
> LWPs are distributed.
> 
> Lastly I have a number of experimental changes to this code which I'll be
> committing in the near future allowing people to change the LWP selection
> policy to see if we can improve performance under different workloads.
> They will also likely show up in KCSAN as racy/dangerous accesses.
> 
> My suggestion is to find a way to teach KCSAN that we know something is
> racy, we like it being racy, and that it's not a problem, so that it no
> longer shows up in the KCSAN output.

Maybe we should indeed have a macro to say "yes this access is racy but we
don't care". Currently this macro is atomic_{store,load}_relaxed()...
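
As a rough sketch of what such an annotation could look like, the macros below
wrap relaxed accesses under names that state the intent. RACY_LOAD, RACY_STORE,
struct runq_stats and the helpers are hypothetical names invented here, and the
definitions assume the GCC/Clang __atomic builtins (which accept plain,
non-_Atomic objects); this is not how the kernel's own relaxed macros are
defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical annotation macros: "this access is racy on purpose".
 * Assumes the GCC/Clang __atomic builtins are available.
 */
#define RACY_LOAD(p)		__atomic_load_n((p), __ATOMIC_RELAXED)
#define RACY_STORE(p, v)	__atomic_store_n((p), (v), __ATOMIC_RELAXED)

/* Hypothetical per-CPU statistic that other CPUs peek at without a lock. */
struct runq_stats {
	uint32_t count;
};

/* Owner CPU records a new value. */
static inline void
runq_note_count(struct runq_stats *s, uint32_t n)
{
	assert(s != NULL);
	RACY_STORE(&s->count, n);
}

/* Any CPU may read a possibly stale, but never torn, value. */
static inline uint32_t
runq_peek_count(struct runq_stats *s)
{
	assert(s != NULL);
	return RACY_LOAD(&s->count);
}
```

The point of a separate name would be purely documentary: the generated code
is the same as for a relaxed access, but a reader (or a tool) can tell that
the race is intentional.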



