Port-arm archive


Re: aarch64 performance tweaks

>I made some more changes to reduce system time on aarch64 during compile
>jobs.  This is as far as I want to go here, I'm finished.  Review would be
>Time before & after for an MKCTF=no kernel build on an RK3399:
>	643.40 real 3140.42 user 532.59 sys
>	632.24 real 3159.67 user 455.31 sys


Just a quick note:

>- Remove memory barriers from the atomic ops.  I don't understand why those
>  are there.  Is it some architectural thing, or for a CPU bug, or just
>  over-caution maybe?  They're not needed for correctness.

Probably just over-caution.
Some code may still rely on the atomic ops providing an implicit memory
barrier, but I agree with removing the barriers from atomic_ops.
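
To illustrate the concern about callers that expect an implicit barrier,
here is a minimal sketch in portable C11 (not the NetBSD atomic_ops code).
The names refcnt, payload, ready, ref_acquire(), publish() and consume()
are hypothetical. The atomic op itself is relaxed, and a caller that needs
ordering asks for it with an explicit fence:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state, for illustration only. */
static _Atomic uint32_t refcnt = 1;
static uint32_t payload;
static _Atomic uint32_t ready;

/* Barrier-free atomic op: guarantees atomicity, implies no ordering. */
static uint32_t
ref_acquire(void)
{
	return atomic_fetch_add_explicit(&refcnt, 1, memory_order_relaxed) + 1;
}

/*
 * Publisher: the release fence orders the payload store before the
 * flag store, instead of relying on a barrier hidden in the atomic op.
 */
static void
publish(uint32_t v)
{
	payload = v;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/*
 * Consumer: the acquire fence orders the flag load before the payload
 * load.  Returns 1 and fills *out once the payload has been published.
 */
static int
consume(uint32_t *out)
{
	if (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
		return 0;
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return 1;
}
```

Code written like publish()/consume() keeps working once the barriers are
removed from the atomic ops; code that silently depended on them would need
an audit.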

>- Assembly language stubs for mutex_enter() and mutex_exit().

BTW, the "Load-Acquire Exclusive, Store-Release Exclusive and barriers"
section of the ARM ARM has some lock examples. They seem to recommend
adding a "prfm ...", so you can use that as a reference.
(and https://github.com/torvalds/linux/commit/0ea366f5e1b6413a6095dce60ea49ae51e468b61 )
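
A rough sketch of the prefetch idea in portable C, not the actual
mutex_enter()/mutex_exit() stubs. The sketch_lock_* names are hypothetical,
and it assumes GCC or Clang, where __builtin_prefetch(p, 1, 3) is a write
prefetch hint that on AArch64 is typically emitted as a "prfm" instruction:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type, for illustration only. */
typedef struct {
	atomic_uint locked;	/* 0 = free, 1 = held */
} sketch_lock_t;

/* Try once to take the lock; returns true on success. */
static bool
sketch_lock_try(sketch_lock_t *l)
{
	unsigned expected = 0;

	/* Hint that the lock word is about to be written. */
	__builtin_prefetch((const void *)&l->locked, 1, 3);

	/* Acquire ordering on success, none needed on failure. */
	return atomic_compare_exchange_strong_explicit(&l->locked,
	    &expected, 1, memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is taken. */
static void
sketch_lock_enter(sketch_lock_t *l)
{
	while (!sketch_lock_try(l))
		continue;
}

/* Release the lock with release ordering. */
static void
sketch_lock_exit(sketch_lock_t *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

In a hand-written assembly stub the same hint would sit just before the
exclusive load/store loop; whether it helps on a given core is something
to measure.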

ryo shimizu
