tech-kern archive


Re: performance issues during -j 40 kernel

On Sat, Sep 09, 2017 at 08:48:19PM +0200, Mateusz Guzik wrote:
> 1) #define UBC_NWINS 1024

Yes, this one should scale automatically. It needs a bit of thought about
what a good scaling would be.

> 2. uvm_pageidlezero

I disagree on this, a lot. At best it is a band-aid unless the
uvm_f?pageqlock handling is fixed. Note that unlike FreeBSD, this has
been using non-temporal stores forever, so it has very little
additional cacheline traffic beyond the free queue interaction. While it
doesn't help on a completely busy system, it does provide value for any
system that is even occasionally idle.
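
To make the non-temporal point concrete, here is a minimal userland sketch
of zeroing a page with streaming stores. It is not the NetBSD code: the
function name, the 4096-byte page size and the SSE2 intrinsics are
assumptions for illustration only. Streaming stores write around the cache,
so the zeroing itself does not evict useful lines.

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define SKETCH_PAGE_SIZE 4096	/* assumed page size */

/*
 * Zero one page. "page" must be 16-byte aligned.
 * With SSE2 this uses non-temporal (streaming) stores; otherwise it
 * falls back to a plain memset(), which goes through the cache.
 */
static void
sketch_zero_page_nt(void *page)
{
#if defined(__SSE2__)
	__m128i zero = _mm_setzero_si128();
	__m128i *p = page;
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(__m128i); i++)
		_mm_stream_si128(&p[i], zero);
	/* Make the streaming stores globally visible before returning. */
	_mm_sfence();
#else
	memset(page, 0, SKETCH_PAGE_SIZE);
#endif
}
```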

> ----------------
> ffffffff810b8fc0 B uvm_swap_data_lock
> ffffffff810b8fc8 B uvm_kentry_lock
> ffffffff810b8fd0 B uvm_fpageqlock
> ffffffff810b8fd8 B uvm_pageqlock
> ffffffff810b8fe0 B uvm_kernel_object
> ----------------
> All these locks false-share a cacheline. In particular uvm_fpageqlock is
> obstructing uvm_pageqlock.

That's true, but changing this also has quite a significant downside on
some workloads due to second-order effects. I don't think it is a good
idea to change this right now, as it doesn't even fix the real problem.
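
For reference, the false sharing in the quoted symbol dump can be checked
with simple arithmetic, and the proposed padding looks roughly like the
sketch below. A 64-byte cache line is assumed, and every name here is a
hypothetical stand-in rather than the actual UVM declarations. The sketch
only shows what the alignment does; it says nothing about whether it is
worth the cost.

```c
#include <stdalign.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64	/* assumed cache line size */

/* Index of the cache line that contains the given address. */
static uint64_t
sketch_line_of(uint64_t addr)
{
	return addr / SKETCH_CACHE_LINE;
}

/* Stand-in for a small lock object; not the real kmutex_t. */
struct sketch_lock {
	volatile unsigned int word;
};

/* Two locks packed next to each other, as in the symbol dump. */
struct sketch_packed {
	struct sketch_lock fpageq;
	struct sketch_lock pageq;
};

/* The same lock, forced onto its own cache line. */
struct sketch_padded_lock {
	alignas(SKETCH_CACHE_LINE) struct sketch_lock lock;
};

struct sketch_split {
	struct sketch_padded_lock fpageq;
	struct sketch_padded_lock pageq;
};

/* True if both pointers fall into the same cache line. */
static int
sketch_same_line(const void *a, const void *b)
{
	return sketch_line_of((uint64_t)(uintptr_t)a) ==
	    sketch_line_of((uint64_t)(uintptr_t)b);
}
```

The five addresses in the dump run from 0x...8fc0 to 0x...8fe0, so
sketch_line_of() returns the same index for all of them.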

> Doing #if 0'ing the uvm_pageidlezero call in the idle func shaved about 2
> seconds real time:
>   589.02s user 792.62s system 2541% cpu 54.365 total

There is a sysctl for it, you know?

> Followed the issue noted earlier I __cacheline_aligned aforementioned
> locks. But also moved atomically updated counters out of uvmexp.

Actually, most of them should be switched to localcounter.
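
A rough sketch of what that could mean, assuming "localcounter" refers to
per-CPU counters: each CPU increments its own cache-line-sized slot without
atomics, and a reader sums the slots when it needs a total. The fixed CPU
count, the explicit cpu index and all names are hypothetical; a kernel
would use its own per-CPU storage and keep the caller from migrating
during the update.

```c
#include <stdalign.h>
#include <stdint.h>

#define SKETCH_NCPU		4	/* assumed fixed CPU count */
#define SKETCH_CACHE_LINE	64	/* assumed cache line size */

/* One slot per CPU, padded so that slots never share a cache line. */
struct sketch_cpu_slot {
	alignas(SKETCH_CACHE_LINE) uint64_t value;
};

struct sketch_counter {
	struct sketch_cpu_slot slot[SKETCH_NCPU];
};

/* Called on CPU "cpu": plain increment of the local slot, no atomics. */
static void
sketch_counter_inc(struct sketch_counter *c, unsigned int cpu)
{
	c->slot[cpu].value++;
}

/*
 * Sum all slots. The result may be slightly stale if other CPUs are
 * still counting, which is usually fine for statistics.
 */
static uint64_t
sketch_counter_read(const struct sketch_counter *c)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < SKETCH_NCPU; i++)
		sum += c->slot[i].value;
	return sum;
}
```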

