tech-kern archive


Re: performance issues during build.sh -j 40 kernel



Thanks for this analysis. I have three remarks:

On 09/09/2017 at 20:48, Mateusz Guzik wrote:
> [...]
> I installed the 7.1 release, downloaded recent git snapshot and built the
> trunk kernel while using config stolen from the release (had to edit out
> something about 3g modems to make it compile). I presume this is enough
> to not have debug of any sort enabled.

I'm not sure I understand: did you test a kernel from the netbsd-7.1 branch, or
from netbsd-current? You might want to test netbsd-current, since I know that
several performance-related improvements were made there.

> [...]
> Here it turned out to be harmful by inducing avoidable cacheline traffic.
>
> Look at nm kernel | sort -nk 1:
> ----------------
> ffffffff810b8fc0 B uvm_swap_data_lock
> ffffffff810b8fc8 B uvm_kentry_lock
> ffffffff810b8fd0 B uvm_fpageqlock
> ffffffff810b8fd8 B uvm_pageqlock
> ffffffff810b8fe0 B uvm_kernel_object
> ----------------
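
To make the point about the nm listing concrete, here is a minimal sketch that checks whether two symbol addresses start in the same cache line. It assumes a 64-byte line, which is typical for x86-64 but is not stated in the thread; the helper names (cache_line_of, line_offset, same_cache_line) are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (typical on x86-64, not stated in the thread). */
#define CACHE_LINE 64u

/* Index of the cache line that contains addr. */
static inline uint64_t
cache_line_of(uint64_t addr)
{
	return addr / CACHE_LINE;
}

/* Byte offset of addr within its cache line. */
static inline unsigned
line_offset(uint64_t addr)
{
	return (unsigned)(addr % CACHE_LINE);
}

/* True if both addresses fall into the same cache line. */
static inline bool
same_cache_line(uint64_t a, uint64_t b)
{
	return cache_line_of(a) == cache_line_of(b);
}
```

Under that assumption, 0xffffffff810b8fc0 is line-aligned, so uvm_swap_data_lock, uvm_kentry_lock, uvm_fpageqlock and uvm_pageqlock, plus the start of uvm_kernel_object, all land in one line. A write to any one of them invalidates that line on every other CPU that has it cached.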

I saw exactly this too a few months ago. In fact, there are a number of places
that cause heavy false sharing. A typical example is the
xpq_idx_array[MAXCPUS] array in Xen. I've fixed only a few of them, but it is
clear that they should all be taken care of.
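
The usual remedy for this kind of false sharing is to give each hot variable, or each per-CPU array slot, its own cache line. The following is a userland sketch in standard C11, not the actual kernel change: the 64-byte line size, the type names (padded_lock, percpu_idx), the array size and the idx_slot helper are all assumptions made for illustration.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; real code should use the platform's value. */
#define CACHE_LINE 64

/* Hypothetical stand-in for a pointer-sized global lock, one per cache line. */
struct padded_lock {
	alignas(CACHE_LINE) uintptr_t lock;
};

/* Hypothetical per-CPU index; alignment pads each element to a full line. */
struct percpu_idx {
	alignas(CACHE_LINE) int idx;
};

/* Illustrative CPU count, not the kernel's MAXCPUS. */
#define NCPU_SKETCH 4

static struct percpu_idx idx_array[NCPU_SKETCH];

/* Each CPU touches only its own slot, so slots never share a line. */
static inline int *
idx_slot(unsigned cpu)
{
	return &idx_array[cpu % NCPU_SKETCH].idx;
}

/* Compile-time checks that the padding took effect. */
_Static_assert(sizeof(struct padded_lock) == CACHE_LINE, "lock not padded");
_Static_assert(sizeof(struct percpu_idx) == CACHE_LINE, "slot not padded");
```

The cost is memory: a plain int array packs 16 entries into one 64-byte line, while the padded version uses one line per entry. That is usually acceptable for a handful of contended locks or a per-CPU array.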

> [...]
> 3. pmap
>
> It seems most issues stem from slow pmap handling. Chances are there are
> perfectly avoidable shootdowns and in fact cases where there is no need
> to alter KVA in the first place.

This seems rather surprising to me. I tried to reduce the number of shootdowns
some time ago, but they were already optimized, and my attempts just made them
slower to process. The only related thing I fixed was making sure that no
kernel page gets flushed under a local shootdown, but as far as I remember,
it didn't significantly improve performance (on somewhat old hardware, I must
admit).

I'll take care of some of the false sharing soon.

Maxime

