Port-amd64 archive


ACPI idle performance problem



ACPI idle is a choke point: with tprof(8) it can be observed hammering the
CPU and the L2/L3 caches.  It probably also delays TLB shootdown IPI
processing.  Below are some kernel compile times from a 2 x 12 core system
running a kernel from the ad-namecache branch, comparing MWAIT and ACPI idle:

-j24 (no HT)    MWAIT   83.57s real  1016.03s user   205.83s system
-j24 (no HT)    ACPI    88.64s real  1067.68s user   221.18s system

-j48            MWAIT   74.25s real  1582.42s user   367.20s system
-j48            ACPI    77.26s real  1564.38s user   368.41s system

To solve the problem, my initial thoughts are:

(1) For each CPU, make use of ACPI idle only if all CPUs in the same CPU
    core have been idle for N clock ticks; otherwise use MWAIT/HLT.

(2) Remove CPUs doing ACPI idle from participation in TLB shootdown IPIs,
    although that may impose its own cost because it means using directed
    IPIs instead of broadcast IPIs.  Needs to be tried.
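
To make the two ideas concrete, here is a rough userland sketch of the
decisions involved.  It is only an illustration: every name in it
(cpu_state, choose_idle, shootdown_targets, the idle_ticks counter) is
hypothetical and not an existing kernel interface, and it assumes at most
64 CPUs so that the IPI target set fits in one word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these are kernel interfaces. */
enum idle_method { IDLE_MWAIT_HLT, IDLE_ACPI };

struct cpu_state {
	int      core_id;     /* CPUs sharing a core have the same core_id */
	unsigned idle_ticks;  /* consecutive clock ticks spent idle, 0 if busy */
};

/*
 * Idea (1): use ACPI idle only if every CPU in the caller's core has
 * been idle for at least n clock ticks; otherwise stay on MWAIT/HLT.
 */
static enum idle_method
choose_idle(const struct cpu_state *cpus, size_t ncpu, size_t self, unsigned n)
{
	assert(self < ncpu);
	for (size_t i = 0; i < ncpu; i++) {
		if (cpus[i].core_id == cpus[self].core_id &&
		    cpus[i].idle_ticks < n)
			return IDLE_MWAIT_HLT;
	}
	return IDLE_ACPI;
}

/*
 * Idea (2): build the target set for a directed TLB shootdown IPI,
 * leaving out the sender and any CPU currently in ACPI idle.
 * Bit i set means "send the IPI to CPU i".
 */
static uint64_t
shootdown_targets(const bool *in_acpi_idle, size_t ncpu, size_t self)
{
	uint64_t mask = 0;

	assert(ncpu <= 64 && self < ncpu);
	for (size_t i = 0; i < ncpu; i++) {
		if (i != self && !in_acpi_idle[i])
			mask |= UINT64_C(1) << i;
	}
	return mask;
}
```

The sketch leaves out how idle_ticks would be maintained and what a CPU
skipped by (2) does on wakeup; presumably it would have to flush its TLB
before running anything else, which is part of the cost to measure.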

But I'm not really sure about this one.  Should the decision to use ACPI
idle be system-wide rather than a per-CPU/per-core decision?  Am I missing
some vital piece of information, and/or is there a better way to solve this?

Thanks,
Andrew

