tech-kern archive


Re: parallel (free) pagequeue



Summarising our IRC discussion:

"someone" (aka joerg):
hashing on paddrs is probably subpar: it's as much work, but without
balancing.
the free pagequeue is a good first goal, and contention on it leads
to contention on the non-free pagequeue too.
it's frustrating that we have per-CPU and global free page queues, but
a single global lock. we lack the stealing mechanism that real per-CPU
queues would need.

current: global and per-CPU free lists. global lock.
idea #1:
small local unlocked array, large local locked array, no global.

when grabbing a free page:
try local unlocked
  -> if failed, refill from local locked, retry
     -> if failed, refill (both locals, unlocked and locked)
        from non-local locked, retry

when putting:
put in local unlocked
  -> if local unlocked is full, move some of our local unlocked pages
     to local locked first.


this may read a bit oddly, but it mirrors the code, which is currently
structured around 'retry'.
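
A rough user-space sketch of the idea #1 get/put paths, to make the
retry structure concrete. This is not UVM code. Everything named here
(struct cpu_pages, page_get, page_put, refill_from, the array sizes and
BATCH) is made up for illustration, pages are plain ints, a pthread
mutex stands in for the per-CPU lock, and page colors, CPU pinning and
freelists are ignored. One simplification: stealing refills only the
local unlocked array, not both locals, so two locks are never held at
once.

```c
#include <assert.h>
#include <pthread.h>

#define NCPU         2   /* hypothetical CPU count */
#define UNLOCKED_MAX 4   /* small local array, touched only by its owner */
#define LOCKED_MAX   64  /* large local array, protected by a lock */
#define BATCH        2   /* pages moved per refill/spill (needs benchmarking) */

struct cpu_pages {
	int unlocked[UNLOCKED_MAX];
	int n_unlocked;
	pthread_mutex_t lock;     /* protects locked[] and n_locked only */
	int locked[LOCKED_MAX];
	int n_locked;
};

static struct cpu_pages cpus[NCPU];

static void
pages_init(void)
{
	for (int i = 0; i < NCPU; i++) {
		cpus[i].n_unlocked = 0;
		cpus[i].n_locked = 0;
		pthread_mutex_init(&cpus[i].lock, NULL);
	}
}

/* Move up to BATCH pages from src's locked array into self's unlocked array. */
static int
refill_from(struct cpu_pages *self, struct cpu_pages *src)
{
	int moved = 0;

	pthread_mutex_lock(&src->lock);
	while (moved < BATCH && src->n_locked > 0 &&
	    self->n_unlocked < UNLOCKED_MAX) {
		self->unlocked[self->n_unlocked++] = src->locked[--src->n_locked];
		moved++;
	}
	pthread_mutex_unlock(&src->lock);
	return moved;
}

/* Get a free page for 'cpu'; returns -1 if nothing could be found. */
static int
page_get(int cpu)
{
	struct cpu_pages *self = &cpus[cpu];

	assert(cpu >= 0 && cpu < NCPU);
	for (;;) {
		/* Fast path: local unlocked array, no lock taken. */
		if (self->n_unlocked > 0)
			return self->unlocked[--self->n_unlocked];
		/* Refill from local locked, then retry. */
		if (refill_from(self, self) > 0)
			continue;
		/* Steal from the other CPUs' locked arrays, then retry. */
		int stolen = 0;
		for (int i = 1; i < NCPU && stolen == 0; i++)
			stolen = refill_from(self, &cpus[(cpu + i) % NCPU]);
		if (stolen == 0)
			return -1;
	}
}

/* Free a page on 'cpu'; returns -1 if both local arrays are full. */
static int
page_put(int cpu, int page)
{
	struct cpu_pages *self = &cpus[cpu];

	assert(cpu >= 0 && cpu < NCPU);
	if (self->n_unlocked == UNLOCKED_MAX) {
		/* Local unlocked is full: spill a batch to local locked. */
		pthread_mutex_lock(&self->lock);
		for (int i = 0; i < BATCH && self->n_locked < LOCKED_MAX; i++)
			self->locked[self->n_locked++] =
			    self->unlocked[--self->n_unlocked];
		pthread_mutex_unlock(&self->lock);
		if (self->n_unlocked == UNLOCKED_MAX)
			return -1;
	}
	self->unlocked[self->n_unlocked++] = page;
	return 0;
}
```

Note that in this sketch pages sitting in another CPU's unlocked array
cannot be stolen, so page_get can fail while free pages still exist
elsewhere. In a kernel, the caller would also have to stay on its CPU
for the whole operation, since the unlocked array has no other
protection.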

- need to continue handling page colors.
- need to pin to cpu to avoid being migrated halfway.
- need to benchmark to select values (maybe use high/low water marks;
  how many to steal / put).
- need to think more about VM_NFREELIST > 1 and uvm physseg.

idea #2: one local storage, xcall to steal pages (probably more
expensive but it came up).


Existing users of uvm_fpageqlock:
- uvm_pagealloc_strat: get
- uvm_pagefree: put

??
- uvm_pglistalloc_contig: get
- uvm_pglistalloc_simple: get
- uvm_pglistfree: put



Alternate users which must be adapted once it's not one lock:
- mtsleep for uvm_pagedaemon, pageout sleepers
- uvm_pageout_start: synchronizes access to uvmexp.paging
- uvm_pageidlezero: start with local (do others?)
- uvm_page_recolor: called from some MD code at CPU spinup to distribute
  different colors. switching this to a local lock looks almost trivial,
  though it touches the global list for some reason.


riastradh:
vmobjlock is heavily contended (for e.g. libc.so), and uvm_pagelookup
can be made into a pserialized radix tree.


background: this is a flame graph** taken during a very heavy build.sh,
with minimal other activity: http://coypu.sdf.org/buildsh.svg

** https://github.com/brendangregg/FlameGraph


