tech-kern archive
Solving the last piece of the uvm_pageqlock problem
This is a diff against a tree containing the allocator patch I posted
previously:
http://www.netbsd.org/~ad/2019/pdpol.diff
The idea here is to buffer updates to the page's status (active, inactive,
dequeued) and then sync those to the pdpolicy / pagedaemon queues regularly,
a bit like the way the file system syncer works. Notes:
- Since uvm_pageqlock was replaced with pg->interlock and a private lock for
the pdpolicy code, a page can occasionally appear on a pdpolicy queue when
it shouldn't be considered for pageout and reclaim (if the pagedaemon and
the object owner race). That's not a problem: the pagedaemon can take
pg->interlock, determine that the page is wired or in a state of flux, and
ignore it, because it'll be gone from the queues soon.
- This patch takes it a little further. The pdpolicy code gets a dedicated
TAILQ_ENTRY in struct vm_page so it doesn't need to share with the page
allocator. A page can be PG_FREE and still on a pdpolicy queue (but not
for long). We record the intended state of the page (active, inactive,
dequeued) in pg->pqflags using atomics, and then queue the page in a
per-CPU buffer. At some point in the near future the buffer is purged and
the status updates are made real in the pdpol code's global state.
- The pagedaemon can also see those updates in real time by inspecting
pg->pqflags, and can make the page's status real itself. So basically what
I'm doing is batching the updates, trying not to let the global state fall
too far behind, and always giving the pagedaemon enough information to know
the true picture for individual pages when it does its laborious scan of
the queues, even if viewed globally the queues are a little bit behind.
This seems to work really well, I think because a page can have multiple
state transitions while it's in a queue waiting for its intended status
change to be purged and made global.
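
As a rough illustration of the buffering scheme described above, here is a
minimal user-space sketch in C. It is not taken from pdpol.diff: the names
(PQ_INTENT_*, struct cpu_buf, intent_set, buf_flush, intent_realize), the
buffer size and the counter standing in for the pdpol lock are all
assumptions made for the example. It is single-threaded, so it only shows
how repeated intent changes on one page collapse into a single update under
the lock, not the real locking or per-CPU handling.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag layout; the real pqflags bits may differ. */
enum {
	PQ_INTENT_ACTIVE   = 1,
	PQ_INTENT_INACTIVE = 2,
	PQ_INTENT_DEQUEUE  = 3,
	PQ_INTENT_MASK     = 3,
	PQ_INTENT_QUEUED   = 4	/* page already sits in a buffer */
};

enum queue { Q_NONE, Q_ACTIVE, Q_INACTIVE };

struct page {
	atomic_uint pqflags;	/* intended state, updated without the lock */
	enum queue queue;	/* "real" state, changed only under the lock */
};

#define BUF_SIZE 8		/* arbitrary size for the sketch */

struct cpu_buf {
	struct page *pages[BUF_SIZE];
	size_t n;
};

/* Stand-in for the global pdpol lock: counts how often it would be taken. */
static unsigned lock_acquisitions;

/* Make one page's pending intent real. Caller "holds" the lock. */
static void
intent_realize(struct page *pg)
{
	unsigned f = atomic_fetch_and(&pg->pqflags,
	    ~(unsigned)(PQ_INTENT_MASK | PQ_INTENT_QUEUED));

	switch (f & PQ_INTENT_MASK) {
	case PQ_INTENT_ACTIVE:   pg->queue = Q_ACTIVE;   break;
	case PQ_INTENT_INACTIVE: pg->queue = Q_INACTIVE; break;
	case PQ_INTENT_DEQUEUE:  pg->queue = Q_NONE;     break;
	default:                 break;	/* nothing pending */
	}
}

/* Purge the buffer: one lock acquisition covers the whole batch. */
static void
buf_flush(struct cpu_buf *b)
{
	lock_acquisitions++;
	for (size_t i = 0; i < b->n; i++)
		intent_realize(b->pages[i]);
	b->n = 0;
}

/* Record the intended state; buffer the page only if not already queued. */
static void
intent_set(struct cpu_buf *b, struct page *pg, unsigned intent)
{
	unsigned old = atomic_load(&pg->pqflags), nflags;

	do {
		nflags = (old & ~(unsigned)PQ_INTENT_MASK) | intent |
		    PQ_INTENT_QUEUED;
	} while (!atomic_compare_exchange_weak(&pg->pqflags, &old, nflags));

	if (old & PQ_INTENT_QUEUED)
		return;		/* later intent overwrites the earlier one */
	if (b->n == BUF_SIZE)
		buf_flush(b);
	assert(b->n < BUF_SIZE);
	b->pages[b->n++] = pg;
}
```

In this sketch, calling intent_set() three times on the same page before a
flush leaves one buffer entry and costs one lock acquisition when buf_flush()
runs, with only the last intent applied.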
Shortly before composing this e-mail it occurred to me that FreeBSD may do
something similar but to be honest I didn't dig into their code.
I need to tweak this to allocate a smaller buffer for uniprocessor systems,
and maybe consider using prefetch instructions when purging. I also want to
re-run the tests because I changed a couple of things, but I'm basically
happy with it.
Results on my kernel build test:
  72.66 real  1653.86 user  593.19 sys   new allocator
  71.26 real  1671.13 user  502.94 sys   new allocator + pdpol.diff
Lock contention before and after:
Before:
Total%    Count   Time/ms Lock                   Caller
------ -------- --------- ---------------------- ------------------------------
 28.86 44056935  77553.77 pdpol_state            <all>
 15.62 22177251  41978.93 pdpol_state            uvmpdpol_pageactivate+36
 13.12 21656129  35251.99 pdpol_state            uvmpdpol_pagedequeue+18
  0.12   223482    322.77 pdpol_state            uvmpdpol_pagedeactivate+18
  0.00       73      0.07 pdpol_state            uvmpdpol_pageenqueue+18
After:
Total%    Count   Time/ms Lock                   Caller
------ -------- --------- ---------------------- ------------------------------
  0.23    11301    362.35 pdpol_state            uvmpdpol_pageintent_set+b9
Andrew