Subject: Re: Page reactivation path in pdaemon -- time to remove it?
To: Jason Thorpe <thorpej@wasabisystems.com>
From: Daniel Carosone <dan@geek.com.au>
List: tech-kern
Date: 02/27/2004 09:39:48

On Thu, Feb 26, 2004 at 01:23:26PM -0800, Jason Thorpe wrote:
> This actually implements the
> two-handed-clock algorithm the way I think was intended:
>
> 	1. Active pages are scanned.  Pages that have not been referenced
> 	   since the last active scan are moved to the inactive queue.
>
> 	2. Inactive pages are scanned.  Pages that have been referenced
> 	   since being put on the inactive queue are moved back to the
> 	   active queue.  Otherwise, the page is cleaned / freed.
>
> The idea is that this finds your working set and keeps it in core.

Sound idea, and probably worth keeping.
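
For reference, here's a rough userland toy of the two-hand scan as I
read your description.  None of these names are real UVM symbols, and
the queues are faked with a flag on an array, but it shows the shape
of the policy:

#include <stdbool.h>
#include <stdio.h>

struct page {
	bool referenced;	/* stands in for the pmap referenced bit */
	bool active;		/* true: active queue, false: inactive */
};

/* Hand 1: clear reference bits, demoting any page not referenced
 * since the previous active scan. */
static void
scan_active(struct page *pgs, int n)
{
	for (int i = 0; i < n; i++) {
		if (!pgs[i].active)
			continue;
		if (pgs[i].referenced)
			pgs[i].referenced = false;	/* one more lap */
		else
			pgs[i].active = false;		/* to inactive */
	}
}

/* Hand 2: rescue inactive pages referenced since demotion (the
 * reactivation path); count the rest as cleaned / freed. */
static int
scan_inactive(struct page *pgs, int n)
{
	int freed = 0;

	for (int i = 0; i < n; i++) {
		if (pgs[i].active)
			continue;
		if (pgs[i].referenced) {
			pgs[i].active = true;
			pgs[i].referenced = false;
		} else
			freed++;
	}
	return (freed);
}

int
main(void)
{
	/* hot active, cold active, re-touched inactive, cold inactive */
	struct page pgs[] = {
		{ true, true }, { false, true },
		{ true, false }, { false, false },
	};

	scan_active(pgs, 4);
	printf("freed %d\n", scan_inactive(pgs, 4));	/* freed 2 */
	return (0);
}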

> Basically, if you have some VERY ACTIVE files that consume nearly
> all physical memory, those pages will always be reactivated.  [and
> thus not be considered for removal even if they're above the high
> water mark for that page type]

> Does anyone have any thoughts on how to fix this problem?  I'm inclined
> to kill the reactivation path completely, but I would prefer to
> reactivate an actually active page of a certain type rather than an
> inactive one.

If you kill the reactivation path completely, you may end up evicting
a should-be-active page instead of one that's really stale.  And
perhaps doing so again when you need to page it back in shortly
after.

But under your same busy-big-file scenario, why would pages for the
file ever make it to the inactive list via the first clock hand?  If
the file is really that busy, file pages will stay at (say) 80% of
memory even if the high water mark is 50%, because nothing ever
reaches the inactive list to be freed.

By killing the reactivation path, you might scavenge a few file pages
that make it to the inactive list, but it seems like fighting for
scraps with a glutton.  More than likely, the only pages to reach the
inactive list will be non-file pages, which you'll free in
desperation, decreasing your chances of ever getting the rampaging
file monster back down to size.

One modification to your proposal would be to skip the reactivation
path on the second sweep, but only when the page's type is over its
high-water threshold.  That way, anons or execs can still be
reactivated while file pages are forcefully freed.  That gives better
pressure towards coming back into balance, but you might still end up
throwing away precisely the wrong file pages - the ones about to be
hit again.
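
In terms of the toy above - adding an invented page-type enum and
per-type targets, none of it real UVM code - the second hand's
decision would become something like:

enum pgtype { PT_FILE, PT_ANON, PT_EXEC, PT_NTYPES };

/* invented per-type page counts and high-water marks */
static int npages_of[PT_NTYPES], high_water[PT_NTYPES];

static bool
over_high_water(enum pgtype t)
{
	return (npages_of[t] > high_water[t]);
}

/* Hand 2, modified: rescue a referenced page only if its type is
 * still under its high-water mark; over-target types get evicted
 * even when referenced.  Returns true if the page is to be freed. */
static bool
inactive_decision(struct page *pg, enum pgtype t)
{
	if (pg->referenced && !over_high_water(t)) {
		pg->active = true;	/* normal reactivation path */
		pg->referenced = false;
		return (false);
	}
	return (true);			/* clean / free regardless */
}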

Another idea would instead be to change the first hand - if you're
over threshold for a certain type, be more aggressive about tossing
pages of that type onto the inactive list, but let them still be
rescued from there if they're touched soon after.  Can we have a
different clock rate for each page type, so that (in this case) file
pages get checked for activity more frequently while they're over
threshold?  Or does the hand sweep around all pages and only find out
what type they are when it gets there?  In that case, perhaps
randomly throw some over-threshold pages to inactive even if they've
been touched, and hope they get rescued if we picked a bad one.
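
The first-hand variant might look like the following - the one-in-four
coin flip is just a placeholder for however aggressive "more
aggressive" turns out to need to be:

#include <stdlib.h>

/* Hand 1, modified: when a type is over target, occasionally demote
 * even a referenced page, clearing the bit so that only a fresh
 * touch lets the second hand rescue it. */
static void
active_decision(struct page *pg, enum pgtype t)
{
	if (!pg->referenced) {
		pg->active = false;		/* normal demotion */
	} else if (over_high_water(t) && (rand() % 4) == 0) {
		pg->active = false;		/* gamble: rescuable */
		pg->referenced = false;
	} else {
		pg->referenced = false;		/* normal: one more lap */
	}
}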

Or perhaps even some combination of these is warranted.

Either way, I think the hands need to learn about the per-type
thresholds and change behaviour accordingly. Do we want that knowledge
in two places?
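
If they do, the knob itself can at least live in one spot: both hands
call the same over_high_water() predicate from the sketches above, and
the targets get computed in a single function.  Something like (the
percentages are invented - think of the vm.*max-style sysctls):

/* All the per-type policy in one place; the hands only ever ask
 * over_high_water(). */
static void
recompute_targets(int npages_total)
{
	high_water[PT_FILE] = npages_total * 50 / 100;
	high_water[PT_ANON] = npages_total * 80 / 100;
	high_water[PT_EXEC] = npages_total * 30 / 100;
}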

--
Dan.