Subject: Possible VM gremlin
To: None <>
From: Charles M. Hannum <>
List: tech-kern
Date: 03/27/1999 05:40:54
I have just discovered something so unbelievably twisted and gross
that I declare it as `must be fixed for the release'.  It is probably
the source of swap-related lossage on multiple platforms.

To wit:

Many pmaps implement pmap_pageable() as `if the request is in the
kernel pmap, is exactly one page, and is a managed page, delete its
modified bit'.
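
For readers who want that pattern spelled out, here is a minimal sketch in a toy model.  It is not code from any real pmap: `toy_page`, `toy_pmap`, `TOY_PAGE_SIZE`, `toy_pmap_pageable()` and `toy_needs_pageout()` are all hypothetical names invented for illustration, and the argument list is an assumption, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* hypothetical page size */

/* Toy stand-in for per-page state; not the real struct vm_page. */
struct toy_page {
	bool managed;	/* page is tracked by the VM system */
	bool modified;	/* contents differ from the backing store */
};

/* Toy stand-in for a pmap; only records whether it is the kernel's. */
struct toy_pmap {
	bool is_kernel;
};

/*
 * Sketch of the pattern described above: for a request in the kernel
 * pmap that covers exactly one page, and whose page is managed, throw
 * away the modified bit.
 */
static void
toy_pmap_pageable(struct toy_pmap *pmap, size_t start, size_t end,
    struct toy_page *pg)
{
	assert(end >= start);
	if (pmap->is_kernel && end - start == TOY_PAGE_SIZE &&
	    pg != NULL && pg->managed)
		pg->modified = false;
}

/* A pageout path that trusts the modified bit to decide on a write. */
static bool
toy_needs_pageout(const struct toy_page *pg)
{
	return pg->modified;
}
```

In this model, a dirty one-page kernel buffer passed through `toy_pmap_pageable()` comes out looking clean, so a pageout path that relies on the modified bit would skip writing it.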

It should be totally obvious why this is insane.  If you have a
buffer locked with uvm_vslock() or a PCB locked in core (and
NBPG!=USPACE), and it's unlocked via uvm_vsunlock() or uvm_swapout(),
you will lose modifications to that buffer or PCB.  This is a SERIOUS

Now, you might wonder why some pmaps do this.  I'll tell you, but you
should have a barf bag handy when you read the following.

The reason they do this is so that they can use uvm_map_pageable() to
increment and decrement wiring counts on page table pages, and have
pmap_pageable() automatically mark those pages as not needing a
pageout!  (This way the page can be left allocated and will be GCed
automatically if memory is tight.)

Yup, that's right...

One answer would be to push the calls to pmap_pageable() into
uvm_map_pageable(), but in UVM that can be called via another path.
And besides, leaving such a special case path in the VM system at all
is just tempting fate.  Someone will misuse it and lose.

It seems to me that a better solution is to just use uvm_pagealloc()
and uvm_pagefree() to allocate pages, and for lack of another place,
overload the wiring count in the vm_page to keep the reference count.
(Or possibly set PG_CLEAN where the wiring count drops to 0, rather
than freeing it.  I'm not sure this will allow it to actually be paged
out, though.)  And then nuke pmap_pageable() from orbit.

This will be faster than the current horrible kluge anyway, especially
when you can add or delete multiple pages at once, and it should have
the same desired effect, without burning down villages in its wake.
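
A rough sketch of the proposed scheme, in a userland toy model rather than kernel code, might look like the following.  Everything here is an assumption made for illustration: `toy_ptpage`, `toy_pagealloc()`, `toy_pagefree()`, `toy_ptp_ref()` and `toy_ptp_unref()` are hypothetical names, and calloc()/free() merely stand in for uvm_pagealloc() and uvm_pagefree(), whose real interfaces are not shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a page table page; not the real struct vm_page. */
struct toy_ptpage {
	unsigned wire_count;	/* overloaded as a count of live mappings */
};

/* Hypothetical stand-in for uvm_pagealloc(). */
static struct toy_ptpage *
toy_pagealloc(void)
{
	return calloc(1, sizeof(struct toy_ptpage));
}

/* Hypothetical stand-in for uvm_pagefree(). */
static void
toy_pagefree(struct toy_ptpage *pg)
{
	free(pg);
}

/*
 * Account for n new mappings in the page table page held in *slot,
 * allocating the page on first use.  Returns 0 on success, -1 if the
 * allocation fails.
 */
static int
toy_ptp_ref(struct toy_ptpage **slot, unsigned n)
{
	if (*slot == NULL) {
		*slot = toy_pagealloc();
		if (*slot == NULL)
			return -1;
	}
	(*slot)->wire_count += n;
	return 0;
}

/*
 * Drop n mappings.  When the count reaches zero the page is freed
 * outright; no modified bit is touched anywhere.
 */
static void
toy_ptp_unref(struct toy_ptpage **slot, unsigned n)
{
	assert(*slot != NULL && (*slot)->wire_count >= n);
	(*slot)->wire_count -= n;
	if ((*slot)->wire_count == 0) {
		toy_pagefree(*slot);
		*slot = NULL;
	}
}
```

Taking a count `n` lets a caller add or remove several mappings with one adjustment, which is the batching case mentioned above.  The PG_CLEAN variant would replace the free in `toy_ptp_unref()` with marking the page clean; it is left out because it is unclear whether such a page could then be paged out.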