tech-kern archive


Re: vmem(9) (was Re: netbsd-6: pagedaemon freeze when low on memory)

Lars Heidieker writes:

> On 2013-03-19 02:11, Greg Troxel wrote:
>> How hard do you think it would be to make pool_drain() keep trying pools
>> until one succeeded in freeing something (or it tried them all)?  Do you
>> see any downside in that change?  It seems like it's just as well to
>> more aggressively try to free memory when we are out.  If a pool frees
>> something, it will stop, so it's only when pools do not give back any
>> space that it will take longer.   The round-robin nature seems to be
>> built in.  It seems perhaps tricky to retain the round-robin nature and
>> allow a full cycle; it's not obvious to me that just remembering the
>> current next pointer is ok, but I think it is.
> This is an option and looks better to me. A slight downside is that we
> will get more aggressive about pool draining in the normal case (a
> shortage of physical RAM rather than kva). We used to drain one pool in
> a round-robin way; with the change we drain until we actually free
> something, so we trade slightly higher CPU overhead for less unused
> memory when there are a lot of non-drainable pools around, which is
> likely in such situations.
> Well, one could see this as an advantage as well.
> (There is a timeout in the pool for empty items, so no "ping pong".)
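
The "keep trying pools until one frees something, or all have been tried" idea can be sketched in user-space C. This is only an illustration under assumptions: struct toy_pool, toy_pool_reclaim() and toy_pool_drain() are hypothetical stand-ins, not the kernel's struct pool or pool_drain(). It suggests that remembering the next position is enough to keep the round-robin order while still allowing one full cycle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool; not the kernel's struct pool. */
struct toy_pool {
	const char *name;
	int idle_pages;		/* pages this pool could give back */
};

/* Release a pool's idle pages; report whether anything was freed. */
static bool
toy_pool_reclaim(struct toy_pool *pp)
{
	if (pp->idle_pages == 0)
		return false;
	pp->idle_pages = 0;
	return true;
}

/* Round-robin cursor, remembered across calls. */
static size_t drain_next;

/*
 * Start at the cursor and stop at the first pool that frees something,
 * or after every pool has been tried once.  Returns the pool that was
 * drained, or NULL if a full cycle freed nothing.
 */
static struct toy_pool *
toy_pool_drain(struct toy_pool *pools, size_t npools)
{
	for (size_t tried = 0; tried < npools; tried++) {
		struct toy_pool *pp = &pools[drain_next];

		/* Advance first, so the next call resumes after this pool. */
		drain_next = (drain_next + 1) % npools;
		if (toy_pool_reclaim(pp))
			return pp;
	}
	return NULL;
}
```

Because the cursor advances on every attempt, a failed full cycle leaves it where it started, and a successful call leaves it just past the pool that was drained.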

This is perhaps overly complex, but it seems like all of the drain
routines need some sort of "how hard" parameter, because freeing cached
objects that haven't been accessed in 1000s doesn't really hurt, and as
you bring the stale lifetime down to 0s it begins to hurt more and turns
into thrashing.  So maybe that parameter really is in seconds.  This seems
consistent with the intent in the Bonwick paper, which does not explain
the strategy behind the back-end freeing mechanism (pool_drain, in our
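
A "how hard" parameter expressed in seconds could look roughly like the following user-space sketch. Everything here is an assumption for illustration: struct toy_item and toy_drain_stale() are hypothetical and do not correspond to any existing pool or pool cache interface. It also assumes that now is never earlier than an item's last_used time.

```c
#include <stddef.h>

/* Hypothetical cached object; only its last-use time matters here. */
struct toy_item {
	unsigned last_used;	/* time of last access, in seconds */
	int cached;		/* nonzero while still held in the cache */
};

/*
 * Free every cached item that has been idle for at least max_idle
 * seconds.  A large max_idle only touches long-stale objects; a
 * max_idle of 0 frees everything.  Returns the number of items freed.
 */
static size_t
toy_drain_stale(struct toy_item *items, size_t n, unsigned now,
    unsigned max_idle)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (items[i].cached && now - items[i].last_used >= max_idle) {
			items[i].cached = 0;
			freed++;
		}
	}
	return freed;
}
```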

> The question that remains is whether we should separate those two cases;
> if not, your suggestion is just right.
> We could then simplify the waiting in uvm_pageout:
> remove the kmem_va_starved from the if around UVM_UNLOCK_AND_WAIT
> and check for uvm_km_va_starved once after wakeup and, if true, call (the
> changed) pool_drain.
> If we figure out we should separate those cases, I'll make a callback
> chain for the kva case and we switch pool_drain back to its old behaviour.

If we are short on pages but not kva, we may still want to drain pools
(pools are non-pageable, right?), to recover pages that are not really
in use, rather than only applying pressure to process pages.

If we are short on kva, but not pages, then it doesn't make sense to try
to free process pages by paging them out.

As I read the code, I think it gets this right.
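
Spelled out as a small decision function, the two cases above would be something like the sketch below. The names (struct toy_actions, toy_choose) are hypothetical and this is not the uvm_pageout code; it only restates the reasoning, and it assumes that a kva shortage should also trigger pool draining, as discussed earlier in the thread.

```c
#include <stdbool.h>

/* Which reclaim actions to take; hypothetical, for illustration only. */
struct toy_actions {
	bool drain_pools;	/* try to give back unused pool memory */
	bool page_out;		/* apply pressure to process pages */
};

static struct toy_actions
toy_choose(bool short_on_pages, bool short_on_kva)
{
	struct toy_actions a;

	/* Pools hold both pages and kva, so either shortage justifies a drain. */
	a.drain_pools = short_on_pages || short_on_kva;
	/* Paging out process pages does not help a kva-only shortage. */
	a.page_out = short_on_pages;
	return a;
}
```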

So the real issue may just be that vnodes don't get drained.

(Our system allocates a huge amount of kmem (100s of MB), not via a
pool, for an odd purpose.  It works fine until lots of vnodes are
created from the daily find.  But the pool drain attempts do not end up
freeing kva by freeing vnodes in the vnode pool, which it seems they
should.  Obviously we need to bump the size of the kmem arena, but there
are still lurking issues here.)

> In both cases we might want to include a check if the starvation
> persists during draining and continue if so.

So my not-well-considered idea is to have a parameter for draining that
starts out at 1024s, and as soon as we are low on pages or kva, drain
gets run on all pools with that threshold (and we run the clock hand
with that interval??) until things are ok.   If we get all the way
around, the threshold is dropped to 512s, and so on, until it's 1s and
finally 0s.  Somehow it needs to go back up as time passes without
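
The halving schedule (1024s, 512s, ..., 1s, 0s) can be sketched as follows. This is a user-space illustration under assumptions: toy_pass_fn, toy_escalate() and the example callback toy_pass_below() are hypothetical names, and each call to the callback stands for one full trip around all pools at the given threshold. Raising the threshold again over time is not covered.

```c
#include <stdbool.h>

/* Starting threshold from the message above: 1024 seconds. */
#define TOY_IDLE_START 1024u

/*
 * One full pass over all pools at the given idle threshold.
 * Returns true once the shortage is over.
 */
typedef bool (*toy_pass_fn)(unsigned max_idle, void *arg);

/*
 * Run passes at 1024, 512, ..., 2, 1, 0 seconds until one of them
 * relieves the shortage.  *final is set to the last threshold tried.
 * Returns false if even the 0s pass did not help.
 */
static bool
toy_escalate(toy_pass_fn pass, void *arg, unsigned *final)
{
	unsigned t = TOY_IDLE_START;

	for (;;) {
		*final = t;
		if (pass(t, arg))
			return true;
		if (t == 0)
			return false;
		t /= 2;		/* 1 / 2 == 0, so the last step is 0s */
	}
}

/* Example callback: pretend the shortage ends once max_idle < *limit. */
static bool
toy_pass_below(unsigned max_idle, void *arg)
{
	return max_idle < *(unsigned *)arg;
}
```

With toy_pass_below and a limit of 100, the passes run at 1024, 512, 256 and 128 before stopping at 64.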

>> Do you think that the patch's change to sleep after a failed drain will
>> cause a system to behave particularly badly?  It's not clear to me how
>> the pagedaemon gets woken up, and if it requires a new lwp to be wanting
>> memory.
> The problem is we can't progress without freeing something; not
> busy-looping might give something else the chance to free memory, so it
> is slightly better, I guess.

The pagedaemon can't make progress, but it seems better to let the rest
of the system perhaps run code that might free things, rather than not
letting it.

I'm trying to avoid turning a livelock fix into a requirement for
massive redesign.   I'm not opposed to you doing a big change, but if
that isn't near term, I still think we should address this livelock.

So I wonder if a modified version that puts the pagedaemon to sleep for
1s when it's starved and not making progress is a good compromise.  That
way it keeps working, but lets other code run.  Or maybe it should just
be a panic.
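
The compromise can be written as a per-iteration decision. This is a sketch with hypothetical names (enum toy_step, toy_pagedaemon_step), not the patch under discussion; it only shows when the 1s sleep would apply.

```c
#include <stdbool.h>

/* What the daemon does next; hypothetical, for illustration only. */
enum toy_step {
	TOY_WAIT_FOR_WAKEUP,	/* no shortage: block until woken */
	TOY_CONTINUE,		/* starved but making progress: keep going */
	TOY_SLEEP_1S		/* starved, nothing freed: back off for 1s */
};

static enum toy_step
toy_pagedaemon_step(bool starved, bool freed_something)
{
	if (!starved)
		return TOY_WAIT_FOR_WAKEUP;
	if (freed_something)
		return TOY_CONTINUE;
	/* Let other code run; it may free what the daemon cannot. */
	return TOY_SLEEP_1S;
}
```

In the kernel the back-off would presumably be a timed sleep of about hz ticks, for example via kpause(9), rather than a busy loop.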

