Subject: Re: NetBSD1.6 UVM problem?
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Oleg Polyanski <Oleg.Polianski@team.telstraclear.co.nz>
List: tech-kern
Date: 12/11/2002 00:19:12
>>>>> "der" == der Mouse writes:

 der> You call mlockall(), or if you can, mlock(), to lock all
 der> relevant pages into core.  Or else you impose resource limits
 der> low enough compared to the amount of swap you provide that you
 der> _can't_ run out.

 der> ...or else you risk, yes, as you say, getting a critical
 der> process nuked at an unpredictable time.

  `mlock' only gives you a way to find out whether there is still
  any physical memory available for allocation in the system.
  Moreover, not every VM use case requires locking all the pages
  in RAM. By using `mlock' you effectively prevent a process, or
  certain of its pages, from being paged out. If that process keeps
  growing naturally, rather than because of a memory leak buried
  somewhere deep inside, this approach is rather ineffective,
  especially when the process doesn't need all its pages permanently
  resident in RAM. Think of databases, for example. Yes, they do
  lock in some pages. No, they don't pin down every page.

 >> You end up with a killed process and perhaps lost data only
 >> because your kernel could not say anything about memory
 >> starvation and instead killed the first process it found, which
 >> happened to be the largest one.

 der> Right.  But what else is there to do?  I can see only about
 der> four reasonable things to do when you're out of RAM when
 der> servicing a page fault:

 der> 1) Deliver a signal to the process.
 der> 2) Kill the process.
 der> 3) Stall: make the process wait a little and hope the
 der>    situation eases.
 der> 4) "When in danger or in doubt / Run in circles, scream and
 der>    shout": stall the faulting process and kick someone else,
 der>    probably a task-manager of some kind (presumably small and
 der>    locked in core).

 der> For (1) to be useful, the process must have arranged to handle
 der> the signal without incurring further page faults, which is
 der> possible but nontrivial (and even more difficult to do without
 der> just locking everything in core, in which case you won't get
 der> the page faults anyway).  Most processes don't have much they
 der> can do to ease a serious RAM crunch to start with; to do it
 der> right you really want a malloc-alike that lets you specify
 der> different areas to allocate out of for different objects, so
 der> noncritical objects and critical objects can live in different
 der> pages.

  (2) is what we have now, and it's acceptable. (3) and (4) don't
  look feasible to me in most cases, while a sensible combination of
  (1) and (2) was described in my last email.

  (1): Locking the signal handler code shouldn't be a problem; it's
  relatively small when not abused. The real benefit of the AIX
  approach is that the SIGDANGER signal is serviced only by the
  software that really wants to be aware of VM starvation. For
  everybody else it's simply invisible, since it is ignored by
  default.

Oleg