Subject: Re: Swapping and diskless systems
To: None <tech-kern@netbsd.org>
From: Christos Zoulas <christos@zoulas.com>
List: tech-kern
Date: 09/08/1998 11:29:49
In article <199809080940.FAA01854@lunacity.ne.mediaone.net> mycroft@mit.edu (Charles M. Hannum) writes:
>
>So I noticed a serious -- and somewhat funny -- problem with nfsiod
>and diskless systems yesterday.  We allow the u-area of the nfsiod
>process to be swapped out.
>
>This is a problem for two reasons:
>
>1) In the best case, it means that in a low memory situation NFS
>   performance will be severely hosed as we spend a bunch of time
>   paging in and out u-areas to free up memory.
>
>2) In the worst case, it will result in a deadlock.  (Can you say
>   `swapping out the swapper'?)
>
>Oops.
>
>The obvious thing to do is to set the P_SYSTEM flag while inside
>nfssvc(), to prevent the process from ever being swapped.  However,
>this has some implications:
>
>a) P_SYSTEM currently prevents signals from being delivered to a
>   process, which would prevent nfsiod from ever being killed.  I
>   suggest removing these semantics, and instead having the pagedaemon
>   and swapper processes set their signal masks to block all signals.
>
>b) At least one port does some optimizations when switching into a
>   P_SYSTEM process, to avoid the overhead of switching all the user
>   state when switching kernel `threads'.  This means that when we
>   turn *off* P_SYSTEM (in preparation for exiting nfssvc()), we need
>   a macro/function to sync the user state.
>
>c) This would also prevent the user page table pages from ever being
>   destroyed.  This is worsened by the fact that nfsiod is now
>   dynamically linked, and therefore uses more page table space by
>   default.  Perhaps we can have some way of destroying the page table
>   without swapping the u-area out.
>
>Any comments on this?  I'd like to implement this (probably without
>the refinements in part c) RFSN.


Why not create a new P_NOSWAP bit that has exactly the right semantics?
Maybe even set things up so that the sticky bit on an executable tells
the kernel to set P_NOSWAP for that process.
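
To make the distinction concrete, here is a rough user-space sketch of how
a separate P_NOSWAP bit could differ from P_SYSTEM. Everything in it is an
assumption for illustration: the flag values, the proc_model structure and
the helper names are made up and are not the kernel's definitions.

```c
#define _XOPEN_SOURCE 700   /* ask for S_ISVTX on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Flag values are invented for this sketch; they are not the kernel's. */
#define P_SYSTEM 0x01   /* system process: no swapping, no signals */
#define P_NOSWAP 0x02   /* hypothetical: no swapping, signals still work */

/* Toy stand-in for struct proc; only the flag word matters here. */
struct proc_model {
	int p_flag;
};

/* The swapper would skip a process pinned by either flag. */
static int
swappable(const struct proc_model *p)
{
	assert(p != NULL);
	return (p->p_flag & (P_SYSTEM | P_NOSWAP)) == 0;
}

/* Only P_SYSTEM blocks signal delivery; P_NOSWAP leaves it alone. */
static int
signalable(const struct proc_model *p)
{
	assert(p != NULL);
	return (p->p_flag & P_SYSTEM) == 0;
}

/* Assumed exec-time policy: sticky bit on an executable => P_NOSWAP. */
static int
flags_from_mode(mode_t mode)
{
	int is_exec = (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;

	return (is_exec && (mode & S_ISVTX) != 0) ? P_NOSWAP : 0;
}
```

With this model, an nfsiod binary installed with mode 1755 would get
P_NOSWAP at exec time, stay resident, and still be killable, without
touching what P_SYSTEM means for the pagedaemon and the swapper.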

Since we are talking about diskless systems, why don't we also add the
SunOS/Solaris semantics for the sticky bit on non-executable files, which
means no mtime accounting and no buffer caching of those files' data?
This should definitely help diskless performance...
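
A sketch of the check I have in mind, again as a user-space model and not
real kernel code. The function name is hypothetical, and restricting the
test to regular files is my assumption (the sticky bit already means
something else on directories).

```c
#define _XOPEN_SOURCE 700   /* ask for S_ISVTX / S_IFREG on strict compilers */
#include <sys/stat.h>

/*
 * Hypothetical predicate: returns nonzero when a file should skip
 * mtime updates and buffer caching under the proposed semantics,
 * i.e. a regular file with the sticky bit set and no execute bits.
 */
static int
sticky_data_file(mode_t mode)
{
	int is_exec = (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;

	return S_ISREG(mode) && (mode & S_ISVTX) != 0 && !is_exec;
}
```

A diskless client's swap file created with mode 1600 would match, while a
sticky executable or a sticky directory such as /tmp would not.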

christos