Subject: Re: kern/36169: 1sec+ delays using msync(2) with flags MS_ASYNC | MS_INVALIDATE
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Chuck Silvers <chuq@chuq.com>
List: netbsd-bugs
Date: 09/05/2007 16:05:06
The following reply was made to PR kern/36169; it has been noted by GNATS.

From: Chuck Silvers <chuq@chuq.com>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, bsd@cs.ubc.ca
Subject: Re: kern/36169: 1sec+ delays using msync(2) with flags MS_ASYNC | MS_INVALIDATE
Date: Wed, 5 Sep 2007 08:40:32 -0700

 On Wed, Aug 29, 2007 at 05:10:12PM +0000, Brian de Alwis wrote:
 >  I've looked into this some more, and it appears the major slowdowns
 >  I reported previously result from the use of MS_INVALIDATE.
 >  
 >  crm uses mmap to map in two 12M files and does random seeking within
 >  them (the files are hash tables).  As there can be a possibility
 >  that these files are used by multiple processes, they msync() the
 >  file with MS_INVALIDATE.
 > 
 >  However in practice, there is only one process mmaping a file at
 >  a time.  But using MS_INVALIDATE on NetBSD seems to wipe out any
 >  caching of the file, regardless of whether it's actually necessary.
 
 I looked into this a bit.  there are a couple goofy things going on.
 
 this application seems to have been developed on linux,
 and on linux msync() without MS_SYNC doesn't do anything at all,
 which is why this slowness isn't seen there.
 
 the application does msync() with MS_ASYNC|MS_INVALIDATE, and that
 isn't useful for cross-host synchronization since it's asynchronous.
 there's some other code in various places that I would guess has
 some of the effect that the msync() was trying to achieve:
 
   //    Because mmap/munmap doesn't set atime, nor set the "modified"
   //    flag, some network filesystems will fail to mark the file as
   //    modified and so their cacheing will make a mistake.
   //
   //    The fix is to do a trivial read/write on the .css file, to force
   //    the filesystem to repropagate its caches.
 
 
 but since linux ignores the msync() calls that this application makes,
 I imagine that this application running on linux will still corrupt its
 files if it accesses them via NFS, since the actual changes it makes
 probably won't be pushed back to the NFS server until some time later.
 perhaps some other operation (such as closing the file) is implicitly
 pushing the changes to the server, but at any rate, the msync()
 isn't really doing anything towards maintaining cache coherency.
 
 so it seems to me that the best way to fix the slowness here is
 to change the msync() calls to use only MS_SYNC, or to remove them
 entirely if you don't care about accessing the data files via NFS.
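 
 as a rough illustration of the first option, here is a minimal sketch
 of a store through a shared mapping followed by msync() with MS_SYNC
 only.  the helper name update_byte_sync() and its parameters are made
 up for this example and are not taken from the crm sources; the real
 code would flush whatever range it actually modified.

```c
/* ask for the POSIX.1-2008 interfaces (mmap, msync) explicitly */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * hypothetical helper, not from crm: store one byte into a MAP_SHARED
 * mapping of the first "len" bytes of "path" and push the change to
 * the backing store.  returns 0 on success, -1 on failure.
 */
int
update_byte_sync(const char *path, size_t len, size_t off, unsigned char val)
{
	unsigned char *p;
	int fd, rv;

	assert(off < len);

	fd = open(path, O_RDWR);
	if (fd == -1)
		return -1;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -1;

	p[off] = val;

	/*
	 * MS_SYNC alone: block until the dirty pages have been written.
	 * no MS_INVALIDATE, so the cached pages are left in place.
	 */
	rv = msync(p, len, MS_SYNC);

	munmap(p, len);
	return rv;
}
```

 the sketch maps and unmaps the file on every call only to keep the
 example short; an application that keeps its hash tables mapped would
 just issue the msync() call on the existing mapping.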
 
 
 I don't think it would be a good idea to change the netbsd behaviour
 of msync() with MS_INVALIDATE.  one can never be certain of what the
 backing storage actually contains, so invalidating all pages in the range,
 dirty or not, leads to more consistent behaviour, and it's what other
 operating systems do (ones that do anything at all, that is).
 
 your proposed change (using PGO_DEACTIVATE instead of PGO_FREE) would
 never invalidate any cached data, only cause cached pages to be reused
 sooner by the VM system, which isn't what's needed.
 
 -Chuck