Subject: Re: new mremap(2): relax alignment restrictions?
To: Antti Kantee <pooka@cs.hut.fi>
From: Eric Haszlakiewicz <erh@nimenees.com>
List: tech-kern
Date: 07/30/2007 11:46:19
On Mon, Jul 30, 2007 at 06:39:18PM +0300, Antti Kantee wrote:
> On Mon Jul 30 2007 at 10:18:19 -0500, Eric Haszlakiewicz wrote:
> > On Mon, Jul 30, 2007 at 11:14:33AM +0200, Joerg Sonnenberger wrote:
> > > On Tue, Jul 24, 2007 at 05:15:31PM -0500, Eric Haszlakiewicz wrote:
> > > >  if you map a 10 byte file (for example), the mmap man page says that
> > > > the mapped region may be extended up to the page size.
> > > >   I suppose there's a bit of a performance hit with zeroing out the page
> > > > every time, but it seems like a necessary thing to do.  A couple other
> > > > OSes I've tried it on do so.
> > > 
> > > This discussion is lame. There are four possible approaches to this
> > > problem:
> > > (1) Discontinue UCB. Unacceptable as it introduces far more issues than
> > > it is worth.
> > > (2) Zero the page at random times (e.g. scheduling). Doesn't fix the
> > > issue, just makes the window smaller.
> > > (3) Trap on write. Horrible performance, MP issues.
> > > (4) Disallow mmap for write on not page-sized files. Standard compliance
> > > issues.
> > 
> > 5) Zero the page at mmap time.  Fixes all cases except two concurrently
> > running processes reading and writing to the area beyond the end of the
> > file.  Some performance drop, but less than the other options.
> 
> Some?  If you zero the pages of e.g. /bin/ls every time you run it, I'd
> assume a little more than "some" performance drop.  And if you don't do
> it in the exec path, I would be very careful to make sure it can't be
> exploited to circumvent the zeroing protection.

jodi: ktrace ls > /dev/null
jodi: kdump | grep "CALL.*mmap" | wc -l
      11

On a recursive ls through /usr, I saw that 11 go up to 19.  Assuming i386
and 4096 byte pages, that's at most ~77k to zero (4095 * 19).  During the
same period, ls writes out ~1800k, so 77k is ~4.3%.
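
As a rough check of that arithmetic, here is a small sketch.  It assumes
the worst case of 4095 bytes zeroed per mmap call and treats "k" as 1000
bytes; the function and constant names are made up for illustration, and
the inputs are just the figures quoted above.

```python
PAGE_SIZE = 4096             # i386 page size, as assumed above
MMAP_CALLS = 19              # mmap calls seen during the recursive ls
OUTPUT_BYTES = 1800 * 1000   # ~1800k written by ls (assuming k = 1000 bytes)


def worst_case_zeroed_bytes(calls, page_size=PAGE_SIZE):
    """Upper bound: every mapping ends one byte into its last page."""
    return calls * (page_size - 1)


def overhead_ratio(zeroed, written):
    """Bytes zeroed relative to bytes the program wrote out."""
    return zeroed / written


zeroed = worst_case_zeroed_bytes(MMAP_CALLS)
print(zeroed)                                         # 77805
print(f"{overhead_ratio(zeroed, OUTPUT_BYTES):.1%}")  # a little over 4%
```

This is only an upper bound on bytes touched; it says nothing about how
the zeroing would actually cost compared to a write(2).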

That could have a significant impact, but to really know how much, I think 
we'd need to try it and see.

However, for things like executables and libraries that are mapped read-only,
we'd only have to zero the page if some other process had opened the file
in write mode since the last time we zeroed it, which would probably drop
the overhead for those to ~0.
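
To illustrate the bookkeeping I have in mind, here is a toy model.  It is
not kernel code and not how UVM works; the class and its method names are
hypothetical.  It just tracks one flag per file: whether the tail of the
last page has been zeroed since the file was last opened for writing.

```python
class FileTailState:
    """Toy per-file state: is the area past EOF known to be zero?"""

    def __init__(self):
        self.tail_clean = False   # nothing zeroed yet
        self.zero_count = 0       # how many times we had to zero

    def open_for_write(self):
        # A writer could scribble past EOF through a shared mapping,
        # so the tail can no longer be trusted.
        self.tail_clean = False

    def mmap_readonly(self):
        # Zero only if a writer may have touched the tail since last time.
        if not self.tail_clean:
            self.zero_count += 1
            self.tail_clean = True


ls_binary = FileTailState()
for _ in range(1000):             # 1000 execs of the same binary
    ls_binary.mmap_readonly()
print(ls_binary.zero_count)       # 1

ls_binary.open_for_write()        # e.g. the binary gets reinstalled
ls_binary.mmap_readonly()
print(ls_binary.zero_count)       # 2
```

In this model the cost is one zeroing per writer, not one per mmap, which
is why I'd expect the overhead for rarely-written files to be close to 0.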

eric