Subject: Re: mmap (was Re: bin/10625: /usr/bin/cmp)
To: None <tech-userlevel@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: tech-userlevel
Date: 07/28/2000 17:13:22
[ On July 28, 2000 at 14:00:47 (-0700), Wolfgang Rupprecht wrote: ]
> Subject: Re: mmap (was Re: bin/10625: /usr/bin/cmp)
>
> My reason for using mmap is because I'm hacking (up to) 600Meg files.
> There is no fscking way I'm going to run all that crap through the
> buffer cache if I don't need to.

If you are not reading or writing a significant portion of your file in
the first place then you can avoid loading any significant part of your
file into the buffer cache through careful use of lseek().  The buffer
cache only holds blocks that have been read or written (and perhaps a
small amount of read-ahead in some implementations, though I think not
in NetBSD's current implementation).
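
A minimal sketch of what I mean by careful use of lseek(), assuming a
POSIX-ish system.  The helper names patch_at() and read_at() are made up
for illustration, and the code only shows the seek-then-touch pattern; it
does not itself demonstrate what the kernel chooses to cache.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite len bytes at offset off in an existing
 * file, touching only that range.  Returns 0 on success, -1 on error. */
int patch_at(const char *path, off_t off, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* Jump straight to the region of interest; nothing before it is read. */
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) {           /* treat a short-circuit or error as failure */
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return close(fd);
}

/* Hypothetical helper: read up to len bytes starting at offset off.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t off, void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Something like patch_at("big.dat", 500000000, "x", 1) would then write
one byte without the program ever asking for the rest of the file.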

Don't forget about the utility of readv() and writev() either (if your
code is restricted to *BSD-alikes).
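
For anyone who hasn't used them, here is a rough sketch of the
scatter/gather idea: several separate buffers go out (or come in) with a
single system call, with no intermediate copy into one big buffer.  The
helpers write_record() and read_record() and the header/body split are
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write a header and a body with one writev() call.
 * Returns the total number of bytes written, or -1 on error. */
ssize_t write_record(int fd, const void *hdr, size_t hlen,
                     const void *body, size_t blen)
{
    assert(hdr != NULL && body != NULL);

    struct iovec iov[2];
    iov[0].iov_base = (void *)hdr;   /* writev() does not modify the data */
    iov[0].iov_len  = hlen;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = blen;
    return writev(fd, iov, 2);
}

/* Hypothetical helper: read a header and a body into two separate
 * buffers with one readv() call.  Returns bytes read, or -1 on error. */
ssize_t read_record(int fd, void *hdr, size_t hlen,
                    void *body, size_t blen)
{
    assert(hdr != NULL && body != NULL);

    struct iovec iov[2];
    iov[0].iov_base = hdr;
    iov[0].iov_len  = hlen;
    iov[1].iov_base = body;
    iov[1].iov_len  = blen;
    return readv(fd, iov, 2);
}
```

Real code would also have to cope with short reads and writes, which
this sketch leaves to the caller.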

> Hiding mmap() in a library will make the programmer interface look
> nicer, but if the underlying foundation doesn't work well I'm still
> left with needless inefficiencies.

Though I've never examined in detail how Sfio uses mmap(), I'm guessing
it only uses it for reads and for writes to existing blocks.

>  Unmapping, seeking, writing a byte
> and remapping is a silly waste of CPU.

Sfio hides all the buffering and writing behind an interface that very
much resembles stdio (so much in fact that a stdio-compatible front-end
is available almost for free).  Sfio presumably also hides the
difficulty of mmap()ing a file that's bigger than (SIZE_T_MAX -
occupied_mem).
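
I have not checked how Sfio actually does this, so the following is only
a sketch of one way to handle a file too big to map whole: map a
page-aligned window, work on it, unmap it, and move on.  The function
count_byte_windowed() and its window parameter are invented for the
example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: count occurrences of byte c in a file while never
 * mapping more than `window` bytes at once.  `window` must be a nonzero
 * multiple of the page size so every mmap() offset stays page-aligned.
 * Returns the count, or -1 on error. */
long long count_byte_windowed(const char *path, unsigned char c,
                              size_t window)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    assert(path != NULL && pagesz > 0);
    assert(window > 0 && window % (size_t)pagesz == 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    long long count = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        /* The last window may be shorter than the rest. */
        size_t len = window;
        if ((off_t)len > st.st_size - off)
            len = (size_t)(st.st_size - off);

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            if (p[i] == c)
                count++;
        munmap(p, len);
    }

    close(fd);
    return count;
}
```

The window size trades address space against the number of
mmap()/munmap() calls; anything spanning a window boundary needs extra
care that this sketch does not attempt.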

>  (And I'm not even convinced
> that the buffer-cache and the mmaped data are ever guaranteed to be
> in sync.)

I think that in NetBSD they are guaranteed *not* to be in sync, at least
not until the ongoing unified buffer cache work is merged in.

I'm not even sure if it's safe to use Sfio with the mmap feature turned
on with NetBSD....

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>