current-users: Re: heavy use of mmap() & regex ?

Subject: Re: heavy use of mmap() & regex ?
To: None <current-users@NetBSD.ORG>
From: David Jones <dej@inode.org>
List: current-users
Date: 05/01/1998 19:11:56

Tom Ivar Helbekkmo wrote:
| An example of this way of doing things is what the Cyrus IMAP system
| from CMU does on systems without integrated buffer caches: writes are
| done atomically, using locks, and access to the memory mapped data is
| preceded by stat() calls to check if the mtime of the mapped file has
| changed, in which case msync() is called.  No data is ever changed by
| modifying it directly in the mapped region.

Even this is hairy and fraught with race conditions, especially if NFS
attribute caching is involved.

I once wrote a waveform viewer for an in-house simulator.  The simulator
wrote out its data in such a way that mmap() was attractive.  I wanted to
be able to view waveforms as they were being created.

The first word of the output file gives the number of data points in the file.
The viewer read that, then sought to read in the data points.

Problem was, the first block's data was updated before the NFS attribute
cache's idea of the length of the file was updated.  So, if I sought to the
end, and that happened to be a new page, and the file "wasn't long enough yet"
because of stale cache data, I went down with a segfault.

I ended up doing both a read of the first word and a stat(), working out
the minimum guaranteed size of the file, and using that.  But it's still too
gross for words.