tech-kern archive


Re: Using mmap(2) in sort(1) instead of temp files



Why is this on tech-kern?  It seems to me it belongs on tech-userlevel.

> I'm trying to speed up sort(1) by using mmap(2) instead of temp
> files.

If you're going to sort in RAM, why bother with temporaries at all?
Just slurp it all in and sort it in core.

But.

Part of the point of using temp files, it seems to me, is to be able to
sort datasets larger than will fit in memory.

Unless NetBSD is prepared to completely desupport small machines like
the MicroVAX-II or shark, I think this might be a misguided thing to
do.  Given the way swap works, I suspect it will work better to use
temp files instead of mmap()ed memory to sort datasets larger than will
fit in RAM, even if VM is available.  Furthermore, VM can be limited;
sorting input bigger than 3G on i386 shouldn't break (and, from a
usability standpoint, shouldn't even require any special options).
Even on 64-bit, VM can be comparatively small; on a 9.1 amd64 machine
at work, proc.$$.rlimit.datasize.hard is only 8 gigs.
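
To illustrate the limit in question: a program can read the same
per-process value with getrlimit(2) before deciding whether to try
sorting in memory at all.  What follows is only a minimal sketch; the
helper name fits_datasize_limit() is hypothetical, and whether mmap()ed
memory is actually charged against RLIMIT_DATA varies between systems,
so treat the check as a hint rather than a guarantee.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: return 1 if `need` bytes fit under the hard
 * data-size limit (or that limit is unlimited), 0 if they do not fit
 * or the limit cannot be read.
 */
int
fits_datasize_limit(rlim_t need)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_DATA, &rl) == -1)
		return 0;	/* be conservative on error */
	if (rl.rlim_max == RLIM_INFINITY)
		return 1;
	return need <= rl.rlim_max;
}
```

A caller could fall back to real temp files whenever this returns 0,
which would also keep the large-input case working without any special
options.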

At the very least, I would strongly recommend adding an option to
disable this, to continue to use real files for temporaries.

> ftmp() (see code below) is called in the sort functions to create and
> return a temp file.  mkstemp() is used to create the temp file, then
> the file pointer (returned by fdopen) is returned to the sort
> functions for use.  I'm trying to understand where and how mmap
> should come into the picture here, and how to implement this feature.

I think the biggest issue you'll have (aside from the ones raised above) is
that an mmap()ed memory block has a fixed size, set at map time.  Files
are sized much more dynamically.  I suspect you'll end up
(re)implementing a ramfs (a simplified one, because the application
needs are relatively simple, but still).
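
To make the fixed-size point concrete, here is a rough sketch of what
"growing" a mapping involves when the mapping is backed by a temp file:
extend the file with ftruncate(2), map it again at the new size, and
drop the old mapping.  Everything named mapbuf here is hypothetical,
and the choice of a file-backed MAP_SHARED mapping is an assumption;
with anonymous memory the old contents would have to be copied
instead.  Note that the base address can change on every grow, so
callers cannot hold pointers into the buffer across the call.

```c
#define _XOPEN_SOURCE 700	/* mkstemp, ftruncate under strict compilers */
#include <sys/types.h>
#include <sys/mman.h>
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical growable buffer backed by an unlinked temp file. */
struct mapbuf {
	int	 fd;	/* descriptor of the backing temp file */
	char	*base;	/* current mapping, or NULL if none yet */
	size_t	 cap;	/* current mapped size in bytes */
};

/* Create the backing file; nothing is mapped until the first grow. */
int
mapbuf_open(struct mapbuf *mb)
{
	char path[] = "/tmp/sort.XXXXXX";

	mb->base = NULL;
	mb->cap = 0;
	mb->fd = mkstemp(path);
	if (mb->fd == -1)
		return -1;
	(void)unlink(path);	/* file disappears once fd is closed */
	return 0;
}

/* Grow the mapping to newcap bytes; returns 0 on success, -1 on error. */
int
mapbuf_grow(struct mapbuf *mb, size_t newcap)
{
	void *p;

	assert(mb != NULL && mb->fd >= 0);
	if (newcap <= mb->cap)
		return 0;
	/* Extend the file first; the new bytes read as zeros. */
	if (ftruncate(mb->fd, (off_t)newcap) == -1)
		return -1;
	/* Map at the new size before unmapping, so failure loses nothing. */
	p = mmap(NULL, newcap, PROT_READ | PROT_WRITE, MAP_SHARED, mb->fd, 0);
	if (p == MAP_FAILED)
		return -1;
	if (mb->base != NULL)
		(void)munmap(mb->base, mb->cap);
	mb->base = p;
	mb->cap = newcap;
	return 0;
}
```

Even this small amount of bookkeeping is the beginning of the
simplified ramfs mentioned above; stdio on a plain temp file gets the
same effect from a single fwrite().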

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

