tech-userlevel archive


Re: Using mmap(2) in sort(1) instead of temp files



On Thursday, April 4, 2024 11:28:13 PM CEST Robert Elz wrote:
> Yes, in cases where temp files are actually needed, using mmap() is a
> very minor gain indeed - the buffering cost might be saved, but sorting
> a large file is a cpu costly endeavour (lots of comparisons, lots of times
> even with the best sorting algorithms available) so when temp files are
> needed in the first place (large input files) the saving is likely to be
> a few ms in an operation which takes minutes of cpu time (or more).
> Not worth the bother.

I quite disagree here. Using mmap for the temp files with an appropriate
madvise can minimize data copies (by using the VFS cache directly) as well as
reduce the cache footprint (by evicting pages once they are used up). That
matters especially for storage like NVMe, which can consume a significant part
of the main memory bandwidth. I don't think it helps for the original input
though, especially given the associated problems of concurrent writes or
truncations.
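
To make the temp file case concrete, here is a minimal sketch of reading one
run back through a mapping. It is not taken from sort(1): consume_run() is a
hypothetical helper, the byte sum only stands in for the real merge work, and
the chunk size is an arbitrary parameter. How aggressively MADV_DONTNEED
actually evicts pages differs between kernels, so treat that call as a hint.

```c
#define _DEFAULT_SOURCE /* exposes madvise() on glibc; harmless elsewhere */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a temp file once, front to back, through a
 * read-only shared mapping. Each chunk is handed back to the kernel after
 * use. The byte sum is a placeholder for the real per-record work.
 * Returns 0 on success, -1 on error.
 */
int
consume_run(const char *path, size_t chunk, uint64_t *sum_out)
{
	assert(sum_out != NULL);
	*sum_out = 0;

	long pagesz = sysconf(_SC_PAGESIZE);
	if (pagesz <= 0)
		return -1;
	/* madvise() wants page-aligned addresses, so round the chunk up. */
	size_t page = (size_t)pagesz;
	chunk = (chunk + page - 1) / page * page;
	if (chunk == 0)
		chunk = page;

	int fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	struct stat st;
	if (fstat(fd, &st) == -1) {
		close(fd);
		return -1;
	}
	if (st.st_size == 0) {	/* nothing to map, nothing to sum */
		close(fd);
		return 0;
	}
	size_t len = (size_t)st.st_size;
	unsigned char *base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */
	if (base == MAP_FAILED)
		return -1;

	/* Hint: one sequential pass, so read ahead and drop behind. */
	(void)madvise(base, len, MADV_SEQUENTIAL);

	uint64_t sum = 0;
	for (size_t off = 0; off < len; off += chunk) {
		size_t n = len - off < chunk ? len - off : chunk;
		for (size_t i = 0; i < n; i++)
			sum += base[off + i];
		/* This chunk is used up; let the kernel reclaim its pages. */
		(void)madvise(base + off, n, MADV_DONTNEED);
	}

	munmap(base, len);
	*sum_out = sum;
	return 0;
}
```

The data is consumed straight out of the file cache pages, with no read(2)
copy into a private buffer, and the cache footprint stays near one chunk as
long as the kernel honours the hints.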

Joerg



