Re: mlock() issues

On Fri, 22 Oct 2010 15:53:04 -0400
Matthew Mondor <> wrote:

> Anyway, I like this kind of discussion and have nothing against NIH
> personally (it fuels variety and competition, in fact), so thanks for
> sharing your custom cache experiments and performance numbers.  If you
> happen to achieve interesting performance along the above
> lines with mmap(2) as well, I'd also like to know how it went.

Hi, the application cache I've developed uses anonymous memory
mappings. It defines an abstract data type, mpb_t, which is a multi-page
buffer. The cache uses 20 different buffer page sizes (increasing in
powers of 2, from 512B to 256M) to provide the memory segments that
make up a multi-page buffer.

For example, an mpb_t object of size 1.2K would be allocated one 1K and
one 512B buffer page. An mpb_t object of size 1.8K would be allocated a
single 2K buffer page.
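
The two examples above are consistent with one simple rule: round the
object size up to a multiple of the smallest page (512B), then use one
buffer page per set bit of the rounded size. The sketch below is only a
guess at that rule, not the actual cache code; mpb_plan() and the MPB_*
names are hypothetical, and overflow for sizes near UINT64_MAX is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MPB_MIN_SHIFT 9    /* 512B, smallest buffer page */
#define MPB_MAX_SHIFT 28   /* 256M, largest buffer page */
#define MPB_NCLASSES  (MPB_MAX_SHIFT - MPB_MIN_SHIFT + 1)   /* 20 sizes */

/*
 * Hypothetical helper (an assumption, not the author's code).
 * count[i] receives how many buffer pages of size (512 << i) would back
 * an object of `size` bytes.  Returns the total number of bytes reserved.
 */
uint64_t
mpb_plan(uint64_t size, uint64_t count[MPB_NCLASSES])
{
	const uint64_t min_page = UINT64_C(1) << MPB_MIN_SHIFT;
	uint64_t rounded, rest;
	int i;

	assert(count != NULL);

	/* Round up to a multiple of the smallest buffer page. */
	rounded = (size + min_page - 1) & ~(min_page - 1);

	/* Hand out pages from the largest size class down to the smallest. */
	rest = rounded;
	for (i = MPB_NCLASSES - 1; i >= 0; i--) {
		uint64_t page = min_page << i;

		count[i] = rest / page;   /* 0 or 1, except for the 256M class */
		rest %= page;
	}
	return rounded;
}
```

With this rule a 1229-byte object (about 1.2K) reserves 1536 bytes as one
1K page plus one 512B page, and a 1843-byte object (about 1.8K) reserves a
single 2K page.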

I ran some benchmarks to compare the NetBSD kernel file cache with the
application cache I've developed. They were run on a dual Pentium III
1.13GHz machine with 2G of RAM.

Kernel file cache test:
uint64_t time1, time2;
void *buffer = malloc(8M);

time1 = get current time;
for each file under /usr/src
        open file;
        read file into buffer;
        close file;
time2 = get current time;
print time2 - time1;
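
For anyone who wants to reproduce the kernel file cache test, here is a
rough C sketch of the timed loop. It is not the program that produced the
numbers in this mail: now_msec(), read_file() and time_reads() are
hypothetical names, the monotonic clock is my choice of timer, and the
walk over /usr/src is left out (the caller supplies the list of paths).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* clock_gettime(), ssize_t */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define BUFSZ ((size_t)8 * 1024 * 1024)   /* the 8M read buffer */

/* Monotonic clock reading in milliseconds. */
uint64_t
now_msec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

/*
 * Open a file, read up to bufsize bytes into buf, close it.
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 * A read error simply ends the loop and returns what was read so far.
 */
ssize_t
read_file(const char *path, void *buf, size_t bufsize)
{
	ssize_t n, total = 0;
	int fd;

	assert(path != NULL && buf != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -1;
	while ((size_t)total < bufsize) {
		n = read(fd, (char *)buf + total, bufsize - (size_t)total);
		if (n <= 0)
			break;
		total += n;
	}
	close(fd);
	return total;
}

/* Time one pass over all paths; returns 0 on success, -1 on failure. */
int
time_reads(const char *const *paths, size_t npaths, uint64_t *elapsed_msec)
{
	void *buffer = malloc(BUFSZ);
	uint64_t t1, t2;
	size_t i;

	if (buffer == NULL)
		return -1;
	t1 = now_msec();
	for (i = 0; i < npaths; i++)
		(void)read_file(paths[i], buffer, BUFSZ);
	t2 = now_msec();
	free(buffer);
	*elapsed_msec = t2 - t1;
	return 0;
}
```

Calling time_reads() several times over the same list and keeping the
lowest result gives the warm-cache figure.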

Application cache test:
uint64_t time1, time2;

for each file under /usr/src
        load file into application cache;

time1 = get current time;
for each file in application cache
        fd = open("/dev/null", ...);
        write(fd, cache_buffer, ...);
        close(fd);
time2 = get current time;
print time2 - time1;
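
One iteration of the application cache loop could look roughly like the
C sketch below. The real cache hands out buffer pages of fixed sizes;
here a single anonymous private mapping stands in for one cached file.
cache_alloc(), cache_free() and copy_to_devnull() are hypothetical names,
not part of the actual cache.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose MAP_ANON; ignored elsewhere */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for one cached file: an anonymous, private memory mapping. */
void *
cache_alloc(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping obtained from cache_alloc(). */
void
cache_free(void *p, size_t len)
{
	if (p != NULL)
		(void)munmap(p, len);
}

/*
 * One loop iteration: open /dev/null, write the cached buffer, close.
 * The open/close pair keeps the system call count per file the same as
 * in the kernel file cache loop.
 * Returns the number of bytes written, or -1 on failure.
 */
ssize_t
copy_to_devnull(const void *buf, size_t len)
{
	ssize_t n;
	int fd;

	assert(buf != NULL);
	fd = open("/dev/null", O_WRONLY);
	if (fd == -1)
		return -1;
	n = write(fd, buf, len);
	close(fd);
	return n;
}
```

Timing a loop of copy_to_devnull() calls over all cached buffers gives
the application cache figure.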

To be fair, I kept the number of open/close system calls in each test
loop the same. The kernel file cache test was run about 4 times, to make
sure all files under /usr/src were loaded into the cache, and then the
lowest time difference was taken.

The results are:

Kernel file cache time difference - 15253 msec.
Application cache time difference - 2784 msec.

Copying data from the application cache was about 5.5 times faster. On
Solaris (default installation, i.e. no tuning) the time difference
for kernel file cache test was so huge, I didn't even bother writing the

