Subject: Re: RAW access to files
To: None <tech-kern@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 12/12/2001 15:34:42
> I have some experience with very large apps, processing much more
> than 2^32 bits of data (think a linear pass over a terabyte or so,
> updating a one-gigabyte array as it goes).  There, we found a huge
> performance win from doing reads in 1 or 2 Mbyte blocks -- the size
> of the disk buffer -- and using POSIX aio() or real threads to
> schedule read-ahead of those 2 Mbyte chunks.  (At the time, this
> design forced a non-BSD solution.)
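
If I'm picturing that right, the shape is roughly the double-buffered
loop below.  (A sketch of mine, not the original app's code: the
2 Mbyte CHUNK, the byte-summing stand-in for the real work, and the
error handling are all placeholders.)

#include <aio.h>
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK	(2 * 1024 * 1024)

static unsigned long sum;		/* stand-in for the gigabyte array */

static void
process(const char *p, size_t n)	/* stand-in for the real work */
{
	size_t i;

	for (i = 0; i < n; i++)
		sum += (unsigned char)p[i];
}

int
main(int argc, char **argv)
{
	struct aiocb cb;
	const struct aiocb *const list[1] = { &cb };
	char *bufs[2];
	ssize_t got;
	off_t off = 0;
	int e, fd, cur = 0;

	if (argc != 2)
		errx(1, "usage: %s file", argv[0]);
	if ((fd = open(argv[1], O_RDONLY)) == -1)
		err(1, "open %s", argv[1]);
	if ((bufs[0] = malloc(CHUNK)) == NULL ||
	    (bufs[1] = malloc(CHUNK)) == NULL)
		err(1, "malloc");

	/* prime the pipeline: queue the first chunk */
	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = bufs[cur];
	cb.aio_nbytes = CHUNK;
	cb.aio_offset = off;
	if (aio_read(&cb) == -1)
		err(1, "aio_read");

	for (;;) {
		/* wait for the chunk queued on the previous pass */
		while (aio_suspend(list, 1, NULL) == -1 && errno == EINTR)
			continue;
		if ((e = aio_error(&cb)) != 0)
			errx(1, "aio read: %s", strerror(e));
		if ((got = aio_return(&cb)) == 0)
			break;			/* EOF */
		off += got;

		/* queue the next chunk into the other buffer... */
		memset(&cb, 0, sizeof(cb));
		cb.aio_fildes = fd;
		cb.aio_buf = bufs[1 - cur];
		cb.aio_nbytes = CHUNK;
		cb.aio_offset = off;
		if (aio_read(&cb) == -1)
			err(1, "aio_read");

		/* ...and chew on the one that just arrived */
		process(bufs[cur], (size_t)got);
		cur = 1 - cur;
	}

	printf("sum %lu\n", sum);
	return 0;
}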

Was there some reason to not use a second process?  I once got fed up
with the way dd between two disks kept each disk busy only about half
the time and built a program that had a reader process and a writer
process, with blocks passed between them via shared memory (mmap
MAP_ANON|MAP_SHARED, fork, and you have shared memory that doesn't
stick around on exit the way a SV SHM segment does).  It helped; even
between two disks on the same SCSI chain, I saw significant overlap
(based on watching the disk lights).
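
The trick itself is just this much (a toy demonstration of the
shared-memory part only; a real reader/writer pair still needs a
queue of blocks and some handshaking on top of it):

#include <sys/mman.h>
#include <sys/wait.h>

#include <err.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	size_t len = 1024 * 1024;	/* one shared "block" */
	char *buf;
	pid_t pid;

	/* anonymous shared mapping: survives fork, vanishes on exit */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_SHARED, -1, 0);
	if (buf == MAP_FAILED)
		err(1, "mmap");

	switch (pid = fork()) {
	case -1:
		err(1, "fork");
	case 0:
		/* child: stand-in for the reader filling a block */
		strcpy(buf, "block filled by the reader child");
		_exit(0);
	default:
		/* parent: stand-in for the writer draining it */
		waitpid(pid, NULL, 0);
		printf("parent sees: %s\n", buf);
	}
	munmap(buf, len);
	return 0;
}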

> Mmap() was a nonstarter; the app didn't fit in physical memory
> anyway, never mind the linear once-only pass.

But is there a reason you couldn't mmap each megabyte, MADV_WILLNEED
its pages before you get to them, and unmap the previous chunk as you
go?

Or does MADV_WILLNEED not trigger pageins?
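
For concreteness, the sort of loop I have in mind is below -- a
sketch only, with a one-megabyte window, a byte-summing stand-in for
the real work, and no promise that the madvise() actually starts the
I/O:

#include <sys/mman.h>
#include <sys/stat.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define CHUNK	(1024 * 1024)		/* one-megabyte window */

static unsigned long sum;		/* stand-in for the real work */

static void
process(const char *p, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		sum += (unsigned char)p[i];
}

int
main(int argc, char **argv)
{
	struct stat st;
	char *cur = NULL, *next;
	size_t curlen = 0, nextlen;
	off_t off;
	int fd;

	if (argc != 2)
		errx(1, "usage: %s file", argv[0]);
	if ((fd = open(argv[1], O_RDONLY)) == -1)
		err(1, "open %s", argv[1]);
	if (fstat(fd, &st) == -1)
		err(1, "fstat");

	for (off = 0; off < st.st_size; off += CHUNK) {
		nextlen = (st.st_size - off > CHUNK) ?
		    CHUNK : (size_t)(st.st_size - off);
		next = mmap(NULL, nextlen, PROT_READ, MAP_SHARED, fd, off);
		if (next == MAP_FAILED)
			err(1, "mmap");
		/* ask for pageins on the upcoming chunk -- assuming
		   the advice really starts them -- ... */
		(void)madvise(next, nextlen, MADV_WILLNEED);
		/* ...while we chew on the previous one, then drop it */
		if (cur != NULL) {
			process(cur, curlen);
			munmap(cur, curlen);
		}
		cur = next;
		curlen = nextlen;
	}
	if (cur != NULL) {		/* last chunk */
		process(cur, curlen);
		munmap(cur, curlen);
	}
	printf("sum %lu\n", sum);
	return 0;
}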

> I didn't dare try an ffs with 1-meg blocks, 128k fragments, and
> 2-block readahead: would that have worked?

Ooo.  You're twisted.  I love it.

I just now tried it on a 100MB filesystem in a vnd and newfs cored on
me.  I'm going to investigate further.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B