tech-userlevel archive

Re: Kernel VS application file caching



> As far as I know Unix kernels will transparently cache files into any
> available memory to speed up future I/O on those files.

Depends on the kernel.  Traditionally, BSD had a relatively small
buffer cache that was used for filesystem data (which includes both
file contents and metadata).  Ever since UBC (the unified buffer cache)
came in, NetBSD has at least partially eliminated that.  My impression
is that the current scheme uses something like the traditional buffer
cache for metadata and grabs whatever memory is available for file
contents, but I'm sure someone who actually understands it can correct
me if I'm wrong.

> For applications like Internet servers, which serve many static files
> from disk,

Well, _some_ Internet servers, such as webservers.  Not all; ssh
servers, for example, generally do not do anything of the sort.

> is there any point in implementing file caching at application level?

Yes.  You said it yourself, in part: it permits you to make cache
decisions based on knowledge not available below the application level,
such as which files are more worth retention.

> It seems like you would end up with 2 copies of the same data - one
> copy cached by kernel, another copy cached by application.

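That can happen.  You can reduce the problem by using things like
mmap() rather than read() for reading files, since a mapping can share
the kernel's cached pages instead of copying them into a buffer of your
own, especially when coupled with things like Hubert's suggestion to
look at madvise().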
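
To make the mmap()/madvise() idea concrete, here is a minimal sketch,
not a definitive implementation.  The helper names map_file() and
unmap_file() are made up for illustration, and MADV_WILLNEED is just
one plausible hint; which advice (if any) actually helps depends on the
access pattern and on the kernel.

```c
#define _DEFAULT_SOURCE   /* glibc wants this for madvise() under strict -std=; harmless elsewhere */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a whole file read-only, so the process looks
 * at the file through a mapping rather than read()ing it into a private
 * buffer.  Returns the mapping and stores its length in *len, or returns
 * NULL on failure (including for empty files, which cannot be mapped).
 */
static const void *map_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;

    /* Advisory only: the kernel is free to ignore it, so the result is
     * deliberately not checked. */
    (void)madvise(p, (size_t)st.st_size, MADV_WILLNEED);

    *len = (size_t)st.st_size;
    return p;
}

/* Hypothetical helper: drop a mapping obtained from map_file(). */
static void unmap_file(const void *p, size_t len)
{
    munmap((void *)p, len);
}
```

A server would call map_file() once per file it wants to keep around,
write() or send() straight from the returned pointer, and call
unmap_file() when it decides the file is no longer worth retaining.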

Also, you can fiddle with tunables.  NetBSD, for example, has a handful
of sysctls like vm.filemin and vm.filemax to control low and high water
marks on the use of memory to cache file data.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

