Re: Kernel VS application file caching
On Jan 21, 2010, at 4:16 PM, Sad Clouds wrote:
> On Thursday 21 January 2010 14:52:04 Steven Bellovin wrote:
>> On Jan 21, 2010, at 9:25 AM, Sad Clouds wrote:
>>> As far as I know Unix kernels will transparently cache files into any
>>> available memory to speed up future I/O on those files.
>>> For applications like Internet servers, which serve many static files
>>> from disk, is there any point in implementing file caching at application
>>> level? It seems like you would end up with 2 copies of the same data -
>>> one copy cached by kernel, another copy cached by application.
>> To avoid kernel-to-userland copies?
>> What is your real performance limit? CPU? RAM? I/O bandwidth? Network
> Well the idea is to keep frequently accessed data in RAM.
Let me repeat my question slightly differently: why do you think that will help?
Let me give a real-world example. Between a NetBSD laptop and a NetBSD
desktop, connected via gigE, I can run 'ttcp -s' at (if I recall correctly)
700 Mbps. I can upload data to my office at ~2.5 Mbps; I can download at about
13 Mbps. In other words, when I go in or out of my house, the network is by
far the limiting performance factor. For anything but a floppy drive, it
doesn't really matter how fast my disk is or how much caching happens; I can't
ship data faster than the network.
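To put rough numbers on that, here is a minimal sketch of the arithmetic. It
assumes the disk and the network are pipelined, so the slower stage sets the
overall rate. The link figures are the ones quoted above; the file size and
both disk rates are made-up values for illustration, not measurements.

```python
def transfer_seconds(size_bytes, link_bps, disk_bps):
    """Time to ship a file when the slower of the two stages sets the rate.

    Rates are in bits per second. This ignores latency, protocol overhead
    and everything else; it only illustrates the bottleneck argument.
    """
    bottleneck_bps = min(link_bps, disk_bps)
    return size_bytes * 8 / bottleneck_bps


FILE_SIZE = 100_000_000   # hypothetical 100 MB file
HOME_UPLINK = 2.5e6       # ~2.5 Mbps, from the message
GIGE_LAN = 700e6          # ~700 Mbps, from the message
DISK_UNCACHED = 400e6     # assumed rate for a read that goes to disk
DISK_CACHED = 8e9         # assumed rate for a read served from RAM

if __name__ == "__main__":
    for link_name, link in (("home uplink", HOME_UPLINK), ("gigE LAN", GIGE_LAN)):
        for disk_name, disk in (("uncached", DISK_UNCACHED), ("cached", DISK_CACHED)):
            secs = transfer_seconds(FILE_SIZE, link, disk)
            print(f"{link_name:12s} {disk_name:9s} {secs:8.1f} s")
```

With these assumed numbers the home uplink takes 320 seconds whether or not
the file is cached, while on the LAN the cache does change the result.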
On the other hand, I have a machine in a colo with several hundred Mbps links
to the outside. On that machine, file system performance might matter.
In the abstract, you're quite correct that caching strategies matter. The
kernel does caching because of those applications. But for specific workloads
-- like sending files to the Internet -- the bottleneck might be something
else, like your link or the round-trip time to your clients.
It's always very sound advice to build, measure, optimize. Study after study
has shown that programmer guesses about what needs optimizing are almost always
wrong.
My advice is to build your application, but modularize it in such a way that
you can easily plug in an application-level cache if measurements show that
that's the problem.
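One way that modularization might look, as a rough sketch only: the server
code talks to a small "reader" interface, and a cache is just another reader
wrapped around the plain one. All the names here (DirectReader, CachingReader,
serve) are hypothetical, and the LRU policy is one arbitrary choice among many.

```python
from collections import OrderedDict
from pathlib import Path


class DirectReader:
    """Reads the file on every call and leaves caching to the kernel."""

    def read(self, path):
        return Path(path).read_bytes()


class CachingReader:
    """Optional application-level LRU cache wrapped around another reader.

    Bounded by entry count for simplicity. It does not notice when a file
    changes on disk; a real cache would need some invalidation scheme.
    """

    def __init__(self, inner, max_entries=128):
        self._inner = inner
        self._max_entries = max_entries
        self._cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, path):
        key = str(path)
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        data = self._inner.read(path)
        self._cache[key] = data
        if len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)   # evict least recently used
        return data


def serve(reader, path):
    """Hypothetical request handler; it only depends on reader.read()."""
    return reader.read(path)
```

The server starts out with DirectReader(). If measurements later point at file
reads, swapping in CachingReader(DirectReader()) is a one-line change, and the
hits and misses counters give something to measure before and after.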
--Steve Bellovin, http://www.cs.columbia.edu/~smb