tech-userlevel archive


Re: Kernel VS application file caching



On Friday 22 January 2010 01:48:39 Steven Bellovin wrote:

> It's always very sound advice to build, measure, optimize.  Study after
>  study has shown that programmer guesses about what needs optimizing are
>  almost always wrong.
> 
> My advice is to build your application, but modularize it in such a way
>  that you can easily plug in an application-level cache if measurements
>  show that that's the problem.

Yes, I agree, but before you start building an application you need to make
sure the design is flexible enough and accounts for the various small
details. As you said, for some workloads application-level caching might not
offer any benefit, because network bandwidth is the limiting factor.

I'm developing a web server and I'm still at the design stage, so I can't run
any benchmarks yet to see where the bottlenecks might be. However, I'm trying
to take care of the known bottlenecks that developers of other web servers
have encountered.

With web servers the content can come from different places, e.g. local and
remote filesystems. I think that when you have thousands of concurrent users,
intelligent caching could dramatically improve performance. I thought maybe
there was a trick to interface with the kernel's caching subsystem and tell
it which files to cache, how much to cache and, most importantly, which files
not to cache.
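
To make the idea concrete, the kind of hint interface meant here would look
something like POSIX posix_fadvise(). The following is only a minimal sketch:
the helper names advise_keep() and advise_drop() are made up for illustration,
and whether a given kernel actually acts on each hint is not guaranteed, since
the call is purely advisory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose posix_fadvise() on strict builds */
#endif
#include <fcntl.h>

/*
 * Hypothetical helper: hint that the whole file will be needed soon,
 * so the kernel may read it into its cache ahead of time.
 * Returns 0 on success or an error number (posix_fadvise does not set errno).
 */
int advise_keep(int fd)
{
    /* offset 0, length 0 means "from the start to the end of the file" */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
}

/*
 * Hypothetical helper: hint that the file's data will not be needed again,
 * so the kernel may drop any cached pages for it.
 */
int advise_drop(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

A server could call advise_keep() on small, frequently requested files and
advise_drop() after streaming a large one-off download, but this only covers
"which files"; it gives no control over how much memory the cache uses.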

So far the closest thing I've found is calling open() with the O_DIRECT flag.
But as the man page says, this flag is advisory, and sometimes it may have no
effect. I think databases use direct I/O to eliminate double buffering of the
same data and to speed up I/O on large files.

