Subject: Re: Problems with 1.2/i386
To: Michael L. VanLoon -- HeadCandy.com <michaelv@MindBender.serv.net>
From: Curt Sampson <curt@portal.ca>
List: port-i386
Date: 10/03/1996 10:05:14
On Wed, 2 Oct 1996, Michael L. VanLoon -- HeadCandy.com wrote:

> >    Yeah, I guess disk cache.  A friend of mine was bragging about Linux
> >and how it uses all (or at least most) available space for cache.  He
> >said that it also accounts for a large part of the performance
> >difference between OS/2 and WinNT.  I was just curious as to how NetBSD
> >does it.
> 
> Which performance difference is this?  Was OS/2 supposed to be faster
> than NT, or slower?  Which magazine did he read? :-)  Either way it's
> probably marketing hype, one direction or the other.

The big performance problem with Windows NT that I'm aware of has
nothing to do with the buffer cache; it's the microkernel architecture.
It has to do four, rather than just two, relatively expensive
protection mode changes for every system call, because the calls
are handled in userland. For more information on this see last February's
_ACM Transactions on Computing_.

> It depends on what you're doing if [the merged buffer cache] will be a
> big win for you.

Actually, from my impression of things (I'm no VM expert), in some
cases it can be a loss. I'm not familiar with the Linux implementation
at this time, but I have read up on the System V implementation.
SysV maintains a large area of memory in kernel VM space for storage
of file blocks. (Essentially, it seems to move the normal buffer
cache into VM.) When you ask for a block from a file, it looks to
see if that block was already mapped into that VM space. If not,
it creates a mapping for that block, and marks it as backed by the
block from the file on disk. When you ask for it, if the block is
not in memory it pages it in from the file, otherwise it just gives
it to you.
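
As a rough illustration of the lookup just described, here is a toy
Python model. It is only a sketch of the idea, not SysV code: the class
name ToyBlockCache, the dict standing in for the disk, and the page_ins
counter are all invented for the example.

```python
class ToyBlockCache:
    """Toy model of a file-block cache kept in VM (hypothetical, not kernel code)."""

    def __init__(self, disk):
        # disk: dict mapping (filename, block_no) -> bytes; stands in for files on disk
        self.disk = disk
        # mappings: (filename, block_no) -> {"resident": bool, "data": bytes or None}
        self.mappings = {}
        self.page_ins = 0  # how many times we had to go to "disk"

    def get_block(self, filename, block_no):
        key = (filename, block_no)
        entry = self.mappings.get(key)
        if entry is None:
            # No mapping yet: create one, backed by the block on disk.
            entry = {"resident": False, "data": None}
            self.mappings[key] = entry
        if not entry["resident"]:
            # Mapped but not in memory: page it in from the file.
            entry["data"] = self.disk[key]
            entry["resident"] = True
            self.page_ins += 1
        # Already in memory (or just paged in): hand it over.
        return entry["data"]


cache = ToyBlockCache({("news.db", 0): b"first block"})
first = cache.get_block("news.db", 0)   # creates the mapping and pages in
second = cache.get_block("news.db", 0)  # served from memory
```

The second call does not touch the "disk" at all, which is the whole
point of keeping the block mapped.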

So say you have a program with a lot of pages in memory which are
accessed on a moderately frequent basis. And let's say another
program decides to make a single sequential scan through a large
file (at least as large as physical memory). Now guess whose pages
get dumped to backing store.
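
To make that concrete, here is a small simulation, assuming a plain
least-recently-used policy shared by program pages and file blocks (an
assumption for the sketch; real systems use various approximations). It
also assumes the program does not happen to touch its pages while the
scan runs. The function name and the page labels are made up.

```python
from collections import OrderedDict


def simulate_lru(num_frames, accesses):
    """Run accesses through one shared LRU pool; return (resident set, eviction order)."""
    frames = OrderedDict()  # oldest entry first
    evicted = []
    for page in accesses:
        if page in frames:
            frames.move_to_end(page)  # touched: now most recently used
        else:
            if len(frames) >= num_frames:
                victim, _ = frames.popitem(last=False)  # drop least recently used
                evicted.append(victim)
            frames[page] = True
    return set(frames), evicted


# Six program pages are resident, then another program scans an
# eight-block file once on a machine with eight page frames.
program_pages = [("prog", i) for i in range(6)]
file_scan = [("file", i) for i in range(8)]
resident, evicted = simulate_lru(8, program_pages + file_scan)
survivors = [p for p in resident if p[0] == "prog"]
print(len(survivors))  # prints 0
```

Every program page gets thrown out in favour of file blocks that will
never be read again.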

(Normally, in VM systems, this sort of thing is fixed
with a system call like madvise(), where you tell the OS that you
will be doing sequential instead of random accesses, and it knows
that it can chuck out pages after you've gone past them. But there's
no equivalent in stdio or even in the file-related system calls,
and even if there were, there's no standard for it, so it would
probably be used only by system utilities.)
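
For what it's worth, here is a minimal sketch of what giving that hint
looks like for a mapped file, using Python's mmap module, which wraps
madvise() in version 3.8 and later on platforms that have the call. The
function name scan_sequentially and the temporary file are just for
illustration, and whether the OS actually drops pages behind the scan is
entirely up to the kernel.

```python
import mmap
import os
import tempfile


def scan_sequentially(path, chunk=4096):
    """Read a file front to back through mmap, hinting sequential access if possible."""
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The hint is advisory; skip it where the platform lacks madvise().
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            for offset in range(0, size, chunk):
                total += len(m[offset:offset + chunk])
    return total


# Try it on a throwaway 10,000-byte file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10000)
scanned = scan_sequentially(tmp.name)
os.unlink(tmp.name)
```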

In very disk intensive environments where a lot of different files
are getting touched infrequently, this sort of sharing can be a
loss. I once talked to a fellow who was running a machine doing a
lot of news, mail and UUCP stuff, and he said that when he upgraded
the OS and got shared VM/buffer cache, performance went down the
tubes.

You can help this sort of thing by giving priority to program pages;
i.e., if you have to move a page to backing store, you always (down
to a certain minimum point) move a file buffer page before any
other kind of page. This would solve the problem, but at a cost:
you get the advantage of extra buffer cache space only when
you're not filling or nearly filling physical memory with programs
and their data. On my machines, that's not a terribly frequent
occurrence. (I've only got 64MB in my workstation, but how many
people have much more than that?)
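
Here is a sketch of that priority rule on a toy workload of six program
pages followed by a single scan of an eight-block file, with eight page
frames. The min_file_pages parameter is my reading of "a certain
minimum point" (a floor on how many file buffer pages are kept), and the
function name and page labels are invented for the example.

```python
from collections import OrderedDict


def simulate_priority(num_frames, accesses, min_file_pages=1):
    """Evict the LRU file page first while more than min_file_pages are held."""
    frames = OrderedDict()  # oldest entry first
    evicted = []
    for page in accesses:
        if page in frames:
            frames.move_to_end(page)
            continue
        if len(frames) >= num_frames:
            file_pages = [p for p in frames if p[0] == "file"]
            if len(file_pages) > min_file_pages:
                victim = file_pages[0]       # least recently used file buffer page
            else:
                victim = next(iter(frames))  # fall back to plain LRU
            del frames[victim]
            evicted.append(victim)
        frames[page] = True
    return set(frames), evicted


program_pages = [("prog", i) for i in range(6)]
file_scan = [("file", i) for i in range(8)]
resident, evicted = simulate_priority(8, program_pages + file_scan)
survivors = [p for p in resident if p[0] == "prog"]
print(len(survivors))  # prints 6
```

All six program pages survive the scan, but the file only ever gets the
two frames the programs weren't using, which is the trade-off described
above.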

So in short, while I still feel that merged VM/buffer cache is a
nice thing, I'd mainly love to see it so that files opened both
with write() and mmap() would stay synchronised. In terms of
performance improvement, it's much lower on my priority list than
things like LFS (which would be a really big win on my news server,
since it writes about 120,000 articles per day and reads back only
20,000 of them).

(Of course, I could be entirely wrong in my VM explanations here,
in which case I am sure I am about to get clobbered by an expert. :-))

cjs

Curt Sampson    curt@portal.ca		Info at http://www.portal.ca/
Internet Portal Services, Inc.	
Vancouver, BC   (604) 257-9400		De gustibus, aut bene aut nihil.