Subject: Re: UBC status
To: Chuck Silvers <chuq@chuq.com>
From: Neil A. Carson <neil@causality.com>
List: tech-kern
Date: 09/25/1999 12:15:09
Chuck Silvers wrote:

> yea, I'm not very excited about a limit on cached file data either,
> but many people have talked about such a thing so I listed it tentatively.
> I was including limiting dirty pages under "pagedaemon optimizations"...
> could you elaborate on the extremely clever ways this could be avoided?

What is the current page out algorithm? Dirty data is an interesting
tradeoff. You basically have the following considerations:
	- For user interaction, lots of dirty data can
	  make the system feel snappy
	- For soft updates, a reasonable amount of dirty
	  data can allow the system to eliminate more I/O
	- Too much dirty data causes saturation of the
	  I/O devices

Take Linux, for example, which according to my observations likes dirty
data. The result is that I can untar a 40MB archive 'just like that' on
my Linux PC, I mean instantly (within a second or so!). The disc light
doesn't flash... But then a few seconds later, the disc light goes on,
and stays on for several seconds. This is bad because while the data is
going out, pages become harder to fetch in (especially as in Linux
the dirty data can displace VM stuff), so my machine grinds to a halt
while it flushes. Don't even try this under programmed IO :-)

On FreeBSD and NetBSD PCs, you don't see this happening. On NetBSD I
normally run with the buffer cache enlarged well beyond the
default. This again seems to have the undesirable effect of allowing too
much dirty data to accumulate (and allowing it to push read-only
pages out of the buffers), which the system then blows out in one go. Ding
dong, back to the Linux situation.

FreeBSD works around this by having a small limit on the amount of dirty
data despite allowing the cache to grow. This works very well in
practice, although I don't really believe this to be the solution
either, since all the buffer cache junk in there still has the 'blow out
in one go' problem (although by default you don't notice it).
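
To make the 'cache grows, dirty data stays small' idea concrete, here is a
toy model in Python. It is not FreeBSD's code or its actual policy; the
class ToyCache, the dirty_limit tunable and the untar() helper are all made
up for illustration, and pages are just integers.

```python
from collections import OrderedDict


class ToyCache:
    """Toy page cache: total size may grow, dirty pages are capped separately."""

    def __init__(self, dirty_limit):
        self.dirty_limit = dirty_limit   # hypothetical tunable, in pages
        self.pages = {}                  # page id -> "clean" or "dirty"
        self.dirty = OrderedDict()       # dirty page ids, oldest first
        self.largest_burst = 0           # biggest single flush seen so far

    def write(self, page):
        """Dirty a page; write back the oldest dirty pages if over the cap."""
        self.pages[page] = "dirty"
        self.dirty[page] = True
        self.dirty.move_to_end(page)
        flushed = 0
        while len(self.dirty) > self.dirty_limit:
            old, _ = self.dirty.popitem(last=False)
            self.pages[old] = "clean"    # stays cached, just no longer dirty
            flushed += 1
        self.largest_burst = max(self.largest_burst, flushed)
        return flushed

    def sync(self):
        """Periodic flush: everything still dirty goes out in one burst."""
        burst = len(self.dirty)
        for page in self.dirty:
            self.pages[page] = "clean"
        self.dirty.clear()
        self.largest_burst = max(self.largest_burst, burst)
        return burst


def untar(cache, npages):
    """Write npages fresh pages, then return the size of the final flush."""
    for page in range(npages):
        cache.write(page)
    return cache.sync()
```

With a small cap, untar(ToyCache(64), 10000) leaves all 10000 pages cached
but the final burst is only 64 pages. With a cap larger than the archive,
the whole lot goes out in one go, which is the behaviour described above.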

I think the real rules you need to play by would be something like:
	- Always keep the IO subsystem busy spooling out
	  dirty data.
	- Implement an IO prioritisation scheme (with some
	  heuristics based on drive head location etc) which places
	  interactive operations ahead of trickle page-outs.
	- If the amount of dirty data starts to accumulate too
	  much (i.e. the IO subsystems are continually saturated)
	  then stop it growing further.

In this way, I guess, you effectively have an 'adaptive limit' on the
amount of dirty data.
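
A rough sketch of those three rules as a tick-based simulation, in Python
rather than kernel C. Everything here is an assumption for illustration:
the class AdaptiveWriteback, the bandwidth and patience parameters, and the
choice to call the device 'saturated' when dirty pages are still waiting
at the end of a tick. Drive head heuristics are left out entirely.

```python
class AdaptiveWriteback:
    """Toy model: interactive IO first, trickle write-back, adaptive cap."""

    def __init__(self, bandwidth, patience=3):
        self.bandwidth = bandwidth    # requests the device completes per tick
        self.patience = patience      # saturated ticks tolerated before capping
        self.dirty = 0                # dirty pages awaiting write-back
        self.waiting = 0              # interactive requests not yet served
        self.saturated_ticks = 0
        self.limit = None             # None means no cap is in force

    def tick(self, interactive, new_dirty):
        """Advance one tick; return (interactive_done, flushed, accepted)."""
        # Rule 3: while a cap is in force, accept new dirty pages only
        # up to the cap; the rest is refused (the writer would block).
        if self.limit is not None:
            accepted = min(new_dirty, max(0, self.limit - self.dirty))
        else:
            accepted = new_dirty
        self.dirty += accepted

        # Rule 2: interactive requests get the device first.
        self.waiting += interactive
        interactive_done = min(self.waiting, self.bandwidth)
        self.waiting -= interactive_done

        # Rule 1: all spare capacity is spent trickling dirty pages out.
        spare = self.bandwidth - interactive_done
        flushed = min(self.dirty, spare)
        self.dirty -= flushed

        # Dirty pages left over means the device was fully used this tick.
        if self.dirty > 0:
            self.saturated_ticks += 1
        else:
            self.saturated_ticks = 0
            self.limit = None         # backlog drained: lift the cap
        if self.saturated_ticks >= self.patience and self.limit is None:
            self.limit = self.dirty   # stop the backlog growing further
        return interactive_done, flushed, accepted
```

With bandwidth=10 and a writer offering 50 pages per tick, the backlog grows
for three ticks and is then frozen; after that the writer is admitted only
as fast as the device drains, and the cap disappears once the backlog
empties. The limit is never configured, it falls out of how busy the device
is.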

Does this make sense?

	Neil