Subject: Re: Page daemon behavior part N
To: None <smd@ebone.net, thorpej@zembu.com>
From: None <eeh@netbsd.org>
List: tech-kern
Date: 01/25/2001 20:37:14
jthorpe writes
| > We should be able to fix that by putting UBC pages directly onto the
| > inactive list as soon as current UBC operation is complete. As far
| > as I'm concerned there's no reason UBC pages should ever be `active'
| > unless they are mmap()ed into some process' address space.
|
| I agree completely.
Uh, just checking: are we going to give better performance to
processes which mmap in a file and then sparsely look at / touch
pages, compared to a process that uses open/lseek/read/write to
look at / modify the same file in a sparse manner?
Actually, by putting UBC-only pages immediately on the
inactive list you give read/write an advantage over
mmap().
In general, sequential I/O tends to be large and not
re-use data, while random I/O tends to be small,
scattered, and benefits from caching. If you do
large, sequential I/O you want to recycle the buffer
cache pages as soon as possible so as not to displace
more important data. If you do random I/O you want
it to remain 'cause you may get back to it.
Read and write are sequential access methods, and
can only be converted to random through lseek().
As such you want to recycle pages ASAP. OTOH
mmap() is a random access method. If you use
mmap() for large, sequential I/O you should also
be using madvise() to tell the kernel when you
are finished with the data.
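
As a rough illustration of the madvise() point above, here is a minimal
sketch of a single front-to-back pass over a file through mmap(). The
helper name scan_once and its chunk_pages parameter are hypothetical,
invented for this example. MADV_SEQUENTIAL and MADV_DONTNEED are the
standard BSD-style advice values, but they are only hints, and what the
kernel actually does with them varies from system to system.

```c
#define _DEFAULT_SOURCE   /* exposes madvise() on glibc; harmless elsewhere */
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a file once, front to back, through mmap(),
 * telling the kernel that each chunk is finished as soon as it has been
 * consumed. Returns 0 on success and -1 on error; *sum receives a plain
 * byte checksum so the loop has something to compute.
 */
int scan_once(const char *path, size_t chunk_pages, unsigned long *sum)
{
    struct stat st;
    long pagesz = sysconf(_SC_PAGESIZE);
    int fd;

    *sum = 0;
    if (pagesz <= 0 || chunk_pages == 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* nothing to map */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return -1;
    }

    /* Hint: we will walk the mapping in order. Advisory only. */
    (void)madvise(base, len, MADV_SEQUENTIAL);

    /* chunk is a page multiple, so base + off stays page-aligned. */
    size_t chunk = chunk_pages * (size_t)pagesz;
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? (len - off) : chunk;

        for (size_t i = 0; i < n; i++)
            *sum += base[off + i];

        /* Hint: we are done with this range; its pages may be reclaimed. */
        (void)madvise(base + off, n, MADV_DONTNEED);
    }

    munmap(base, len);
    close(fd);
    return 0;
}
```

A caller would do something like scan_once("/some/big/file", 64, &sum)
to walk the file 64 pages at a time. The return values of madvise() are
deliberately ignored here, since a refused hint does not affect
correctness.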
| However, it is still the case that pages might not be cleaned quickly enough.
| Maybe we need to have more aggressive cleaning of pages recently involved
| in a UBC write operation?
This is probably smart for sequential bulk file writing, probably
not so great for directory blocks that were written out, or any
kind of "too-small" block-size... imagine "process | dd of=foo bs=512"
under heavy loads leading to the same block being zorched from the
cache multiple times, while something LRU is staying in core.
If you're doing `dd of=foo bs=512' you want to purge your output
from the buffer cache ASAP so you have room to allocate a page
for the next block.
I'm kinda leery of a one-size-fits-all pageout policy: NetBSD
wants to run on vastly different systems in terms of amount of memory
and secondary storage cost, and NetBSD users are a pretty diverse
bunch in terms of system utilization.
So you prefer to compile different algorithms for different types
of machines? What about machines that have multiple roles: a workstation
that also NFS exports disks? Do you need to switch kernels depending
on whether you're at the console running X11?
Eduardo