Subject: Re: page reuse while accessing files sequentially
To: Bill Studenmund <wrstuden@netbsd.org>
From: Frank Kardel <Frank.Kardel@Acrys.COM>
List: tech-kern
Date: 06/11/2004 08:29:02
Bill Studenmund wrote:

>On Fri, Jun 04, 2004 at 10:07:41AM +0200, Frank Kardel wrote:
>  
>
>>Would it make sense to categorize file pages with respect to their usage 
>>as sequential
>>access/random access ? Thus the file page sub class could be further 
>>divided in
>>sequential pages and random access pages.
>>    
>>
>
>Probably.
>
>  
>
>>A file(vnode) would start out as sequential access vnode until a 
>>decision can be made, that
>>access is random. Indications could be:
>> - the vnode is mmapped
>>    
>>
>
>Note that with UBC, ALL file reads & writes are done via vnode mappings.
>  
>
That's what I thought was happening. Thus you could use the vnode op to
decide which case (random / sequential) you have. Once you see
out-of-order accesses, or an mmap() with the respective flags, you can
assume random access. After looking at the madvise() manual entry, it
appears that all the classification information would be available in
the madvise() data. Accesses via the read/write vnode abstraction would
then have to infer similar information from the offset usage pattern.
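
To make the offset-pattern idea a bit more concrete, here is a minimal
sketch in C. It is not NetBSD code: struct access_hint, its fields and
the access_hint_*() functions are names I made up for illustration, and
the rule "one out-of-order access means random" is only an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

enum access_class { ACC_SEQUENTIAL, ACC_RANDOM };

/* Hypothetical per-vnode state; not an existing kernel structure. */
struct access_hint {
	enum access_class cls;	/* every vnode starts out sequential */
	bool seen;		/* has any access been recorded yet? */
	uint64_t next_off;	/* offset expected if access stays sequential */
};

static void
access_hint_init(struct access_hint *h)
{
	h->cls = ACC_SEQUENTIAL;
	h->seen = false;
	h->next_off = 0;
}

/*
 * Record one read/write of len bytes at offset off and return the
 * resulting class. The first access only sets the baseline, so a
 * stream that starts in the middle of a file is not penalized.
 */
static enum access_class
access_hint_io(struct access_hint *h, uint64_t off, uint64_t len)
{
	if (h->seen && off != h->next_off)
		h->cls = ACC_RANDOM;	/* out-of-order access: stays random */
	h->seen = true;
	h->next_off = off + len;
	return h->cls;
}

/* Explicit advice (the kind madvise() carries) overrides the inference. */
static void
access_hint_advise(struct access_hint *h, bool random)
{
	h->cls = random ? ACC_RANDOM : ACC_SEQUENTIAL;
}
```

A real implementation would probably want some tolerance (read-ahead,
several readers on the same vnode) before giving up on "sequential".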

>  
>
>> - non sequential access
>>Pages allocated could be marked with the sequential/random information 
>>of the vnode they
>>belong to. The sequential page pool could be limited separately to keep 
>>it from monopolizing
>>the file page pool. Even the executables (mmap property) could be 
>>handled that way without
>>the need for an executable page class.
>>    
>>
>
>I think a better thing to do than have different pools is to have 
>"sequential" pages recycled differently than "random" ones.
>
This is where I started. The notion of subclassing file pages just came
from a recent optimisation where someone optimized list traversal code
that was looking for three different subsets by splitting them into
separate lists. But before optimizing the implementation,
characterizations for "random" / "sequential" and appropriate reuse
strategies need to be found (if their usefulness can be made plausible).

> "Random" of 
>course really means "I may need this again," while "sequential" means "I 
>really really don't think I'll need this again."
>  
>
Yes, this should be inferred on the vnode level rather than on the 
process level (defining the entity "I"
as being the vnode).

>"Sequential" ones should get flushed sooner than later, and once flushed, 
>should get set for re-use sooner. I'm not fully sure the best way to do 
>that. :-)
>  
>
Well, a better way to cope with large data streams such as fs-copies and 
file to device copies would already
be an improvement 8-).
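
One possible reading of "recycled differently", sketched in C under
loud assumptions: struct fpage, its fields and pick_victim() are
hypothetical names, not UVM code, and a real page daemon would walk
queues rather than scan an array. The only point is the ordering:
pages of sequentially accessed vnodes go first, and within a class
the least recently used page goes first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical file page descriptor; not an existing kernel structure. */
struct fpage {
	bool sequential;	/* inherited from the owning vnode's class */
	uint64_t last_use;	/* logical timestamp of the last reference */
};

/* Return the index of the page to recycle next, or -1 if there is none. */
static int
pick_victim(const struct fpage *pg, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0) {
			best = (int)i;
			continue;
		}
		if (pg[i].sequential != pg[best].sequential) {
			/* Different classes: the sequential page loses. */
			if (pg[i].sequential)
				best = (int)i;
		} else if (pg[i].last_use < pg[best].last_use) {
			/* Same class: the older page loses. */
			best = (int)i;
		}
	}
	return best;
}
```

With this ordering a large copy would mostly recycle its own pages
instead of pushing out the "random" ones, which is the effect I am
after. Whether it behaves well under real load is exactly what would
need to be measured.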

>  
>
>>A related strategy seems to exist in Solaris 8, where sequential access 
>>leads to a FIFO
>>replacement strategy. This might be dangerous though, as we did have 
>>once a state where
>>our java vm was seemingly on the FIFO replacement strategy making work 
>>very slow :-( .
>>This effect should be avoided.
>>    
>>
>
>Take care,
>
>Bill
>  
>