Subject: Re: RFC: VOP_LOOKUP() speedup
To: Bill Studenmund <wrstuden@netbsd.org>
From: Reinoud Zandijk <reinoud@netbsd.org>
List: tech-kern
Date: 05/06/2004 07:49:19
Dear Bill,

On Wed, May 05, 2004 at 09:12:28AM -0700, Bill Studenmund wrote:
> > So this is correct? i dont miss out a thing?
> 
> With the second cache == block cache, yes, I think that's right.

The problem with UDF is that i can't use the device's block cache for
CDs/DVDs unless i can get it to act purely as a kind of read `cache' that
*never* writes back of its own accord once i tell it not to. To implement
CD-RW and DVD-RW, udfs has to write with a fixed packet length; with CD-R
and DVD+R/DVD-R it's obvious that i can only write in one `direction', i.e.
in order; what happens on write errors i don't even want to know ;) How do
the other file systems deal with this situation?
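
To make the fixed-packet constraint concrete, here is a minimal sketch in C
of widening a write to whole packets. The names packet_range and
packet_align are made up for illustration and come from no existing UDF
code; the packet length is a parameter because it depends on the medium.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not taken from any UDF implementation. */
struct packet_range {
	uint32_t first_sector;	/* first sector of the first packet touched */
	uint32_t sector_count;	/* whole packets, expressed in sectors */
};

/*
 * Widen a write of `count' sectors starting at `start' so that it begins
 * and ends on a packet boundary.  The caller still has to read in the part
 * of the widened range it is not overwriting before writing the packets.
 */
static struct packet_range
packet_align(uint32_t start, uint32_t count, uint32_t packet_len)
{
	struct packet_range r;
	uint32_t end;

	assert(packet_len > 0 && count > 0);

	/* Round the start down to a packet boundary. */
	r.first_sector = start - (start % packet_len);

	/* Round the end (one past the last sector) up to a boundary. */
	end = start + count;
	end = ((end + packet_len - 1) / packet_len) * packet_len;

	r.sector_count = end - r.first_sector;
	return r;
}
```

With a packet length of 32 sectors (an example value only), a 2-sector
write at sector 33 becomes one whole packet starting at sector 32, and a
5-sector write at sector 30 straddles a boundary and becomes two packets
starting at sector 0. A block cache that writes back single blocks whenever
it likes cannot honour this, which is the problem described above.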

Now it would probably be advisable if i could claim/release the
`minicache' lines dynamically from KVA; the problem then is of course still
when to release them... i don't want udfs to be a memory hog... the
current `minicache' is about 1 MB already, although it could be made
smaller.

> I still don't understand why you want a number. Isn't a boolean (all here 
> or not all here) enough?

In my tests with msdosfs i stumbled on the fact that maintaining the number
of entries in the directory wasn't that trivial, esp. when people start
renaming.

> Perhaps I'm misunderstanding you, but my objection is to changing struct 
> vnode or the name cache structures. I think all you need to do is either 
> keep a flag or count in your fs's structure off the vnode.

This can work for sure given one guarantee: that i get a notification call
or a VOP_REVOKE after each namecache removal... that's the crux i think. If
the namecache is full, it starts deleting entries with its LRU algorithm
but doesn't notify the vnode nor the parent vnode(s)... a new
VOP_NAMECACHEOP()?
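
A toy sketch in C of what such a notification could look like from the
cache's side. Nothing here is the real NetBSD namecache: struct namecache,
nc_enter and the on_evict hook are hypothetical stand-ins, with the hook
playing the role of the imagined VOP_NAMECACHEOP(). Duplicate names and
locking are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NC_SLOTS	2	/* tiny on purpose so eviction is easy to see */
#define NC_NAMELEN	32

/* Stand-in for a directory vnode; it only counts notifications. */
struct dirnode {
	int evictions;
};

/* Hypothetical hook, standing in for the imagined VOP_NAMECACHEOP(). */
typedef void (*nc_evict_fn)(struct dirnode *parent, const char *name);

struct nc_entry {
	struct dirnode *parent;
	char name[NC_NAMELEN];
	unsigned long stamp;	/* last use; 0 means the slot is free */
};

struct namecache {
	struct nc_entry slot[NC_SLOTS];
	unsigned long clock;
	nc_evict_fn on_evict;
};

/* Example hook: just record that the parent lost a cached name. */
static void
count_eviction(struct dirnode *parent, const char *name)
{
	(void)name;
	parent->evictions++;
}

/* Enter a name, evicting the least recently used entry if needed. */
static void
nc_enter(struct namecache *nc, struct dirnode *parent, const char *name)
{
	struct nc_entry *victim = &nc->slot[0];
	size_t i;

	assert(name != NULL);

	/* The smallest stamp wins: a free slot (0) or else the LRU entry. */
	for (i = 1; i < NC_SLOTS; i++) {
		if (nc->slot[i].stamp < victim->stamp)
			victim = &nc->slot[i];
	}

	/* This is the step missing today: tell the parent about the loss. */
	if (victim->stamp != 0 && nc->on_evict != NULL)
		nc->on_evict(victim->parent, victim->name);

	victim->parent = parent;
	strncpy(victim->name, name, NC_NAMELEN - 1);
	victim->name[NC_NAMELEN - 1] = '\0';
	victim->stamp = ++nc->clock;
}
```

With two slots, entering "a", "b" and then "c" for the same parent calls
the hook exactly once, for "a". The only point is that the eviction path
gets a chance to call back into the file system before the entry vanishes.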

> Also, if we find issues in the proposal that need tweaking, we can tweak 
> things. ;-)

like this ;) ;)

> The problem is that using the namecache means having vnodes for all of 
> these entities. So just doing an ls on a directory creates vnodes for all 
> the contents.

True indeed... that was my main objection too. There is one pitfall,
though, that might apply to non-UDF file systems too: when one does an `ls'
and not just a readdir(), `ls' has to get file access and owner info, file
type (softlink, device, ...) and thus issues a VOP_LOOKUP() anyway and
creates the vnodes whether you want it or not :-/

> Given the costs of reading a directory for UDF, especially while writing, 
> I think the cost of the extra vnodes is less than the cost of not having 
> them. But for all other file systems, I do not expect the cost of the 
> directory reads is worth the extra vnodes.

See above...

> The fact that we don't always create vnodes when we read a directory is a
> feature for a lot of workloads. I think we need to keep it (for everything
> other than UDF).

I came up with a more limited but efficient implementation: i can set a
flag in a directory inode saying that the inode has been created from
scratch and not read in. A call to the still imaginary VOP_NAMECACHEOP()
would then clear this flag and thus retract the authorisation for the
namecache lookup.
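
A minimal sketch in C of that flag, assuming the notification exists. All
names (struct udf_dirnode, all_names_cached, the function names) are
hypothetical and only illustrate the state changes described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-directory state kept in the fs's own node. */
struct udf_dirnode {
	bool all_names_cached;	/* true while the namecache holds every entry */
};

/* How a lookup that missed the namecache should proceed. */
enum lookup_path {
	LOOKUP_TRUST_MISS,	/* the name cannot exist; skip the directory */
	LOOKUP_SCAN_DIR		/* fall back to reading the directory */
};

/* A freshly created directory is empty, so the cache knows all of it. */
static void
dir_created_from_scratch(struct udf_dirnode *dp)
{
	assert(dp != NULL);
	dp->all_names_cached = true;
}

/* A directory read in from disc may hold names the cache never saw. */
static void
dir_read_in(struct udf_dirnode *dp)
{
	assert(dp != NULL);
	dp->all_names_cached = false;
}

/* What the imagined VOP_NAMECACHEOP() would do when an entry is evicted. */
static void
dir_namecache_evicted(struct udf_dirnode *dp)
{
	assert(dp != NULL);
	dp->all_names_cached = false;
}

/* Decide what to do after the namecache lookup came up empty. */
static enum lookup_path
lookup_after_cache_miss(const struct udf_dirnode *dp)
{
	assert(dp != NULL);
	return dp->all_names_cached ? LOOKUP_TRUST_MISS : LOOKUP_SCAN_DIR;
}
```

The flag only ever goes from set to clear during the life of the node, so
a single eviction costs at most one full directory read later on; it never
produces a wrong answer as long as every eviction is reported.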

Ideas?

Take care,
Reinoud