Subject: Re: tmpfs: Internal representation of data
To: Jason Thorpe <thorpej@shagadelic.org>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 07/25/2005 10:00:12

On Sat, Jul 23, 2005 at 11:35:05AM -0700, Jason Thorpe wrote:
> >After thinking about this for a while, it seems that the best
> >way to do this is to follow a layout similar to the one used
> >for existing on-disk file systems.
> 
> Actually, laying it out like an on-disk file system is probably not  
> the best idea.

well, it depends on your goals.  I would say that one goal a tmpfs
should share with persistent file systems is that if none of the files
in the file system are being used, the file system should consume
only a small amount of physical memory, independent of the amount
of space actually used in the file system.

in other words, requiring some physical memory for each file that exists
in the tmpfs even when that file is not accessed is bad.


> >  That is, I need:
> >- A set of nodes that describe files.  These could be like
> >  regular inodes.
> >- A set of blocks that store file contents.  These could store
> >  directories as well (i.e., the "file" representing the directory
> >  contents.).
> 
> I would certainly avoid using malloc/free.  I would also avoid using  
> pools for file data.  That is totally unnecessary.

agreed.


> I would do something like this:
> 
> => "tnode" data structure to describe the low-level specific bits,  
> e.g. uid/gid, permissions, etc.  Also linkage to the directory tnode,  
> plus a pointer back to the vnode.  tnodes also have a name field that  
> contains the name that would correspond to the directory entry.  You  
> can use a pool for the tnode structures, or any other auxiliary  
> structures like this that provide the in-memory linkage.
> 
> => Directories are just tnodes that have a list of children.  I'm  
> hand-waving how one might handle hard links, here.  Exercise left to  
> the reader :-)
> 
> => File data -- just hang pages off the vnode.  You want to avoid  
> double-caching the data, so using an aobj would not be the best  
> idea.  Instead, maybe implement a tmpfs_getpages / tmpfs_putpages  
> that uses swap space allocated using some other method.
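
for concreteness, the tnode and directory items quoted above might look
roughly like the following.  this is only a userland sketch in plain C:
the struct layout, the field names (tn_*), the TNODE_NAMELEN limit and
the helpers tnode_attach/tnode_lookup are all made up for illustration,
and the vnode back pointer is left as an opaque void *.  like the quoted
proposal, it keeps one name per tnode and so does not handle hard links.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define TNODE_NAMELEN 255	/* assumed limit, not from the thread */

enum tnode_type { TNODE_FILE, TNODE_DIR };

/* hypothetical in-memory node; a real one would come from a pool */
struct tnode {
	enum tnode_type	 tn_type;
	uid_t		 tn_uid;	/* ownership and permissions */
	gid_t		 tn_gid;
	mode_t		 tn_mode;
	struct tnode	*tn_parent;	/* linkage to the directory tnode */
	void		*tn_vnode;	/* back pointer to the vnode (opaque) */
	char		 tn_name[TNODE_NAMELEN + 1]; /* directory entry name */
	struct tnode	*tn_children;	/* directories only: first child */
	struct tnode	*tn_sibling;	/* next entry in the parent's list */
};

/* link child into dir under the given name (name is truncated if too long) */
static void
tnode_attach(struct tnode *dir, struct tnode *child, const char *name)
{
	assert(dir->tn_type == TNODE_DIR);
	strncpy(child->tn_name, name, TNODE_NAMELEN);
	child->tn_name[TNODE_NAMELEN] = '\0';
	child->tn_parent = dir;
	child->tn_sibling = dir->tn_children;
	dir->tn_children = child;
}

/* linear search of a directory's child list; NULL if the name is absent */
static struct tnode *
tnode_lookup(struct tnode *dir, const char *name)
{
	struct tnode *tn;

	assert(dir->tn_type == TNODE_DIR);
	for (tn = dir->tn_children; tn != NULL; tn = tn->tn_sibling)
		if (strcmp(tn->tn_name, name) == 0)
			return tn;
	return NULL;
}
```

note that every tnode here stays resident for as long as the file exists,
whether or not anything is using it.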

one way to use the aobj code without double-caching would be to transfer
the pages between the aobj and the tmpfs vnode's object in the tmpfs
getpages/putpages methods.
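
the idea can be shown with a toy model.  none of this is real UVM code:
struct toy_object stands in for a uvm object, and toy_getpage/toy_putpage
are hypothetical stand-ins for the tmpfs getpages/putpages methods.  the
only point is that a page is moved between the two objects rather than
copied, so at any time it is held by exactly one of them.

```c
#include <assert.h>
#include <stddef.h>

#define OBJ_NPAGES 8		/* toy object size, arbitrary */

struct page { char data[4096]; };

/* toy stand-in for a uvm object: one slot per page index, NULL if absent */
struct toy_object {
	struct page *pages[OBJ_NPAGES];
};

struct toy_tmpfs_file {
	struct toy_object aobj;	/* anonymous-memory side (swap-backed) */
	struct toy_object vobj;	/* the tmpfs vnode's own object */
};

/* "getpages": move the page from the aobj into the vnode's object */
static struct page *
toy_getpage(struct toy_tmpfs_file *f, size_t idx)
{
	assert(idx < OBJ_NPAGES);
	if (f->vobj.pages[idx] == NULL) {
		f->vobj.pages[idx] = f->aobj.pages[idx];
		f->aobj.pages[idx] = NULL;
	}
	return f->vobj.pages[idx];	/* NULL means a hole in the file */
}

/* "putpages": hand the page back to the aobj, which can page it to swap */
static void
toy_putpage(struct toy_tmpfs_file *f, size_t idx)
{
	assert(idx < OBJ_NPAGES);
	if (f->vobj.pages[idx] != NULL) {
		assert(f->aobj.pages[idx] == NULL);	/* never in both */
		f->aobj.pages[idx] = f->vobj.pages[idx];
		f->vobj.pages[idx] = NULL;
	}
}
```

a real implementation would also have to deal with locking, busy pages and
pages that currently live in swap, all of which this model ignores.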


> => I would not bother trying to deal with paging out the directory /  
> linkage data structures to disk.  At least, not as a first step.

being able to page out all of the per-file metadata seems like an
important property of any file system; I would include that in the design
from the beginning.

-Chuck