
Subject: Re: swapfs filesystem design (and mount/umount question)
To: Simon Burge <simonb@NetBSD.ORG>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 03/20/2000 08:13:06

On Mon, Mar 20, 2000 at 03:29:37PM +1100, Simon Burge wrote:
> Chuck Silvers wrote:
> 
> > On Mon, Mar 20, 2000 at 12:18:03AM +1100, Simon Burge wrote:
> > > Folks,
> > > 
> > > Here's some rough notes on how I think a swapfs filesystem should be
> > > implemented from a layout POV.
> > > 
> > >  B- The filesystem is contained in one aobj.  This is split up into four
> > >     parts:
> > > 
> > >     1) a bitmap for each page that is used
> > >     2) a bitmap for each inode that is used
> > >     3) a page map for the pages that contain inodes (see C below).
> > >     4) the filesystem inodes and data
> > 
> > you might consider making each of these types of data a separate aobj.
> > if you put the inodes in an aobj (ie. one aobj contains all the inodes),
> > then you don't need the "page map" (if I understand what that's supposed
> > to be),
> 
> I've chopped and changed on multiple aobjs a couple of times (well, one
> for data and one for the maps - but one for each map makes more sense if
> I'm gonna split things up).  The maps themselves are relatively small
> (roughly 0.2% of the total filesystem size).  With my tinkering I've
> been using an 8192-page aobj, and the maps total just 17 pages.  Perhaps
> it would be simpler to just malloc the maps instead of setting up aobjs
> for them.  With say a gigabyte swapfs you'd then need just over two
> megabytes of wired kernel memory - is this a reasonable size to
> malloc for a filesystem?

that's probably reasonable, at least to start with.
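
for reference, the "just over two megabytes" figure can be checked by
scaling the quoted ratio (17 map pages per 8192 filesystem pages).  a
minimal sketch, assuming 4 KiB pages (the page size isn't stated in this
thread) and assuming the maps grow linearly with the filesystem size; the
function name swapfs_map_bytes_estimate() is made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the thread: an 8192-page aobj needed 17 pages of maps. */
#define SAMPLE_FS_PAGES		8192u
#define SAMPLE_MAP_PAGES	17u

/* Assumption: 4 KiB pages; the thread does not state the page size. */
#define ASSUMED_PAGE_SIZE	4096u

/*
 * Hypothetical helper: estimate how many bytes of map data a swapfs of
 * fs_bytes would need, assuming the maps scale linearly with filesystem
 * size.  The result is rounded up to whole pages.
 */
static uint64_t
swapfs_map_bytes_estimate(uint64_t fs_bytes)
{
	uint64_t fs_pages, map_pages;

	/* Keep the multiplication below from overflowing. */
	assert(fs_bytes <= UINT64_MAX / SAMPLE_MAP_PAGES);

	fs_pages = (fs_bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
	map_pages = (fs_pages * SAMPLE_MAP_PAGES + SAMPLE_FS_PAGES - 1) /
	    SAMPLE_FS_PAGES;
	return map_pages * ASSUMED_PAGE_SIZE;
}
```

under those assumptions a 1 GiB filesystem comes out at 544 map pages,
ie. 2.125 MiB of wired memory.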


> > and you don't need the "si_number" in the swapfs_inode.
> > I don't think you need the "si_dev" in each inode either, since that
> > should be the same for all inodes in the filesystem.
> 
> I think we'll still need the si_number and si_dev so that when given a
> pointer to a vnode we know where it comes from - remember that we need
> to keep info that's usually spread across both in-memory and on-disk
> structures on a "normal" filesystem in the one structure.  As I work out
> more about filesystems this may change - for example I might be able to
> fairly simply work out the inode number by the address of the inode.
> For a couple of bytes per inode it may end up easier to leave things as
> is.  Opinions from any filesystem gurus appreciated ;).

the si_dev can go in the equivalent of "struct fs", ie. the per-mount
filesystem-specific data.

as for si_number, that would go in the incore inode structure rather than
the "on-disk" version.  unless you're planning on having these be the same?
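
to make the split concrete, here's a rough sketch of the three structures.
only the names si_dev and si_number come from this thread; every other
struct, field and function name below is hypothetical, and a plain uint32_t
stands in for dev_t so the fragment compiles outside the kernel.  it also
shows the idea of working out the inode number from the address of the
inode, assuming the "on-disk" inodes sit in one contiguous array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "on-disk" inode: what would be stored in the inode aobj. */
struct swapfs_dinode {
	uint16_t	sd_mode;
	uint32_t	sd_nlink;
	uint64_t	sd_size;
};

/* Hypothetical per-mount data, the analogue of "struct fs". */
struct swapfs_mount {
	uint32_t		sm_dev;		/* was si_dev: same for all inodes */
	struct swapfs_dinode	*sm_inodes;	/* assumed contiguous inode array */
	uint32_t		sm_ninodes;	/* number of entries in sm_inodes */
};

/* Hypothetical in-core inode: one per active vnode. */
struct swapfs_inode {
	struct swapfs_mount	*si_mount;	/* back pointer to per-mount data */
	uint32_t		si_number;	/* inode number, kept in core only */
	struct swapfs_dinode	*si_din;	/* the "on-disk" part */
};

/* The device is reached through the mount, not stored in every inode. */
static uint32_t
swapfs_inode_dev(const struct swapfs_inode *ip)
{
	return ip->si_mount->sm_dev;
}

/* Derive an inode number from the address of its "on-disk" inode. */
static uint32_t
swapfs_dinode_number(const struct swapfs_mount *smp,
    const struct swapfs_dinode *dp)
{
	ptrdiff_t idx = dp - smp->sm_inodes;

	assert(idx >= 0 && (uint32_t)idx < smp->sm_ninodes);
	return (uint32_t)idx;
}
```

if the incore and "on-disk" inodes do end up being the same structure,
the lookup by address would make a stored si_number redundant.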

-Chuck