tech-kern: Re: `Large Inodes'

Subject: Re: `Large Inodes'
To: Chris G. Demetriou <cgd@netbsd.org>
From: Roger Brooks <R.S.Brooks@liverpool.ac.uk>
List: tech-kern
Date: 03/26/1999 23:20:05

On 26 Mar 1999, Chris G. Demetriou wrote:

>Have you demonstrated a performance problem with trying to keep it
>elsewhere?  (I mean, has your team which is doing the work tried it
>other ways, e.g. keeping a separate "ifile"-like entity for example,
>and shown that they result in unreasonable performance?)
>
>"If not, why are you optimizing it?!"
>
>
>Naively, in thinking about the problems inherent in handling large
>data sets/files and data migration to/from tape, i'd think that an
>extra local file lookup ... would not be your performance bottleneck.

No, I'm afraid this isn't so.  The point about a filestore with automatic
data migration is that within any directory, some files are in the
upper layer while others are in the lower layer (in this case on tape).
But when you type 'ls -l' you see all the information about the files in
both layers (without having to drag anything off the tape).  Only when you
try to read (or modify) a file which has migrated to tape do you need to
retrieve it.  This really means that all the inodes have to live in the
upper layer all the time.
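
To make that concrete, here is a toy model of the behaviour I mean. It
is only a sketch: the class and field names are invented for
illustration and have nothing to do with any real implementation. The
point is that stat-like queries are answered from the always-resident
inode, and only a read of a migrated file goes to tape.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    size: int
    mode: int
    online: bool  # True if the data blocks are in the upper layer


class MigratingStore:
    """Toy two-layer filestore; all names here are hypothetical."""

    def __init__(self):
        self.inodes = {}   # name -> Inode, always kept in the upper layer
        self.disk = {}     # name -> bytes, data of online files
        self.tape = {}     # name -> bytes, data of migrated files
        self.recalls = 0   # how many times we had to go to tape

    def create(self, name, data, mode=0o644):
        self.inodes[name] = Inode(len(data), mode, True)
        self.disk[name] = data

    def migrate(self, name):
        # Move the data to the lower layer; the inode stays where it is.
        self.tape[name] = self.disk.pop(name)
        self.inodes[name].online = False

    def stat(self, name):
        # Answered entirely from the upper layer, like 'ls -l'.
        return self.inodes[name]

    def read(self, name):
        # Only reading the data forces a retrieval from tape.
        if not self.inodes[name].online:
            self.disk[name] = self.tape[name]
            self.inodes[name].online = True
            self.recalls += 1
        return self.disk[name]
```

After create() and migrate(), stat() still returns the size and mode
with recalls at zero; the first read() bumps recalls to one.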

The first computer I worked on (about 22 years ago) was an ICL 1906S
running GEORGE 4.  This had a data migration filestore which was very
advanced for its time, considering that it used 1/2" tapes which had
to be loaded by operators.  The tape layer was also the filesystem
backup, which consisted of rolling increments (with all the tapes
duplicated).  Once a file had been backed up in an increment it became
a candidate to be thrown offline (depending on size and free space).
But AFAIR, with GEORGE you could even do the equivalent of a chmod on
an offline file without retrieving it from tape because the equivalent
of the inode was always on disk.

I presume the design which Jason and Bill are working on is similar,
and the problem is: if you don't put the tape address of the data for
offline files in the upper layer inode, where do you put it?  If the
lower layer lives totally on tape, you can't put it there.  I suppose
you could have a special file in the root directory of the FFS which
contains the tape addresses of offline files, indexed by inode.  The
disk quota file is a kind of a precedent for this sort of thing.
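
A minimal sketch of such a side file, assuming fixed-size records at
offset (inode number * record size). The record layout (volume, block,
length) and the function names are my own invention for illustration;
nothing here is taken from the actual design. Python is used only to
keep the example short.

```python
import os
import struct
import tempfile

# Hypothetical record: tape volume id (u32), block offset on the tape (u64),
# length in bytes (u64).  A volume id of 0 means "no tape copy recorded".
RECORD = struct.Struct("<IQQ")


def put_tape_addr(fd, ino, volume, block, length):
    """Store the tape address for inode number `ino`."""
    os.pwrite(fd, RECORD.pack(volume, block, length), ino * RECORD.size)


def get_tape_addr(fd, ino):
    """Return (volume, block, length) for `ino`, or None if none is recorded."""
    raw = os.pread(fd, RECORD.size, ino * RECORD.size)
    if len(raw) < RECORD.size:
        return None  # past the end of the file: no entry was ever written
    volume, block, length = RECORD.unpack(raw)
    return None if volume == 0 else (volume, block, length)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "tapeaddr")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    put_tape_addr(fd, 42, volume=7, block=1000, length=4096)
    print(get_tape_addr(fd, 42))
    print(get_tape_addr(fd, 41))  # a hole in the file reads back as zeros
    os.close(fd)
```

The cost is the one Chris mentions: one extra local read per offline
file, on top of whatever the inode lookup already costs.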

I can see one possible snag with the opaque data in the inode.  How
many copies of each file do you keep on tape?  If there's only one,
do you have a separate backup system?  But if you have a system like
GEORGE, where the tape library is also the backup, you may need to
keep track of the locations of many on-tape copies of the same file.
AFAIR, GEORGE's rolling increments backed up anything which was
online and hadn't been backed up within some time window.  This meant
there was often a choice of tapes from which to retrieve the file.
A background tape-to-tape job compacted old dump tapes, discarding
the data belonging to files which had been deleted or modified.
Storing the tape data addresses in the opaque data area of the inode
does seem to limit the possibility of having a variable number of
copies of the data on tape.
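
To illustrate the limit, here is a sketch which packs a list of tape
addresses into a fixed-size area.  The 64-byte size, the one-byte
count and the 20-byte address record are all assumptions picked for
the example; I don't know the real size of the opaque area.  With
those numbers there is room for three copies and no more.

```python
import struct

OPAQUE_BYTES = 64             # hypothetical size of the opaque inode area
COUNT = struct.Struct("<B")   # one byte holding the number of copies
ADDR = struct.Struct("<IQQ")  # hypothetical address: volume, block, length
MAX_COPIES = (OPAQUE_BYTES - COUNT.size) // ADDR.size  # 3 with these sizes


def pack_copies(copies):
    """Pack a list of (volume, block, length) tuples into the fixed area."""
    if len(copies) > MAX_COPIES:
        raise ValueError(
            f"{len(copies)} tape copies will not fit; limit is {MAX_COPIES}"
        )
    body = COUNT.pack(len(copies)) + b"".join(ADDR.pack(*c) for c in copies)
    return body.ljust(OPAQUE_BYTES, b"\0")  # pad unused space with zeros


def unpack_copies(opaque):
    """Recover the list of (volume, block, length) tuples."""
    (n,) = COUNT.unpack_from(opaque, 0)
    return [
        ADDR.unpack_from(opaque, COUNT.size + i * ADDR.size) for i in range(n)
    ]
```

A fourth copy raises ValueError here.  A side file with
variable-length entries would not have this ceiling, at the price of
the extra lookup.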

Interesting question: when the data migrates to/from tape, should
the ctime change?  After all, you haven't changed any of the other
fields in the data you would get back from stat(2).

Roger

------------------------------------------------------------------------------
Roger Brooks (Systems Programmer),          |  Email: R.S.Brooks@liv.ac.uk
Computing Services Dept,                    |  Tel:   +44 151 794 4441
The University of Liverpool,                |  Fax:   +44 151 794 4442
PO Box 147, Liverpool L69 3BX, UK           | 
------------------------------------------------------------------------------