Subject: Re: Y2038, was as long as we're hitting FFS...
To: Ted Lemon <mellon@isc.org>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 03/25/1999 14:27:37
On Thu, 25 Mar 1999, Ted Lemon wrote:

> > What I think I'm still arguing is that: we were proposing a large-inode
> > variant of the current ffs implementation (more accurately of the ufs
> > implementation), NOT a whole new fs, and that we are providing a
> > framework for a lot of new stuff. As not everything can be decided now, we
> > left room for future growth.
> 
> Okay.   It seems like there are three classes of objections generally:
> 
> 1. The opaque data area is inherently application-specific because
>    there's no mechanism for sharing it, and also because it's arguably
>    too small to share.

And that it was not designed for what a lot of people want it to do:
be a kitchen-sink repository. It was intended mainly to store state for
a stacked fs, not to be a resource manager. Also, performance is a BIG
issue for what we want to do. We are looking at 7x24 systems with
terabytes of storage and thousands of users, probably acting as NFS
servers. The thought of handling variable-length data in this environment
concerns me.

I think it's fine to extend the interface a bit. Right now we have test,
get, set, and clear operations on the metadata. It seems easy to me to
extend them to take a magic number value, with (0) being the catch-all. So
then you can deal with different types of data, and even add an overlay fs
to store multiple types at once.
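
To make that concrete, here is a rough sketch of the four operations with
a magic number argument. Everything in it (the struct, the md_* names, the
64-byte size, the error codes) is made up for illustration and is not the
proposed interface; the only thing taken from above is that (0) matches
whatever is stored.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define OPAQUE_SIZE 64   /* hypothetical size of the opaque area */
#define MAGIC_ANY   0u   /* the catch-all value */

struct opaque_md {                 /* hypothetical in-core view */
    uint32_t magic;                /* tag of the data currently stored */
    int      valid;                /* nonzero when the area holds data */
    uint8_t  data[OPAQUE_SIZE];
};

/* Nonzero if the area holds data that the given magic may access. */
int md_test(const struct opaque_md *md, uint32_t magic)
{
    assert(md != NULL);
    return md->valid && (magic == MAGIC_ANY || md->magic == magic);
}

/* Copy the stored data out, or ENOENT if nothing matches. */
int md_get(const struct opaque_md *md, uint32_t magic, void *buf)
{
    if (!md_test(md, magic))
        return ENOENT;
    memcpy(buf, md->data, OPAQUE_SIZE);
    return 0;
}

/*
 * Store data under magic. Refuses to clobber another type's data;
 * a set through the catch-all overwrites and retags it as untyped.
 */
int md_set(struct opaque_md *md, uint32_t magic, const void *buf)
{
    assert(md != NULL);
    if (md->valid && !md_test(md, magic))
        return EEXIST;
    md->magic = magic;
    md->valid = 1;
    memcpy(md->data, buf, OPAQUE_SIZE);
    return 0;
}

/* Drop the stored data, or ENOENT if nothing matches. */
int md_clear(struct opaque_md *md, uint32_t magic)
{
    if (!md_test(md, magic))
        return ENOENT;
    memset(md, 0, sizeof(*md));
    return 0;
}
```

The data stays fixed-size, so the fast path is a compare and a copy; an
overlay fs wanting several types at once would sit on top of this.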

"Application" might not have been the best term to use. It's not
application as in a program, but application as in a file server of a
specific type.

> 2. The new API for accessing the opaque data area seems unnecessary.

Granted. Though we would like to keep things easy to export to other
systems. Does the fcntl call in general have the functionality we need
(pass a file descriptor, a command which encodes in/out access, and a void
*) on other/most platforms?
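
By "a command which encodes in/out access" I mean something along these
lines, loosely in the style of how BSD ioctl command words carry direction
and size. The XC_* names, field widths, and bit positions are assumptions
for the sake of the sketch, not an existing fcntl command layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command-word layout: 2 direction bits, 13 length bits,
 * 16 command-number bits. */
#define XC_OUT       0x40000000u  /* data flows kernel -> caller */
#define XC_IN        0x80000000u  /* data flows caller -> kernel */
#define XC_DIRMASK   (XC_IN | XC_OUT)
#define XC_LENMASK   0x1fffu      /* up to 8191 bytes behind the void * */
#define XC_LENSHIFT  16
#define XC_NUMMASK   0xffffu

/* Build a command word from direction, argument length, and number. */
uint32_t xc_make(uint32_t dir, size_t len, uint32_t num)
{
    assert((dir & ~XC_DIRMASK) == 0);
    assert(len <= XC_LENMASK && num <= XC_NUMMASK);
    return dir | ((uint32_t)len << XC_LENSHIFT) | num;
}

/* Decoders a generic handler could use before touching the void *. */
uint32_t xc_dir(uint32_t cmd) { return cmd & XC_DIRMASK; }
size_t   xc_len(uint32_t cmd) { return (cmd >> XC_LENSHIFT) & XC_LENMASK; }
uint32_t xc_num(uint32_t cmd) { return cmd & XC_NUMMASK; }
```

With the direction and length in the command itself, generic code can do
the copyin/copyout without knowing what the individual command means.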

> 3. There are other things we ought to throw into the inode while we're
>    at it.

That's fine. But since it'll take time to hash all these things out, we
decided to go with the flag field to say what is there. That way not
everything has to be decided at once.
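
As an illustration only, the flag field idea looks roughly like this. The
flag names, the fields, and their sizes are hypothetical; which extensions
actually exist is exactly the part still to be hashed out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LI_OPAQUE  0x00000001u  /* opaque data area in use (hypothetical) */
#define LI_TIME64  0x00000002u  /* 64-bit timestamp present (hypothetical) */
#define LI_KNOWN   (LI_OPAQUE | LI_TIME64)

struct large_inode_ext {        /* hypothetical extension area */
    uint32_t flags;             /* says which fields below are valid */
    int64_t  time64;
    uint8_t  opaque[64];
};

/* Nonzero if every bit in flag is set, i.e. that field may be trusted. */
int li_has(const struct large_inode_ext *ip, uint32_t flag)
{
    assert(ip != NULL);
    return (ip->flags & flag) == flag;
}

/* Bits set by a newer implementation that this code does not understand. */
uint32_t li_unknown(const struct large_inode_ext *ip)
{
    assert(ip != NULL);
    return ip->flags & ~LI_KNOWN;
}
```

New fields get a new bit later; old code sees an unknown bit and can
decide to ignore the field or refuse to touch the inode.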

> 4. Maybe we ought to store the filesystem metadata in network byte
>    order so as to enhance plug-and-play.

My vote here is to just make FFS_EI always on instead of an option. :-) It
already has code everywhere the swapping would need to be. :-) Also, it
avoids the argument as to which byte order to choose. :-)
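
For anyone who hasn't looked at that style of code, the idea is swap on
access, sketched below. The function names are invented for this example
and I'm assuming the swap decision is made once per mounted filesystem;
this is not a copy of the actual FFS_EI code.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Read or write a 32-bit on-disk field. needswap is 1 when the
 * filesystem was written in the other byte order, 0 otherwise.
 */
uint32_t disk_rw32(uint32_t v, int needswap)
{
    assert(needswap == 0 || needswap == 1);
    return needswap ? swap32(v) : v;
}
```

A native-order filesystem pays only a test per field, and neither byte
order has to be declared the official one.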

> I proposed a couple of ideas to deal with (1) in a different message,
> as has Julian.   (Of course, I like *my* ideas better... :')   It
> sounded like you'd come up with a way to resolve (2) with fcntl -
> true?

(about fcntl) Probably. Portability would be the only concern, and it's
secondary.

About (1), most of the proposals have solved a different problem than the
one we have in mind. Actually, given that the vnextops or fcntl command
space would have at least 16k worth of generic commands, we could
certainly later on add commands which deal with tag/length data. :-)
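
Such a later command might walk records like this. The layout (one byte
of tag, one byte of length, then the payload) and the function name are
purely assumptions for the sketch; nothing like this is in the current
proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Find the first record with the given tag in buf.
 * Assumed record layout: 1-byte tag, 1-byte length, `length` payload bytes.
 * Returns a pointer to the payload and stores its length in *lenp,
 * or returns NULL if the tag is absent or a record is truncated.
 */
const uint8_t *tlv_find(const uint8_t *buf, size_t buflen,
                        uint8_t tag, size_t *lenp)
{
    size_t off = 0;

    assert(buf != NULL && lenp != NULL);
    while (off + 2 <= buflen) {
        uint8_t t = buf[off];
        size_t len = buf[off + 1];

        if (len > buflen - off - 2)
            return NULL;            /* record runs past the buffer */
        if (t == tag) {
            *lenp = len;
            return buf + off + 2;
        }
        off += 2 + len;             /* skip header and payload */
    }
    return NULL;
}
```

Note the linear walk and the bounds checks on every record; that per-access
cost is what worries me on a busy server.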

> Some ideas have been floated for (3) as well.   Unfortunately, (4) is
> more of a judgement call than anything else, and therefore looks like
> something that either comes down to a vote or a fiat.

Take care,

Bill