tech-kern archive


Re: fs-independent quotas

On Wed, Oct 19, 2011 at 09:22:02PM +0200, Manuel Bouyer wrote:
 > > So, a few months back we got a new improved quota format for FFS.
 > > Unfortunately, one of the side effects of this was to sprinkle
 > > specific knowledge of the new format through all the userlevel quota
 > > tools and quota support logic. To be fair, this was alongside the
 > > existing specific knowledge of the old quota format; nonetheless, it's
 > > messy and unscalable.
 > Of course there have been changes to the tools, as there's a new format.

The tools ought to be format-independent.

 > > We may want to add more quota formats (e.g. the different and
 > > incompatible new quota format FreeBSD added last year) or add quota
 > > support to other filesystems (tmpfs, perhaps v7fs) or even add other
 > > filesystems that have or may have their own native quota handling
 > > (zfs, Hammer, you name it). Also, my planned lfs-renovation is
 > > currently hung up on the VFS-level quota interface, because I don't
 > > want to rip out the existing maybe-partial support for quotas but
 > > can't plug new code into the existing framework.
 > You'll have to explain this. lfs is some variant of ffs, I see no reason
 > why it couldn't use the new format.

It could use whatever format it wants. To the extent it currently
supports quotas, I think it's limited to the old-style quotas, that
is, quota1. But there's no way to plug it in without taking the
fs-dependent code currently in all the tools and access pathway and
making a third or perhaps a third and fourth copy of all the logic.

Likewise, if I were to go add quota support to v7fs, or try to hook up
whatever quota support zfs has, or commit Hammer and try to get
whatever quota support *it* has working, or add ext2 quota support, or
write a new fs with quota support, or whatever, I'd have to make still
more copies of the logic to cope with all the different formats and

This is not a good idea, not scalable, and not sensible, especially
when a filesystem-independent (read "format-independent" if you like)
interface is both perfectly possible and simpler.

 > In fact the new format is fs-independent.

Yes, in the sense that one could add the format to other file systems;
but no, in the sense that other file systems already have their own
quota formats and we need to be able to interoperate.

 > But this is just what the current proplib format is! A set of tables
 > with key/value pairs!

That's great, that'll make the changes I need to make that much
easier. But it doesn't seem particularly familiar relative to the code
I've been working on.

 > >    - the quota key is:
 > >         the quota *class*
 > >         the id
 > Don't forget we now have a new id: "default"

Yes, there's a reserved value for it.

 > >         the quota *type*
 > > 
 > >    - the quota value is:
 > >         the configured hard limit
 > >         the configured soft limit
 > >         the configured grace period
 > >         the current usage
 > >         the current grace expiry time (if any)
 > This is exactly the format described in quotactl(2).

No, what's described in quotactl(2) is something about commands and
arguments... and while there is a substructure that looks something
like this, the fact remains that it's a *sub*structure and the schema
is not tabular.

 > > The quota *class* is the thing the quota is imposed on; this is
 > > currently either "user" or "group". There is no likely prospect of
 > > additional quota classes appearing.
 > I don't think we should limit ourselves to these classes. I could see
 > per-host or per-hostgroup quotas for networked filesystems for example.

I'm not limiting it to anything, but I'll believe in more quota
classes when I see them. Per-host quotas (even if they make sense,
which I question) aren't going to work very well with a 32-bit id, for

Whereas, as I pointed out before, there are filesystems in the field
with more than two quota types.

 > >    class  id    type    hard    soft    usage   grace   expire
 > >    ------------------------------------------------------------
 > >    user   101   block   11000   10000   5072    7d      -
 > >    user   101   file    1100    1000    280     7d      -
 > >    group  100   block   22000   20000   11543   14d     -
 > >    group  100   file    2200    2000    5072    14d     -
 > > 
 > > In the traditional quota implementation these four rows in the
 > > (filesystem-independent) logical schema fit into two struct dqblks,
 > > one holding the group data and one holding the user data. I believe
 > > the quota2 physical representation is similar.
 > the on-disk format has these fields, yes. But it's not a table, it's
 > a linked list.

It's a table implemented as a linked list. That's not an important
distinction from a schema perspective. Plus, in the traditional
implementation (quota1) which I was talking about, it's a sparse
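
To make the row-to-dqblk correspondence from the quoted text concrete,
here is a sketch of splitting one traditional per-id record into its
block row and its file row. The struct below is a local stand-in whose
field names follow the historical BSD struct dqblk; quota_row and
dqblk_to_rows are hypothetical. The grace period is left out of the row
on the assumption that the traditional format keeps it outside the
per-id record.

```c
#include <stdint.h>

/* Local stand-in for the traditional per-id record, for illustration only. */
struct old_dqblk {
	uint32_t dqb_bhardlimit, dqb_bsoftlimit, dqb_curblocks;
	uint32_t dqb_ihardlimit, dqb_isoftlimit, dqb_curinodes;
	int32_t  dqb_btime, dqb_itime;
};

enum row_type { ROW_BLOCK, ROW_FILE };

/* Hypothetical logical row; class and id come from where the record was found. */
struct quota_row {
	enum row_type type;
	uint64_t hard, soft, usage;
	int64_t  expire;
};

/* Split one per-id record into its block row and its file row. */
static void
dqblk_to_rows(const struct old_dqblk *dq, struct quota_row rows[2])
{
	rows[0] = (struct quota_row){
		.type = ROW_BLOCK,
		.hard = dq->dqb_bhardlimit,
		.soft = dq->dqb_bsoftlimit,
		.usage = dq->dqb_curblocks,
		.expire = dq->dqb_btime,
	};
	rows[1] = (struct quota_row){
		.type = ROW_FILE,
		.hard = dq->dqb_ihardlimit,
		.soft = dq->dqb_isoftlimit,
		.usage = dq->dqb_curinodes,
		.expire = dq->dqb_itime,
	};
}
```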

 > > The current structural plan is for this logical schema to be exposed
 > > by each file system at the VFS layer. That is, each filesystem will be
 > > responsible for translating between its internal, on-disk format and
 > > the filesystem-independent logical schema. The VFS- and syscall-level
 > > kernel code should not need to do much of anything but hand off to the
 > > filesystem; this is more or less how things currently are and always
 > > have been.
 > This is what we have now: the logical schema is a proplib-based table;
 > and each filesystem translates it to its own format.
 > We can provide some helper functions to assist with the transforms,
 > this is what I started to do in quota2_subr.c. It looks ffs-specific but
 > is really close to what you're proposing here.

All the current code that I've seen in the userlevel tools uses
ffs-specific data structures, either the new ones or the old ones
depending on which format is in use. Describing that as really close
to what I'm proposing is a pretty big stretch.

 > > The userlevel quota library is going to be completely rewritten to
 > > provide a key/value access API to the logical schema described above.
 > > This will be converted to quotactl calls to the kernel... and also
 > > some other actions, such as contacting rquotad on NFS servers. There
 > > are also some cases with the old-style quotas where the tools access
 > > the quota files directly; some of these cases may go away, but I'm not
 > > sure they all can.
 > They can't if you want to keep some level of backward-compat.

I'm still not sure of that.

 > > This logic and the FS-specific knowledge it
 > > requires can and should be contained inside libquota.
 > No, I don't think it has its place in libquota. libquota should only use
 > the fs-independent interface.
 > Right now, the places where the quota files are directly accessed are:
 > - repquota, mostly as a way to convert from quota1 to quota2 (it exports
 >   the content of the quota file to a plist that can be fed to quotactl).
 > - quotacheck (quota1-specific tool anyway).
 > this code has really no place in libquota.

No. The userlevel tools, including repquota, should be able to read
and write quota information using a uniform filesystem-independent
interface. To the extent that special per-filesystem logic is needed
above the kernel, it should be encapsulated inside libquota and not
spread around everywhere indiscriminately.

Also, you're wrong about what does this. I'm right now looking at code
in edquota that opens qup->qfname and writes in it.

 > > I'm also going to crib from FreeBSD's quota library and add libquota
 > > calls for things like turning quotas on and off. This should make the
 > > userlevel tools simpler, and should make life easier for any
 > > third-party tools that want to manipulate quotas. (There aren't many,
 > > but a few do exist.) Unfortunately, direct compat with FreeBSD's quota
 > > library isn't feasible as theirs is not FS-independent.
 > I don't think there should be userland calls to turn quota on or off.
 > We have it for quota1, but really nothing else should use it.
 > When you turn quota on from userland, you have to also provide the current
 > usage (like quotacheck does), and this is a FS-specific tool.

As I explained, the filesystem-independent semantics for
quotaon/quotaoff are only that quota enforcement is enabled or
disabled. This is a useful thing to be able to do. We could get rid of
it; but I see no reason to.

 > > I expect the following tools to become FS-independent:
 > > 
 > >    quota(1)
 > >    quot(8)
 > >    edquota(8)
 > they already are.

Not at all. Believe me, I've been hacking on edquota all day.

 > >    quotaon(8)
 > This one is there for quota1, but really there should be no such
 > tool any more.  quota management should be handled internally by
 > the filesystem at mount time.

See above.

 > >    repquota(8)
 > >    rpc.rquotad(8)
 > they already are fs-independent.

Not at all.

 > > I'm also intending to add quotadump(8) and quotarestore(8) tools to
 > > allow backing up quota settings easily. With the traditional quota
 > > system you can just back up the quota files (and since they're exposed
 > > in the filesystem, this happens by default unless you explicitly
 > > exclude them) but with in-FS quotas that no longer works and a
 > > dump/restore method is needed. I think quotadump and quotarestore will
 > > probably end up as hard links to edquota, but that's not entirely
 > > clear yet.
 > We already have this: quotactl(8).

...which seems to work using some kind of xml-based procedure call
interface, which isn't what a sysadmin wants to deal with when they're
trying to run a backup or migrate to new disks.

The intended interface is something like

   quotadump /home > /tmp/home.quotas
   quotarestore /home /tmp/home.quotas

 > > I'm going to remove the current quotactl(8) as it seems to be entirely
 > > specific to the current proplib-based interface.
 > One thing I had in mind with the proplib-based interface is to have an easy
 > way to deal with quota from scripts. What do you propose to replace the
 > proplib-based interface?

What sorts of actions from scripts are you thinking of? For backups,
that's what quotadump and quotarestore are for. For most other usages,
including stuff like mass-editing 10,000 student quotas at the start of
a semester or whatnot, edquota serves nicely.

 > > Note that quotacheck(8) is specific to the old-style FFS quotas and is
 > > not FS-independent; this will not (and cannot) change.
 > > 
 > > One remaining thing: I'm intending to systematize the current mess of
 > > quotas enabled/disabled/on/off/vanilla/chocolate/strawberry as
 > > follows:
 > > 
 > > 1. A file system type can have or not have support for quotas. If
 > > there is no support for quotas, nothing else works.
 > > 
 > > 2. Any given filesystem volume may have or not have quota data on it.
 > > This is the filesystem code's problem and irrelevant to the
 > > FS-independent logic.
 > > 
 > > 3. Any given filesystem volume may be mounted with or without quotas
 > > enabled. If quotas are not enabled, quota information is not available
 > > and the quota utilities will not be able to do anything.
 > > 
 > > 4. Once mounted, quotas can be either on or off. As far as the
 > > FS-independent code is concerned, quotas being off means only that
 > > they aren't enforced; that is, with quotas off operations that
 > > increase usage do not fail with EDQUOT. When quotas are off, quota
 > > information can still be inspected or updated.
 > What is the purpose of this ?

Which part of it? If you mean on/off, see above. If you mean one of
the other distinctions, I think they're more or less self-explanatory.
If you want to know the purpose of drawing these distinctions
carefully at all, it's because currently the semantics are unclear and
poorly documented.
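
A minimal sketch of the enabled/on distinction in points 3 and 4,
assuming a single hard limit and ignoring soft limits and grace periods.
The struct and function names are hypothetical; only EDQUOT comes from
the text.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-mount state: "enabled" is point 3, "enforcing" is point 4. */
struct quota_state {
	bool enabled;     /* mounted with quotas: information is available */
	bool enforcing;   /* quotas "on": limits are actually applied */
};

struct quota_entry {
	uint64_t hard;
	uint64_t usage;
};

/*
 * Charge n units against an entry. Returns 0 on success or EDQUOT.
 * Usage is tracked whenever quotas are enabled; the limit only applies
 * while enforcement is on.
 */
static int
quota_charge(const struct quota_state *st, struct quota_entry *e, uint64_t n)
{
	if (!st->enabled)
		return 0;                 /* no quota information at all */
	if (st->enforcing && e->usage + n > e->hard)
		return EDQUOT;            /* over the limit and enforcing */
	e->usage += n;                    /* still accounted when "off" */
	return 0;
}
```

With enforcement off the same request succeeds and the usage figure
keeps moving, which is what lets the tools inspect or update quota
information in that state.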

 > > I am not intending to change the specific semantics that turning
 > > quotas on has in the traditional quota system. Those semantics are
 > > required for quotacheck to be able to do its thing properly. However,
 > > knowledge of this behavior should be limited to the code in FFS (and
 > > probably some in libquota) that needs to know the gory details.
 > > 
 > > Currently there are, as far as I can tell, multiple ways to enable
 > > quotas for a filesystem in /etc/fstab, and the quota utilities check
 > > fstab in various (and I think not always consistent) ways to try to
 > > figure out what's going on. My intent is to nuke all that: only mount
 > > should care what's in /etc/fstab, because otherwise the tools won't
 > > work properly on temporary mounts. The quota library (and thus the
 > > tools) should detect whether a mounted filesystem has quotas enabled
 > > by calling quotactl; if quotactl fails, quotas are not enabled. (In
 > > the long run there should be a FS-independent mount flag to indicate
 > > this; however, I'm not sure we're ready for that just yet.)
 > No, in the new world only quotacheck and quotaon check the fstab
 > to know where quotas should be checked/enabled and where the quota
 > file is. These are quota1-specific. I think they should be left as-is
 > until quota1 support is removed.

quota1 support isn't going to be removed.

Anyhow, as I wrote above, the knowledge of whether quotas exist should
be maintained and provided by the kernel, so it works reliably and
with mounts that aren't listed in fstab. All file systems that support
quotas can and should do this.

Also, you're once again wrong about what's using this logic. In
addition to quotacheck and quotaon, quota, edquota, and repquota are
all checking fstab.

 > >    #define QUOTA_DEFAULTID ((id_t)-1)
 > -1 can also be a uid or gid, can't it?

No, -1 is not a valid uid or gid. See for example setreuid(2).

 > What interface do you plan between kernel and userland ? keep the
 > proplib-based interface ?

An encoded form of the API I already described, with get/put/delete
and cursors.
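
For a sense of what get/put/delete plus cursors could look like, here is
a toy in-memory version. It is a sketch only: the qt_* names, the
fixed-size table, and the -1 error convention are all assumptions made
for illustration, and say nothing about the encoding actually used
between kernel and userland.

```c
#include <stddef.h>
#include <stdint.h>

#define QT_MAXROWS 16

struct qkey { int cls; uint32_t id; int type; };
struct qval { uint64_t hard, soft, usage; };
struct qrow { struct qkey k; struct qval v; int used; };
struct qtable { struct qrow rows[QT_MAXROWS]; };   /* zero-initialize before use */
struct qcursor { size_t next; };                   /* position of the next row to visit */

static struct qrow *
qt_find(struct qtable *t, const struct qkey *k)
{
	for (size_t i = 0; i < QT_MAXROWS; i++) {
		struct qrow *r = &t->rows[i];
		if (r->used && r->k.cls == k->cls &&
		    r->k.id == k->id && r->k.type == k->type)
			return r;
	}
	return NULL;
}

/* Fetch the value for a key; -1 if there is no such row. */
static int
qt_get(struct qtable *t, const struct qkey *k, struct qval *out)
{
	struct qrow *r = qt_find(t, k);
	if (r == NULL)
		return -1;
	*out = r->v;
	return 0;
}

/* Insert or overwrite a row; -1 if the table is full. */
static int
qt_put(struct qtable *t, const struct qkey *k, const struct qval *v)
{
	struct qrow *r = qt_find(t, k);
	for (size_t i = 0; r == NULL && i < QT_MAXROWS; i++)
		if (!t->rows[i].used)
			r = &t->rows[i];
	if (r == NULL)
		return -1;
	r->k = *k;
	r->v = *v;
	r->used = 1;
	return 0;
}

/* Remove a row; -1 if there is no such row. */
static int
qt_delete(struct qtable *t, const struct qkey *k)
{
	struct qrow *r = qt_find(t, k);
	if (r == NULL)
		return -1;
	r->used = 0;
	return 0;
}

/* Advance the cursor to the next occupied row; -1 at end of table. */
static int
qt_cursor_next(const struct qtable *t, struct qcursor *c,
    struct qkey *k, struct qval *v)
{
	while (c->next < QT_MAXROWS) {
		const struct qrow *r = &t->rows[c->next++];
		if (r->used) {
			*k = r->k;
			*v = r->v;
			return 0;
		}
	}
	return -1;
}
```

A tool like repquota would then be a cursor loop over the table rather
than code that knows how any particular filesystem lays out its quota
data.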

 > All of what you propose can be fully implemented with the
 > current proplib interface and its schema, so it looks like you're proposing
 > to rework libquota.

No, because (among other things) the schema I'm implementing is not
the same. The proplib schema is hierarchical, for example, rather than
being normalized; also it doesn't support cursors.

David A. Holland
