tech-kern archive

Re: fs-independent quotas

On Wed, Oct 19, 2011 at 10:20:23PM +0000, David Holland wrote:
> On Wed, Oct 19, 2011 at 09:22:02PM +0200, Manuel Bouyer wrote:
>  > > So, a few months back we got a new improved quota format for FFS.
>  > > Unfortunately, one of the side effects of this was to sprinkle
>  > > specific knowledge of the new format through all the userlevel quota
>  > > tools and quota support logic. To be fair, this was alongside the
>  > > existing specific knowledge of the old quota format; nonetheless, it's
>  > > messy and unscalable.
>  > 
>  > of course there's been changes to the tools, as there's a new format.
> The tools ought to be format-independent.

I can't parse this; can you explain? The tools need to be aware of the
format to do something useful with the data, don't they?

>  > > We may want to add more quota formats (e.g. the different and
>  > > incompatible new quota format FreeBSD added last year) or add quota
>  > > support to other filesystems (tmpfs, perhaps v7fs) or even add other
>  > > filesystems that have or may have their own native quota handling
>  > > (zfs, Hammer, you name it). Also, my planned lfs-renovation is
>  > > currently hung up on the VFS-level quota interface, because I don't
>  > > want to rip out the existing maybe-partial support for quotas but
>  > > can't plug new code into the existing framework.
>  > 
>  > You'll have to explain this. lfs is some variant of ffs, I see no reason
>  > why it couldn't use the new format.
> It could use whatever format it wants. To the extent it currently
> supports quotas, I think it's limited to the old-style quotas, that
> is, quota1. But there's no way to plug it in without taking the
> fs-dependent code currently in all the tools and access pathway and
> making a third or perhaps a third and fourth copy of all the logic.

That's plain wrong. If it's quota1, you can use the quota1 code in
sys/ufs/ufs (just as it would have done before quota2).

> Likewise, if I were to go add quota support to v7fs, or try to hook up
> whatever quota support zfs has, or commit Hammer and try to get
> whatever quota support *it* has working, or add ext2 quota support, or
> write a new fs with quota support, or whatever, I'd have to make still
> more copies of the logic to cope with all the different formats and
> layouts.

Of course, if you have a new on-disk format you need to do some conversion,
whatever "filesystem-independent" format you use.
But I think you could still reuse sys/ufs/ufs/quota2_subr.c to do the
conversion from plist to some binary representation.

> This is not a good idea, not scalable, and not sensible, especially
> when a filesystem-independent (read "format-independent" if you like)
> interface is both perfectly possible and simpler.

I strongly believe the plist representation is format-independent.
It carries exactly the same information as what you propose.

>  > in fact the new format is fs-independent.
> Yes, in the sense that one could add the format to other file systems;
> but no, in the sense that other file systems already have their own
> quota formats and we need to be able to interoperate.

You have to do some conversion, of the same level as with what you

>  > But this is just what the current proplib format is! A set of tables
>  > with key/value pairs!
> That's great, that'll make the changes I need to make that much
> easier. But it doesn't seem particularly familiar relative to the code
> I've been working on.

Or maybe you don't need to change it at all.

>  > >         the quota *type*
>  > > 
>  > >    - the quota value is:
>  > >         the configured hard limit
>  > >         the configured soft limit
>  > >         the configured grace period
>  > >         the current usage
>  > >         the current grace expiry time (if any)
>  > 
>  > This is exactly the format described in quotactl(2).
> No, what's described in quotactl(2) is something about commands and
> arguments... and while there is a substructure that looks something
> like this, the fact remains that it's a *sub*structure

Yes, but you still need a way to pass commands. You didn't talk about this.

> and the schema
> is not tabular.

I don't understand what you mean here. There's a set of values associated
with an id; I can't see the difference from what you're proposing.

>  > > The quota *class* is the thing the quota is imposed on; this is
>  > > currently either "user" or "group". There is no likely prospect of
>  > > additional quota classes appearing.
>  > 
>  > I don't think we should limit ourselves to these classes. I could see
>  > per-host or per-hostgroup quotas for networked filesystems for example.
> I'm not limiting it to anything, but I'll believe in more quota
> classes when I see them. Per-host quotas (even if they make sense,
> which I question) aren't going to work very well with a 32-bit id, for
> example.

right, that's where a plist is a win.

> Whereas, as I pointed out before, there are filesystems in the field
> with more than two quota types.

The current format has no limitations in this area.

>  > >    class  id    type    hard    soft    usage   grace   expire
>  > >    ------------------------------------------------------------
>  > >    user   101   block   11000   10000   5072    7d      -
>  > >    user   101   file    1100    1000    280     7d      -
>  > >    gid    100   block   22000   20000   11543   14d     -
>  > >    gid    100   file    2200    2000    5072    14d     -
>  > > 
>  > > In the traditional quota implementation these four rows in the
>  > > (filesystem-independent) logical schema fit into two struct dqblks,
>  > > one holding the group data and one holding the user data. I believe
>  > > the quota2 physical representation is similar.
>  > 
>  > the on-disk format has these fields, yes. But it's not a table, it's
>  > a linked list.
> It's a table implemented as a linked list. That's not an important
> distinction from a schema perspective. Plus, in the traditional
> implementation (quota1) which I was talking about, it's a sparse
> array.

A table can also be represented with a plist (and I think it already is),
so again I don't see the need for the new format.
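
To make the schema under discussion concrete, here is a minimal sketch of
the four example rows quoted above as plain C data with a keyed lookup.
All type and function names are hypothetical (they are not the quota2 or
proplib definitions); the only point is that each row is addressed by a
(class, id, type) key, whatever encoding carries it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum qclass { QC_USER, QC_GROUP };
enum qtype  { QT_BLOCK, QT_FILE };

struct qrow {
	enum qclass cls;        /* what the quota is imposed on */
	uint32_t    id;         /* uid or gid */
	enum qtype  type;       /* what is being limited */
	uint64_t    hard;       /* configured hard limit */
	uint64_t    soft;       /* configured soft limit */
	uint64_t    usage;      /* current usage */
	uint32_t    grace_days; /* configured grace period, in days */
};

/* The four example rows from the table quoted above. */
static const struct qrow qtable[] = {
	{ QC_USER,  101, QT_BLOCK, 11000, 10000,  5072,  7 },
	{ QC_USER,  101, QT_FILE,   1100,  1000,   280,  7 },
	{ QC_GROUP, 100, QT_BLOCK, 22000, 20000, 11543, 14 },
	{ QC_GROUP, 100, QT_FILE,   2200,  2000,  5072, 14 },
};

/* Look up one row by its (class, id, type) key; NULL if absent. */
static const struct qrow *
qtable_get(enum qclass cls, uint32_t id, enum qtype type)
{
	size_t i;

	for (i = 0; i < sizeof(qtable) / sizeof(qtable[0]); i++) {
		const struct qrow *r = &qtable[i];

		if (r->cls == cls && r->id == id && r->type == type)
			return r;
	}
	return NULL;
}
```

A hierarchical encoding would simply nest the same fields under the class
and id instead of repeating them in every row.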

>  > > by each file system at the VFS layer. That is, each filesystem will be
>  > > responsible for translating between its internal, on-disk format and
>  > > the filesystem-independent logical schema. The VFS- and syscall-level
>  > > kernel code should not need to do much of anything but hand off to the
>  > > filesystem; this is more or less how things currently are and always
>  > > have been.
>  > 
>  > This is what we have now: the logical schema is a proplib-based table,
>  > and each filesystem translates it to its own format.
>  > We can provide some helper functions to assist with the transforms,
>  > this is what I started to do in quota2_subr.c. It looks ffs-specific but
>  > is really close to what you're proposing here.
> All the current code that I've seen in the userlevel tools uses
> ffs-specific data structures, either the new ones or the old ones
> depending on which format is in use. Describing that as really close
> to what I'm proposing is a pretty big stretch.

You probably didn't look closely. Yes, the userland code does a plist to
binary conversion to a structure which is identical to the quota2 structure,
but that doesn't make it ffs-specific.
You could change its name or layout if you want, it wouldn't affect
the on-disk format. Also, it's defined in common/include/quota/quotaprop.h,
not some ffs-specific header file.

Now, there are userland tools that have to deal with the quota1 on-disk
format directly (some of these tools, such as quotacheck, are even
ffs quota1 specific). You can't avoid knowledge of the on-disk format here.
Maybe it could be abstracted into a library, but in my plan it would go away
eventually, so I didn't put too much effort there.

>  > > The userlevel quota library is going to be completely rewritten to
>  > > provide a key/value access API to the logical schema described above.
>  > > This will be converted to quotactl calls to the kernel... and also
>  > > some other actions, such as contacting rquotad on NFS servers. There
>  > > are also some cases with the old-style quotas where the tools access
>  > > the quota files directly; some of these cases may go away, but I'm not
>  > > sure they all can.
>  > 
>  > They can't if you want to keep some level of backward-compat.
> I'm still not sure of that.

For example if you want repquota to be able to dump quotas from
a quota1 file of an unmounted filesystem (this is part of the
quota1 -> quota2 migration).

I chose to make as little change in tool behavior as possible when
using quota1, to ease the transition (I've been relying on edquota and
repquota being able to work on an unmounted filesystem in the past,
for example). We can discuss this, but it's independent of the
quota representation.

>  > > This logic and the FS-specific knowledge it
>  > > requires can and should be contained inside libquota.
>  > 
>  > No, I don't think it has its place in libquota. libquota should only use
>  > the fs-independent interface.
>  > Right now, the places where the quota files are directly accessed are:
>  > - repquota, mostly as a way to convert from quota1 to quota2 (it exports
>  >   the content of the quota file to a plist that can be fed to quotactl).
>  > - quotacheck (quota1-specific tool anyway).
>  > 
>  > this code has really no place in libquota.
> No. The userlevel tools, including repquota, should be able to read
> and write quota information using a uniform filesystem-independent
> interface. To the extent that special per-filesystem logic is needed
> above the kernel, it should be encapsulated inside libquota and not
> spread around everywhere indiscriminately.

It's not everywhere, it's in: repquota (for the conversion to
quota2 I mentioned above, and because it was working this way before),
quotacheck and quotaon (because they have to, they're ffs quota1 specific),
and edquota (because it was working this way before).

Again, in my plan quota1 would be deprecated in the next major release and
removed after that, so I didn't see a need to do a major cleanup in this area.
If we choose to keep ffs quota1 then things may be different (but I think
in this case we would just remove quota1-specific support in edquota and
repquota along with the ability to report/edit quotas from an unmounted
filesystem).
And again, this is independent of the representation format actually used.

> Also, you're wrong about what does this. I'm right now looking at code
> in edquota that opens qup->qfname and writes in it.

yes, I did forget about this one. 

>  > > I'm also going to crib from FreeBSD's quota library and add libquota
>  > > calls for things like turning quotas on and off. This should make the
>  > > userlevel tools simpler, and should make life easier for any
>  > > third-party tools that want to manipulate quotas. (There aren't many,
>  > > but a few do exist.) Unfortunately, direct compat with FreeBSD's quota
>  > > library isn't feasible as theirs is not FS-independent.
>  > 
>  > I don't think there should be userland calls to turn quota on or off.
>  > We have it for quota1, but really nothing else should use it.
>  > When you turn quota on from userland, you have to also provide the current
>  > usage (like quotacheck does), and this is a FS-specific tool.
> As I explained, the filesystem-independent semantics for
> quotaon/quotaoff are only that quota enforcement is enabled or
> disabled. This is a useful thing to be able to do. We could get rid of
> it; but I see no reason to.

So it's different from what quotaon/quotaoff actually do (right now,
for ffs quota1, when quotas are off, they're not enforced any more,
but also not updated any more. This is not allowed for quota2).

I'm not against the new semantics, but then we need something to do
what quotaon/quotaoff actually do for ffs quota1 (you can't start
using/updating the quota data at mount time because quotacheck has not run
yet, so the data may be stale. And you can't run quotacheck before mount
because the quota file may be on the filesystem itself).

>  > > I expect the following tools to become FS-independent:
>  > > 
>  > >    quota(1)
>  > >    quot(8)
>  > >    edquota(8)
>  > 
>  > they already are.
> Not at all. Believe me, I've been hacking on edquota all day.

OK, so:
quota(1) is not using any on-disk structure any more. So please explain in
which way it's not FS-independent.
quot(8) is by nature ffs-specific (and quota-independent, as it doesn't care
if quota is enabled or not, or even compiled into the kernel) as it collects
data from the raw device. It could be changed to get information from the
kernel quota system, but then it's not quot(8) anymore, it's a clone of
repquota(8). This is a major feature change.
edquota(8): it can edit ffs quota1 data from an unmounted filesystem, yes
(this is a feature I chose to keep, for now). The quota2 part (which is
used for all mounted filesystems, even those using quota1) is

>  > >    quotaon(8)
>  > 
>  > This one is there for quota1, but really there should be no such
>  > tool any more.  quota management should be handled internally by
>  > the filesystem at mount time.
> See above.

Then you need something else to do the job quotaon does today.

>  > >    repquota(8)
>  > >    rpc.rquotad(8)
>  > 
>  > they already are fs-independent.
> Not at all.

Again, repquota(8) can read ffs quota1 for an unmounted filesystem;
for all mounted filesystems it uses the quota2 interface, which is
rpc.rquotad(8) is fs-independent.

>  > > I'm also intending to add quotadump(8) and quotarestore(8) tools to
>  > > allow backing up quota settings easily. With the traditional quota
>  > > system you can just back up the quota files (and since they're exposed
>  > > in the filesystem, this happens by default unless you explicitly
>  > > exclude them) but with in-FS quotas that no longer works and a
>  > > dump/restore method is needed. I think quotadump and quotarestore will
>  > > probably end up as hard links to edquota, but that's not entirely
>  > > clear yet.
>  > 
>  > We already have this: quotactl(8).

I should have written: repquota -x and quotactl.

> ...which seems to work using some kind of xml-based procedure call
> interface, which isn't what a sysadmin wants to deal with when they're
> trying to run a backup or migrate to new disks.

You'll have to explain this. XML has its issues, but it's easily parseable
(which is why I chose it over some binary representation. Having written
scripts to manage quotas, I know how bad our old text-based tools are).
For a migration I'm not sure the admin cares at all about the format
of the file; it could just as well be a binary blob. But if he needs to look
at it, a text-based format (even if it's XML) is certainly easier
to manage.

> The intended interface is something like
>    quotadump /home > /tmp/home.quotas
>    ...
>    quotarestore /home /tmp/home.quotas

Right now you can do:
repquota -x /home /tmp/home.quotas
quotactl /home /tmp/home.quotas

no need to do something new.

>  > > I'm going to remove the current quotactl(8) as it seems to be entirely
>  > > specific to the current proplib-based interface.
>  > 
>  > One thing I had in mind with the proplib-based interface is to have an easy
>  > way to deal with quota from scripts. What do you propose to replace the
>  > proplib-based interface?
> What sorts of actions from scripts are you thinking of? For backups,
> that's what quotadump and quotarestore are for. For most other usages,
> including stuff like massediting 10,000 student quotas at the start of
> a semester or whatnot, edquota serves nicely.

NO. Really not. This may be OK for a one-shot run, but when you want to
write a tool that needs to read *all* quotas, do some computation on them,
and change some of them, what we had before quota2 is really not convenient.

>  > > Note that quotacheck(8) is specific to the old-style FFS quotas and is
>  > > not FS-independent; this will not (and cannot) change.
>  > > 
>  > > One remaining thing: I'm intending to systematize the current mess of
>  > > quotas enabled/disabled/on/off/vanilla/chocolate/strawberry as
>  > > follows:
>  > > 
>  > > 1. A file system type can have or not have support for quotas. If
>  > > there is no support for quotas, nothing else works.
>  > > 
>  > > 2. Any given filesystem volume may have or not have quota data on it.
>  > > This is the filesystem code's problem and irrelevant to the
>  > > FS-independent logic.
>  > > 
>  > > 3. Any given filesystem volume may be mounted with or without quotas
>  > > enabled. If quotas are not enabled, quota information is not available
>  > > and the quota utilities will not be able to do anything.
>  > > 
>  > > 4. Once mounted, quotas can be either on or off. As far as the
>  > > FS-independent code is concerned, quotas being off means only that
>  > > they aren't enforced; that is, with quotas off operations that
>  > > increase usage do not fail with EDQUOT. When quotas are off, quota
>  > > information can still be inspected or updated.
>  > 
>  > What is the purpose of this ?
> Which part of it? If you mean on/off, see above. If you mean one of

Yes, it was the on/off. 

> the other distinctions, I think they're more or less self-explanatory.
> If you want to know the purpose of drawing these distinctions
> carefully at all, it's because currently the semantics are unclear and
> poorly documented.

Poorly documented, I agree. But they're not unclear to me.
Also, in the above I think you should make it clear that when quotas
are off, the filesystem will still update quota usage, even if not
enforcing the limits.

>  > > I am not intending to change the specific semantics that turning
>  > > quotas on has in the traditional quota system. Those semantics are
>  > > required for quotacheck to be able to do its thing properly. However,
>  > > knowledge of this behavior should be limited to the code in FFS (and
>  > > probably some in libquota) that needs to know the gory details.
>  > > 
>  > > Currently there are, as far as I can tell, multiple ways to enable
>  > > quotas for a filesystem in /etc/fstab, and the quota utilities check
>  > > fstab in various (and I think not always consistent) ways to try to
>  > > figure out what's going on. My intent is to nuke all that: only mount
>  > > should care what's in /etc/fstab, because otherwise the tools won't
>  > > work properly on temporary mounts. The quota library (and thus the
>  > > tools) should detect whether a mounted filesystem has quotas enabled
>  > > by calling quotactl; if quotactl fails, quotas are not enabled. (In
>  > > the long run there should be a FS-independent mount flag to indicate
>  > > this; however, I'm not sure we're ready for that just yet.)
>  > 
>  > No, in the new world only quotacheck and quotaon check the fstab
>  > to know where quotas should be checked/enabled and where the quota
>  > file is. These are quota1-specific. I think they should be left as-is
>  > until quota1 support is removed.
> quota1 support isn't going to be removed.

That's a change in my plans then. Why do you think it should stay?
This kind of quota system is not going to work for modern filesystem sizes
(quotacheck takes ages).

> Anyhow, as I wrote above, the knowledge of whether quotas exist should
> be maintained and provided by the kernel, so it works reliably and
> with mounts that aren't listed in fstab. All file systems that support
> quotas can and should do this.

this is what quota2 does. quota1 is different here, and I think I explained
why. We can choose to change it, but then it is what I would
call a major behavior change and I think there should be a transition

> Also, you're once again wrong about what's using this logic. In
> addition to quotacheck and quotaon, quota, edquota, and repquota are
> all checking fstab.

no, quota is not. edquota and repquota are, I already explained why.

>  > >    #define QUOTA_DEFAULTID       ((id_t)-1)
>  > 
>  > -1 can also be a uid or gid, can't it?
> No, -1 is not a valid uid or gid. See for example setreuid(2).
>  > What interface do you plan between kernel and userland ? keep the
>  > proplib-based interface ?
> An encoded form of the API I already described, with get/put/delete
> and cursors.

So we lose the clear command. I guess it's implemented as part of put.

>  > All of what you propose can be fully implemented with the
>  > current proplib interface and its schema, so it looks like you're proposing
>  > to rework libquota.
> No, because (among other things) the schema I'm implementing is not
> the same. The proplib schema is hierarchical, for example,
> rather than being normalized;

I see this as an advantage, not an inconvenience. You're flattening something
that is naturally hierarchical.

> also it doesn't support cursors.

This can easily be implemented in userland, without changes to the
quotactl(2) interface. I have trouble seeing how this can be sanely
implemented at the quotactl(2) level (I don't like the idea of the
kernel keeping state about what a specific userland process is doing).
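
As a rough sketch of a userland-only cursor: assume the library has already
fetched a snapshot of the records in one request (how it is fetched is not
shown here), and the cursor is just a position kept in the process. The
types and function names below are hypothetical and are not part of
libquota.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type; illustration only. */
struct qrec {
	uint32_t id;
	uint64_t usage;
};

/* All iteration state lives in the calling process, none in the kernel. */
struct qcursor {
	const struct qrec *recs;   /* snapshot already fetched into userland */
	size_t             nrecs;  /* number of records in the snapshot */
	size_t             pos;    /* index of the next record to return */
};

static void
qcursor_open(struct qcursor *c, const struct qrec *recs, size_t nrecs)
{
	assert(c != NULL);

	c->recs = recs;
	c->nrecs = nrecs;
	c->pos = 0;
}

/* Return the next record, or NULL once the snapshot is exhausted. */
static const struct qrec *
qcursor_next(struct qcursor *c)
{
	assert(c != NULL);

	if (c->pos >= c->nrecs)
		return NULL;
	return &c->recs[c->pos++];
}
```

The trade-off is that the caller iterates over a snapshot, so changes made
after the fetch are not seen until the next fetch.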

What I understand is that you mostly want an enhanced API for userland
tools. It can be implemented without changes to quotactl(2) or the kernel

Manuel Bouyer <>
     NetBSD: 26 years of experience will always make the difference
