tech-kern archive


Re: Quota on tmpfs



On Tue, Jul 17, 2012 at 08:15:20AM -0400, Thor Lancelot Simon wrote:
> 
> In the case of the sparse file, the user has explicitly taken actions
> that -- on normal Unix systems and filesystems -- reduce the space
> required to store the file.  If I open a file and lseek() 1TB off
> the end, I have a reasonable expectation to be charged for zero bytes
> of storage, or perhaps the size of the inode -- not 1,000,000,000,000
> bytes of storage.

The DragonFly VFS quota project was originally an existing Google Summer
of Code proposal from 2010. I clearly remember some discussions about
sparse files, and a preference being expressed for counting the seek size
rather than the number of blocks actually used.
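
To make the two candidate measures concrete, here is a minimal Python sketch
that creates a file with a hole and reports both numbers. It assumes a
filesystem that supports holes and a platform that exposes st_blocks, taken
here to be in 512-byte units (the usual convention, not a guarantee). The
helper name and the 16 MiB hole size are illustrative only.

```python
import os
import tempfile

HOLE = 16 * 1024 * 1024  # illustrative hole size (assumption): 16 MiB


def sparse_file_usage(hole=HOLE):
    """Return (apparent_size, allocated_bytes) for a file with one hole."""
    fd, path = tempfile.mkstemp()
    try:
        os.lseek(fd, hole, os.SEEK_SET)  # seek past EOF; nothing is written yet
        os.write(fd, b"x")               # one byte of real data after the hole
        os.fsync(fd)                     # flush so block counts are up to date
        st = os.fstat(fd)
        # Assumption: st_blocks is counted in 512-byte units.
        return st.st_size, st.st_blocks * 512
    finally:
        os.close(fd)
        os.unlink(path)


apparent, allocated = sparse_file_usage()
print(f"st_size: {apparent} bytes, allocated: {allocated} bytes")
```

A quota based on the seek size charges the first number; a block-based quota
charges the second. On a filesystem without hole support the two will be
roughly equal.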

> However, it is not the case AFAICT that opening a file and seeking
> 1TB off the end causes 1TB of allocation in HAMMER.  Nor would I expect
> the HAMMER maintainers to think such a behavior was desirable; as far
> as I can tell they have more sense than that.

HAMMER behaves the same way as UFS; nothing changes here.

> > Having a quota system based on visible file sizes gives at least consistent
> > results with what a regular user sees when listing files or using du(1).
> 
> You can say that because you avoid mentioning stat(2) or stat(1) or
> (at least, not explicitly) ls(1), all of which do actually expose the
> difference between the user's requested file length (st_size) and the
> block allocations performed on behalf of the user (st_blocks * st_blksize).
> 
> The problem is that you're mixing up apples and oranges: what the filesystem
> (HAMMER) or storage device (deduplication) do behind the user's back which
> may reduce or increase actual block usage on the underlying storage device
> are fundamentally different from what the user expressly requests the
> system do to manage block allocation (intentionally creating holes in files).
> 
> Creating an inconsistency between what stat(2) reports and what is charged
> against the user's quota really seems like a very bad idea.  I understand
> that you are trying to simplify away what looks to you like annoying
> complexity, but consider the famous Einstein quote: "as simple as possible,
> but no simpler".  You've gone too simple: your scheme breaks user and
> application expectations with regard to behavior the user/application
> expressly requested from the kernel.  Not a good thing.
> 
> Existing applications reasonably expect that regardless of how much
> disk space is available, they can lseek off the end of an existing
> file and not get back an error.  In fact, EDQUOT is not among the
> documented error values for lseek(2) so applications will not
> handle it (for the record, lseek also cannot return EFBIG nor ENOSPC).
> So you can be pretty sure you will break a good number of existing
> Unix applications, likely in data-corrupting ways! 

As far as I remember, concerns about potential application breakage didn't
come up when the decision was made not to handle sparse files specially.

I may have to do it if the first implementation really causes problems in
practice.
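
For reference, the application-visible sequence under discussion can be
sketched in Python as follows. This is only an illustration under ordinary
Unix semantics, not a description of the DragonFly implementation: the seek
itself just moves the file offset, and the file size only changes on the
following write, which is where an application would normally expect to see
an error such as EDQUOT or ENOSPC. The function name and the 1 MiB offset are
hypothetical.

```python
import os
import tempfile


def seek_then_write(offset=1 << 20):
    """Seek past EOF, then write one byte; report what each step changed."""
    fd, path = tempfile.mkstemp()
    try:
        pos = os.lseek(fd, offset, os.SEEK_SET)  # moves the offset only
        size_after_seek = os.fstat(fd).st_size   # file length is unchanged
        try:
            os.write(fd, b"x")                   # the file is extended here
            write_errno = None
        except OSError as e:
            # Quota or space exhaustion would be reported by write(2),
            # e.g. errno.EDQUOT or errno.ENOSPC.
            write_errno = e.errno
        size_after_write = os.fstat(fd).st_size
        return pos, size_after_seek, size_after_write, write_errno
    finally:
        os.close(fd)
        os.unlink(path)


pos, before, after, err = seek_then_write()
print(pos, before, after, err)
```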

> Again, I am very curious whether you really have consensus from the
> other Dragonfly developers in favor of this choice.

There was no consensus, but no strong opposition either.

Adding kernel@ to the discussion.

-- 
Francois Tigeot

