DragonFly kernel List (threaded) for 2012-07
Re: Quota on tmpfs
On Tue, Jul 17, 2012 at 08:15:20AM -0400, Thor Lancelot Simon wrote:
> In the case of the sparse file, the user has explicitly taken actions
> that -- on normal Unix systems and filesystems -- reduce the space
> required to store the file. If I open a file and lseek() 1TB off
> the end, I have a reasonable expectation to be charged for zero bytes
> of storage, or perhaps the size of the inode -- not 1,000,000,000,000
> bytes of storage.
The DragonFly VFS quota project was originally a Google Summer of Code
proposal from 2010. I clearly remember some discussions about sparse
files, and a preference being expressed for counting the seek size
rather than the number of blocks actually used.
> However, it is not the case AFAICT that opening a file and seeking
> 1TB off the end causes 1TB of allocation in HAMMER. Nor would I expect
> the HAMMER maintainers to think such a behavior was desirable; as far
> as I can tell they have more sense than that.
HAMMER behaves the same way as UFS here; nothing changes.
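To make the lseek() scenario quoted above concrete, here is a minimal
sketch that creates a file with a hole and reports both its length and
its block allocation. The helper names make_sparse() and report() are
made up for this illustration. The code assumes st_blocks counts
512-byte units, which is the traditional BSD convention; POSIX leaves
the unit unspecified.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` with a hole of `hole` bytes followed
 * by a single data byte, then fill *st with the result of fstat(2).
 * Returns 0 on success, -1 on error (errno is left set by the failing call).
 */
int
make_sparse(const char *path, off_t hole, struct stat *st)
{
	int fd, rc = -1;

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
	if (fd == -1)
		return (-1);

	/* Seeking past EOF allocates nothing; the write creates the hole. */
	if (lseek(fd, hole, SEEK_SET) != (off_t)-1 &&
	    write(fd, "x", 1) == 1 &&
	    fstat(fd, st) == 0)
		rc = 0;

	close(fd);
	return (rc);
}

/* Print the two numbers a quota system could choose to charge. */
void
report(const struct stat *st)
{
	/* Assumption: st_blocks is in 512-byte units. */
	printf("length (st_size):      %lld bytes\n", (long long)st->st_size);
	printf("allocated (st_blocks): %lld bytes\n",
	    (long long)st->st_blocks * 512);
}
```

On a filesystem that supports holes, calling make_sparse() with a large
offset and then report() should show a length far larger than the
allocation. A quota based on the seek size charges the first number; a
block-based quota charges the second.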
> > Having a quota system based on visible file sizes gives at least consistent
> > results with what a regular user sees when listing files or using du(1).
> You can say that because you avoid mentioning stat(2) or stat(1) or
> (at least, not explicitly) ls(1), all of which do actually expose the
> difference between the user's requested file length (st_size) and the
> block allocations performed on behalf of the user (st_blocks * st_blksize).
> The problem is that you're mixing up apples and oranges: what the filesystem
> (HAMMER) or storage device (deduplication) do behind the user's back which
> may reduce or increase actual block usage on the underlying storage device
> are fundamentally different from what the user expressly requests the
> system do to manage block allocation (intentionally creating holes in files).
> Creating an inconsistency between what stat(2) reports and what is charged
> against the user's quota really seems like a very bad idea. I understand
> that you are trying to simplify away what looks to you like annoying
> complexity, but consider the famous Einstein quote: "as simple as possible,
> but no simpler". You've gone too simple: your scheme breaks user and
> application expectations with regard to behavior the user/application
> expressly requested from the kernel. Not a good thing.
> Existing applications reasonably expect that regardless of how much
> disk space is available, they can lseek off the end of an existing
> file and not get back an error. In fact, EDQUOT is not among the
> documented error values for lseek(2) so applications will not
> handle it (for the record, lseek also cannot return EFBIG nor ENOSPC).
> So you can be pretty sure you will break a good number of existing
> Unix applications, likely in data-corrupting ways!
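For reference, a sketch of the usual application pattern the quoted
paragraph describes: lseek(2) is only checked for generic failure, and
space-related errors (ENOSPC, EDQUOT, EFBIG) are expected from write(2).
The function name append_after_hole() and its interface are invented
for this example and do not come from any existing code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: seek `hole` bytes past the current end of file and
 * write `len` bytes from `buf` there.
 * Returns 0 on success, otherwise the errno of the failing call.
 * *seek_failed is set to 1 when the failure came from lseek(2).
 */
int
append_after_hole(int fd, off_t hole, const void *buf, size_t len,
    int *seek_failed)
{
	const char *p = buf;
	ssize_t n;

	*seek_failed = 0;

	/* Typical code only expects EBADF, EINVAL, EOVERFLOW or ESPIPE here. */
	if (lseek(fd, hole, SEEK_END) == (off_t)-1) {
		*seek_failed = 1;
		return (errno);
	}

	while (len > 0) {
		n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			/* ENOSPC, EDQUOT and EFBIG are handled here. */
			return (errno);
		}
		p += n;
		len -= (size_t)n;
	}
	return (0);
}
```

An application written this way has no code path for a quota error
coming out of the seek itself, which is the breakage risk raised above.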
As far as I remember, concerns about potential application breakage didn't
come up when the decision was made not to handle sparse files specially.
I may have to do it if the first implementation really causes problems in
> Again, I am very curious whether you really have consensus from the
> other Dragonfly developers in favor of this choice.
There was no consensus, but no strong opposition either.
Adding kernel@ to the discussion.