tech-kern archive


Re: [PATCH] PUFFS backend allocation (round 3)



On Sat, Oct 25, 2014 at 09:48:49AM +0200, J. Hannken-Illjes wrote:
> On 25 Oct 2014, at 06:39, Emmanuel Dreyfus <manu%netbsd.org@localhost> wrote:
> 
> > Summary: when writing to a PUFFS filesystem through page cache, we do
> > not know if backend storage is really available. If it is not, cache
> > flush may get EDQUOT or ENOSPC and the process cannot terminate (it gets
> > stuck in DE state).
> > 
> > Proposed solution: detect that a write may not have backend storage, and
> > if it is the case, try to allocate the backend storage.
> > 
> > Detecting is done on two conditions:
> > - if allocated blocks is shorter than size, the file is sparse and we
> > never know if we are writing in a hole or not: in that case, always
> > write-once
> > - if writing beyond EOF
> > 
> > Allocating the backend storage is done
> > - through newly introduced PUFFS fallocate operation (unlikely to work
> > on NetBSD as the system call exists but FFS does not support it)
> > - otherwise by reading from the file and rewriting the data that was read
> > 
> > The latest patch doing this:
> > http://ftp.espci.fr/shadow/manu/puffs-alloc2.patch
> > 
> > Opinions?
> 
> At the very least you should increment PUFFSVERSION.
> 
> Setting of PNODE_SPARSE looks wrong.  You should mark a VREG pnode
> as sparse until it is sure not to have holes, which is when we get
> va_bytes != VNOVAL && vp->v_size == rvap->va_bytes.


you're forgetting about indirect blocks.  there's no generic way to be sure
that a file is not sparse.  here's an example (for FFS with 16k block size)
of a file that is obviously sparse even though va_bytes == va_size:

# dd if=/dev/zero of=file bs=16k count=11 oseek=2
11+0 records in
11+0 records out
180224 bytes transferred in 0.001 secs (180224000 bytes/sec)
# ls -lsh file
208K -rw-r--r--  1 root  wheel  208K Oct 25 10:39 file
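
To make the arithmetic explicit, here is a minimal sketch of the kind of
"allocated < size" test under discussion, fed with the numbers from the
transcript above.  The function names are hypothetical and this is not the
puffs code; it also assumes that the "208K" printed by ls is exact and that
st_blocks counts 512-byte units.

```python
import os

BLOCK = 16 * 1024  # FFS block size used in the example above

def looks_sparse(size_bytes, allocated_bytes):
    """Naive test: fewer bytes allocated than the file is long."""
    return allocated_bytes < size_bytes

def stat_looks_sparse(path):
    """Same test on a real file; assumes st_blocks counts 512-byte units."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks * 512)

# Numbers taken from the dd and ls output above.
size = (2 + 11) * BLOCK         # 2 skipped blocks + 11 written = 208K
allocated = 208 * 1024          # "208K" as reported by ls -s
written = 180224                # bytes dd reported as transferred

naive_verdict = looks_sparse(size, allocated)   # False: "not sparse"
overhead = allocated - written  # allocated space that holds no file data
```

The naive test reports the file as not sparse, yet two blocks' worth of the
allocation does not hold file data, so the first two blocks of the file can
still be a hole.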


but more fundamentally, since puffs code cannot prevent changes to the file
in the underlying fs (i.e. changes that don't go through puffs), any
preallocation done by puffs can be undone before it does any good.
the puffs code just needs to be fixed to handle such ENOSPC/EDQUOT errors
while flushing pages without hanging.  NFS has the same fundamental issue
but I suspect its error handling is better.

also, the proposed non-fallocate code doesn't actually change anything...
populating the pages before copying in the user data won't implicitly
cause allocation of space in the underlying file, any more than populating
the pages by copying in the user data does.  you would need to flush the
dirty pages (with VOP_PUTPAGES()) to trigger that allocation, but that would
cause every write to be synchronous, and if flushing the pages failed
in this context it would likely trigger the same bug that you're trying to
work around.
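
As a rough userland analogy only (this is not kernel or puffs code, and the
helper name is hypothetical), the trade-off looks like this: an allocation
failure is only known once the data has been pushed to the backing store,
and forcing that on every write makes every write synchronous.

```python
import errno
import os

def write_and_flush(fd, data):
    """Write data, then force it out; return 0 or the allocation errno."""
    try:
        view = memoryview(data)
        while view:
            # os.write may write less than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # The flush is what surfaces a delayed ENOSPC/EDQUOT, and it is
        # also what makes this write synchronous.
        os.fsync(fd)
    except OSError as e:
        if e.errno in (errno.ENOSPC, errno.EDQUOT):
            return e.errno
        raise
    return 0
```

A caller that skips the fsync() gets the fast path but may only learn about
the failure later, at flush time, which is the situation described above.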

-Chuck

