Subject: kern/7213: exported LFS becomes nonfunctional after cleaner runs
To: None <gnats-bugs@gnats.netbsd.org>
From: Jason R Thorpe <thorpej@dracul.nas.nasa.gov>
List: netbsd-bugs
Date: 03/22/1999 18:30:15
>Number:         7213
>Category:       kern
>Synopsis:       exported LFS becomes nonfunctional after cleaner runs
>Confidential:   yes
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Mar 22 18:35:01 1999
>Last-Modified:
>Originator:     
>Organization:
Numerical Aerospace Simulation Facility - NASA Ames
>Release:        March 18, 1999
>Environment:
	
System: NetBSD dracul 1.3K NetBSD 1.3K (DRACUL) #811: Thu Mar 18 12:21:33 PST 1999 thorpej@dracul:/u2/netbsd/src/sys/arch/i386/compile/DRACUL i386


>Description:
	I have begun using LFS now that it is usable in the NetBSD
	source tree.  To minimize my risk, I've made a shared, NFS-exported
	/usr/obj an LFS file system.

	For quite a while (in fact, for as long as it took the entire
	tree to build on my AlphaStation 500, the NFS client), the NFS
	export worked fine.  However, some time after the file system
	had gone idle, file operations on it from the client began to
	fail.

	The NFS-accessed-LFS file system, on the client, is /nfs/dracul/u3
	mounted via amd(8).  /usr/obj is a symlink to /nfs/dracul/u3/obj.

	cd /usr/obj -> EACCES
	cd /usr/obj/bin/cat/obj.alpha -> EACCES [the lookup fails at the
	   second component of the pathname]
	cd /nfs/dracul/u3 -> EIO

	My theory is that when the cleaner ran, the inode numbers for
	the files it garbage-collected changed, while the inode generation
	numbers did not change.  So, rather than ESTALE being returned
	to the client, the operation was done using a now-invalid
	file handle (which contains, effectively, the inode number for
	the file), which could have unpredictable and possibly even
	destructive results.

	There is a comment above lfs_fhtovp() which indicates that
	NFS exporting of LFS needs to be looked at more carefully.
	Clearly, at the very least, the generation number confusion
	needs to be addressed.
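
	To illustrate the failure mode I suspect, here is a minimal
	user-space sketch in C.  It is not the LFS code; every name in
	it (toy_inode, toy_fid, toy_alloc, toy_reuse, toy_fhtovp) is
	made up for this report.  It assumes only that a file handle
	carries an inode number plus a generation number, and that the
	server should reject a handle whose generation no longer matches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NINODES 8

/* Hypothetical in-core inode: only the fields the check needs. */
struct toy_inode {
	uint32_t gen;		/* generation, meant to change on reuse */
	int	 in_use;
};

/* Hypothetical file handle payload as held by the NFS client. */
struct toy_fid {
	uint32_t ino;
	uint32_t gen;
};

static struct toy_inode toy_itable[TOY_NINODES];

/* Mark a slot as holding a file and build a handle for it. */
static struct toy_fid
toy_alloc(uint32_t ino)
{
	assert(ino < TOY_NINODES);
	toy_itable[ino].in_use = 1;
	struct toy_fid fid = { ino, toy_itable[ino].gen };
	return fid;
}

/*
 * Make the slot refer to something else.  bump_gen == 0 models the
 * suspected bug (identity changes, generation does not).
 */
static void
toy_reuse(uint32_t ino, int bump_gen)
{
	assert(ino < TOY_NINODES);
	if (bump_gen)
		toy_itable[ino].gen++;
	toy_itable[ino].in_use = 1;
}

/* Translate a handle to an inode: 0 on success, ESTALE if it is stale. */
static int
toy_fhtovp(const struct toy_fid *fid, struct toy_inode **ipp)
{
	if (fid->ino >= TOY_NINODES)
		return ESTALE;
	struct toy_inode *ip = &toy_itable[fid->ino];
	if (!ip->in_use || ip->gen != fid->gen)
		return ESTALE;
	*ipp = ip;
	return 0;
}
```

	Calling toy_reuse(ino, 0) models the behaviour I suspect: the
	old handle still passes the check and resolves to whatever now
	occupies the slot.  With toy_reuse(ino, 1) the same handle is
	rejected with ESTALE instead.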

	However, that will lead to lots of ESTALE being returned to
	the client, which may not be a good thing.

	Perhaps another solution is to have LFS not actually change the
	inode number of a file when it garbage-collects it.  However,
	considering how a file's inode number is related to its location
	in the file system, I'm not sure how this can be achieved.  In any
	case, this will probably be required in order to make LFS useful
	as an NFS-exportable file system.

>How-To-Repeat:
	Export an LFS file system over NFS and use it from a client.
	Once the cleaner has garbage-collected, watch operations from
	the client fail as described above.

>Fix:
	None provided.  See above.
>Audit-Trail:
>Unformatted: