tech-kern archive


Re: VOP_PUTPAGE ignores mount_nfs -o soft,intr



In article <20150619083656.GT19722%homeworld.netbsd.org@localhost>,
Emmanuel Dreyfus  <manu%netbsd.org@localhost> wrote:
>Hi
>
>I have encountered a bug with the NetBSD NFS client. Despite a mount with
>-o intr,soft, we can hit a situation where a process remains hung in the
>kernel because the NFS server is gone.
>
>This happens when the ioflush does its duty, with the following code path:
>sync_fsync / nfs_sync / VOP_FSYNC / nfs_fsync / nfs_flush / VOP_PUTPAGES
>
>VOP_PUTPAGES has flags = PGO_ALLPAGES|PGO_FREE. It then goes through
>genfs_putpages and genfs_do_putpages, and gets stuck in:
>
>	/* Wait for output to complete. */
>	if (!wasclean && !async && vp->v_numoutput != 0) {
>		while (vp->v_numoutput != 0)
>			cv_wait(&vp->v_cv, slock);
>	}
>
>This cv_wait() has no timeout and is uninterruptible. ioflush will
>sleep there forever, holding the vnode lock. Any other process doing
>I/O on the filesystem will sleep in tstile waiting for the vnode
>lock with this path:
>sys_write / dofilewrite / vn_write / vn_lock / VOP_LOCK / rw_enter
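
[Illustration only, not a proposed patch. The quoted loop uses cv_wait(),
which can be ended only by a wakeup; condvar(9) also provides
cv_timedwait() and cv_wait_sig(), which can return on a timeout or a
signal. The sketch below is a user-space analogy, assuming POSIX threads
stand in for the kernel primitives. struct fake_vnode and the fake_vnode_*
functions are hypothetical names invented for this example; they are not
NetBSD kernel code.]

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical user-space stand-in for the vnode fields in the quoted loop. */
struct fake_vnode {
	pthread_mutex_t	lock;		/* plays the role of slock */
	pthread_cond_t	cv;		/* plays the role of vp->v_cv */
	int		numoutput;	/* plays the role of vp->v_numoutput */
};

void
fake_vnode_init(struct fake_vnode *vp, int numoutput)
{
	pthread_mutex_init(&vp->lock, NULL);
	pthread_cond_init(&vp->cv, NULL);
	vp->numoutput = numoutput;
}

/* Called when one pending write completes; wakes waiters at zero. */
void
fake_vnode_output_done(struct fake_vnode *vp)
{
	pthread_mutex_lock(&vp->lock);
	if (--vp->numoutput == 0)
		pthread_cond_broadcast(&vp->cv);
	pthread_mutex_unlock(&vp->lock);
}

/*
 * Wait for pending output to drain, giving up after timeout_ms.
 * Returns 0 once numoutput is zero, or ETIMEDOUT if writes are still
 * pending at the deadline (the "server is gone" case).
 */
int
fake_vnode_wait_output(struct fake_vnode *vp, long timeout_ms)
{
	struct timespec deadline;
	int error = 0;

	assert(timeout_ms >= 0);

	/* Default condition variables measure against CLOCK_REALTIME. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&vp->lock);
	/* Same shape as the quoted loop, but the wait can now fail. */
	while (vp->numoutput != 0 && error == 0)
		error = pthread_cond_timedwait(&vp->cv, &vp->lock, &deadline);
	if (vp->numoutput == 0)
		error = 0;
	pthread_mutex_unlock(&vp->lock);

	return error;
}
```

[With one write pending and nobody calling fake_vnode_output_done(), the
wait returns ETIMEDOUT instead of sleeping forever. Whether a kernel
flush path should ever give up this way is a separate policy question.]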

Yes, but ioflush is not a user process... An interruptible mount
means that a user process can interrupt a syscall doing an NFS
operation. No other operating system I know of takes this to mean
that you can unmount the filesystem or make delayed writes abort
and fail.

Having said that, yes, it is a problem that you need to reboot
because an NFS server is gone, and we should make umount -f work
properly in that case. I don't think that we should introduce umount
-l (like Linux) unless there is a compelling reason to do so.

christos


