tech-kern archive

Re: VOP_PUTPAGE ignores mount_nfs -o soft,intr



Chuck Silvers <chuq%chuq.com@localhost> wrote:

> we shouldn't need to change the genfs code to make "soft" work.
> if the underlying RPCs time out and all the retries are exhausted,
> the NFS code should report the error back to the genfs code by doing
> the usual B_ERROR/b_error thing with the buffer, and the genfs code
> should handle that by unlocking pages, etc, just like it would for
> a failed write to a scsi or ata device, and eventually that should
> percolate back up the stack until the cv_wait() returns.
> does this not work currently?

It almost works with a few fixes.

I tried without touching genfs beyond changing cv_wait() into cv_timedwait()
with a debug message, like this:
 
        /* Wait for output to complete. */
        if (!wasclean && !async && vp->v_numoutput != 0) {
-               while (vp->v_numoutput != 0)
-                       cv_wait(&vp->v_cv, slock);
+               while (vp->v_numoutput != 0) {
+                       int cv_error;
+
+                       cv_error = cv_timedwait(&vp->v_cv, slock, 2 * hz);
+                       if (cv_error) {
+                               printf("%s: failed to complete I/O on %s, "
+                                      "vp = %p, numoutput = %d, error = %d\n",
+                                      l->l_name? l->l_name : l->l_proc->p_comm,
+                                      vp->v_mount->mnt_stat.f_mntonname,
+                                      vp, vp->v_numoutput, cv_error);
+                       }
+               
+               }
        }
        onworklst = (vp->v_iflag & VI_ONWORKLST) != 0;
        mutex_exit(slock);
 
I have also changed the NFS code so that RPCs on soft mounts can time out and
set bp->b_error. What happens then is that we loop in the
while (vp->v_numoutput != 0) block, with vp->v_numoutput draining down to 2,
and then we loop forever with this value.

I note we have this in genfs_do_io(), and I suspect this is where the 2 comes from:

        if (iowrite) {
                mutex_enter(vp->v_interlock);
                vp->v_numoutput += 2;
                mutex_exit(vp->v_interlock);  
        }               
        mbp = ge

Why is vp->v_numoutput incremented by 2 here?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu%netbsd.org@localhost

