Subject: Re: nfs servers and 5 minute VOP_READ's
To: Roger Brooks <R.S.Brooks@liverpool.ac.uk>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 03/16/1999 12:28:19
On Tue, 16 Mar 1999, Roger Brooks wrote:

> >Not sure. Is there a way that the server can say, "I got your request,
> >but I'm too busy now, try again in a little bit." ??
> 
> Isn't this what NFSERR_JUKEBOX is for?

Not being familiar with the error, I'm not sure. :-)

> AFAIK, the protocol goes something like this:
> 
> Client sends a request.
> Server starts loading tape/optical disk/whatever.
> Client resends request.
> Server notices that this is a repeat of an earlier request which is already
> in the "slow queue", and replies NFSERR_JUKEBOX (= "be patient, I'll send
> the response eventually").
> Client shuts up and waits.
> Server completes request and sends response to client.

Ick. That's not quite the protocol I was hoping for.

I was hoping for something more like:

Client sends a request.
Server sees it can't service the request right now.
Server tells client to try again later.
Client keeps request around, and sleeps on it.
Client tries again.
Server sees if it can service. If not, server says try later.
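
The exchange above can be sketched in user-space C roughly as follows. This
is only an illustration, not kernel code: fake_server, fake_server_request
and client_send_with_retry are hypothetical names, the doubling back-off is
an assumption, and the delay is printed rather than slept. The value 10008
is what RFC 1813 assigns to NFS3ERR_JUKEBOX.

```c
#include <stdio.h>

#define NFS_OK          0
#define NFSERR_TRYLATER 10008	/* NFS3ERR_JUKEBOX in RFC 1813 */

/* Hypothetical server that is "too busy" for the first few requests. */
struct fake_server {
	int busy_replies;	/* how many more times to answer "try later" */
};

/* Server side: either service the request or tell the client to retry. */
static int
fake_server_request(struct fake_server *srv)
{
	if (srv->busy_replies > 0) {
		srv->busy_replies--;
		return NFSERR_TRYLATER;
	}
	return NFS_OK;
}

/*
 * Client side: keep the request around, back off, and resend until the
 * server stops answering NFSERR_TRYLATER or max_tries is used up.
 * Returns the number of attempts made, or -1 if every attempt was refused.
 */
static int
client_send_with_retry(struct fake_server *srv, int max_tries)
{
	int delay = 1;		/* seconds; assumed doubling back-off */
	int tries;

	for (tries = 1; tries <= max_tries; tries++) {
		int status = fake_server_request(srv);

		if (status != NFSERR_TRYLATER)
			return tries;
		/* A real client would sleep here instead of printing. */
		printf("try %d: server busy, would wait %d s\n", tries, delay);
		delay *= 2;
	}
	return -1;
}
```

With a server that refuses twice, client_send_with_retry(&srv, 5) makes
three attempts; the server never has to remember the request in between.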

Actually, looking at nfsproto.h, NFSERR_JUKEBOX is defined in the
version-3-only section, and NFSERR_TRYLATER is defined to be the same
value. So what I think I want to have happen is already in the protocol,
and nfs_socket.c seems to do the right thing with it.

I can re-phrase what I'm asking for as:

If we're on a v3 mount, add IO_NDELAY to the server's reads and writes. If
they return EAGAIN (which is the same as EWOULDBLOCK), have the server send
back NFSERR_TRYLATER.
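
Stated as a stand-alone function, the error translation being proposed looks
something like this. It is a sketch for discussion only:
nfsrv_map_nodelay_error is a hypothetical helper that does not exist in
nfs_serv.c, and NFSERR_TRYLATER is redefined here (10008, the RFC 1813 value
for NFS3ERR_JUKEBOX) just so the fragment compiles outside the kernel.

```c
#include <errno.h>

#define NFSERR_TRYLATER 10008	/* NFS3ERR_JUKEBOX in RFC 1813 */

/*
 * Hypothetical helper: translate the result of a non-blocking VOP_READ or
 * VOP_WRITE into the status the server should put in its reply.
 * Only v3 has a "try later" status, so v2 errors pass through untouched.
 */
static int
nfsrv_map_nodelay_error(int v3, int error)
{
	if (v3 && error == EAGAIN)
		return NFSERR_TRYLATER;
	return error;
}
```

Success (0) and every error other than EAGAIN come back unchanged, so the
only visible difference is on v3 mounts when the I/O would have blocked.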

Here's a patch which I think would do this. Comments?

Take care,

Bill


>From wrstuden@nas.nasa.gov Tue Mar 16 12:13:09 1999
Date: Tue, 16 Mar 1999 10:48:15 -0800 (PST)
From: Bill Studenmund <wrstuden@nas.nasa.gov>
To: wrstuden@sally.nas.nasa.gov

--- /sys/nfs/nfs_serv.c	Tue Feb 16 17:24:56 1999
+++ nfs_serv.c	Tue Mar 16 10:47:37 1999
@@ -702,7 +702,13 @@
 		uiop->uio_resid = cnt;
 		uiop->uio_rw = UIO_READ;
 		uiop->uio_segflg = UIO_SYSSPACE;
-		error = VOP_READ(vp, uiop, IO_NODELOCKED, cred);
+		if (v3) {
+			error = VOP_READ(vp, uiop,
+					IO_NODELOCKED | IO_NDELAY, cred);
+			if (error == EAGAIN)
+				error = NFSERR_TRYLATER;
+		} else
+			error = VOP_READ(vp, uiop, IO_NODELOCKED, cred);
 		off = uiop->uio_offset;
 		FREE((caddr_t)iv2, M_TEMP);
 		if (error || (getret = VOP_GETATTR(vp, &va, cred, procp)) != 0){
@@ -883,7 +889,12 @@
 	    uiop->uio_segflg = UIO_SYSSPACE;
 	    uiop->uio_procp = (struct proc *)0;
 	    uiop->uio_offset = off;
-	    error = VOP_WRITE(vp, uiop, ioflags, cred);
+	    if (v3) {
+	    	error = VOP_WRITE(vp, uiop, ioflags | IO_NDELAY, cred);
+		if (error == EAGAIN)
+			error = NFSERR_TRYLATER;
+	    } else
+	    	error = VOP_WRITE(vp, uiop, ioflags, cred);
 	    nfsstats.srvvop_writes++;
 	    FREE((caddr_t)iv, M_TEMP);
 	}