netbsd-users: Re: NetBSD-1.5 and NFS

Subject: Re: NetBSD-1.5 and NFS - any suggestions?
To: Frank van der Linden <fvdl@wasabisystems.com>
From: Artem Belevich <art@riverstonenet.com>
List: netbsd-users
Date: 01/25/2002 15:15:14

On Fri, Jan 25, 2002 at 11:50:20PM +0100, Frank van der Linden <fvdl@wasabisystems.com> wrote:
> [Your Reply-to is set to art@riverstonenet.org, which doesn't work,
>  since a DNS lookup for an MX record on it fails. Anyway, you may
>  have seen this message as sent to netbsd-users, but here it is
>  again]

Argh! My fault. I added the Reply-To manually and made a mistake.

> On Fri, Jan 25, 2002 at 12:40:55PM -0800, Artem Belevich wrote:
> > The problem is that under heavy NFS load (mostly as NFS client with
> > NetAPP filer as a server) the boxes lock up for up to several
> > minutes.
> Check either /var/log/messages for complaints from the drivers,

None.

> or the output for nfsstat -c for 'errors' or 'retries'. If

The client side shows only a few NFS retries, and that number does not
increase during pauses. I wrote a script that dumps nfsstat output
after each pause. Most of the time the error count stays constant.
Here's what nfsstat gives me now:

	Client Info:
	[ irrelevant stats skipped ]
	Rpc Info:
	  TimedOut   Invalid X Replies   Retries  Requests
	         0         0        64        76    805254

	Server Info:
	[irrelevant server stats skipped]
	Server Ret-Failed
	            82132
	Server Faults
	            0

Server Ret-Failed looks weird, though. Sometimes it stays constant
during pauses, sometimes it jumps by about 10K or so. What exactly
does "Ret-Failed" stand for, anyway? Could that counter jump because
nfsd was stuck along with other processes and therefore couldn't reply
in time?
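
For reference, the per-pause check described above could be sketched
roughly as follows. This is not the actual script, only a minimal
illustration; the function names rpc_retries and retries_delta are
made up for this example, and it assumes the "Rpc Info:" layout shown
above (a header line followed by one line of values, with Retries as
the fourth value).

```sh
#!/bin/sh
# Hypothetical helpers, assuming the "Rpc Info:" layout quoted above.

# Read nfsstat-style text on stdin and print the Retries counter.
# The header has two words for "X Replies", so the position is taken
# from the values line, where Retries is the fourth field.
rpc_retries() {
    awk '/Rpc Info:/ { getline; getline; print $4; exit }'
}

# Print how much a counter grew between two snapshots.
# $1 = value before the pause, $2 = value after the pause.
retries_delta() {
    echo $(( $2 - $1 ))
}
```

Usage would be along the lines of saving `nfsstat -c | rpc_retries`
before and after a pause and passing both numbers to retries_delta.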

> this somehow shows that packets are getting dumped by something
> in this setup (could be any piece of networking hardware in your
> path), try one of: 1) reduce the read/write size for the NFS
> mount (try 16k or 8k), 

Didn't help. That was the first thing I tried.

> use TCP. See mount_nfs(8) for details.
Unfortunately, this is not an option. 

# mount -o tcp server:/some/exported/path /mnt
server:/some/exported/path: nfsd: RPCPROG_NFS: RPC: Program not registered

--Artem