Subject: Re: NFS timeo and retrans parameters via amd
To: None <netbsd-users@NetBSD.org>
From: Aaron J. Grier <agrier@poofygoof.com>
List: netbsd-users
Date: 05/02/2005 11:10:18
On Sat, Apr 30, 2005 at 02:33:51PM +0000, Matthias Scheler wrote:
> In article <20050426065142.GJ27190@arwen.poofy.goof.com>,
> 	"Aaron J. Grier" <agrier@poofygoof.com> writes:
> > $ ypmatch /defaults amd.net
> > opts:=intr,retrans=500,timeo=10
> 
> Are you sure you need those? I would bet that "opts:=tcp" will fix
> your problem.

according to netstat and tcpdump, the mounts are already over tcp.

I switched to a uniprocessor kernel yesterday, thinking the problem
might be MP-related.  still seeing it.

May  1 23:25:18 radbug /netbsd: nfs server arwen:/usr/home: not responding
May  1 23:40:02 radbug last message repeated 5 times
May  1 23:56:55 radbug /netbsd: nfs server arwen:/usr/home: not responding
May  2 00:47:57 radbug /netbsd: nfs server pid200@radbug:/net: not responding
May  2 01:56:44 radbug /netbsd: nfs server arwen:/usr/home: is alive again
May  2 01:56:44 radbug last message repeated 3 times
May  2 01:56:45 radbug /netbsd: nfs server pid200@radbug:/net: is alive again

that's roughly a two-and-a-half-hour NFS outage (23:25 to 01:56)...
neither the client nor the server machine was down during that time,
and network connectivity stayed up as well.
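
for reference, the outage length per mount can be worked out from the
syslog lines above.  a minimal Python sketch, under a few assumptions:
the year is 2005 (syslog omits it), month names parse in an English
locale, and the function name outages() is made up for illustration.
it pairs the first "not responding" for each server with the next
"is alive again" and ignores the "last message repeated" lines.

```python
from datetime import datetime, timedelta

# syslog excerpt copied from the message above
LOG = """\
May  1 23:25:18 radbug /netbsd: nfs server arwen:/usr/home: not responding
May  1 23:40:02 radbug last message repeated 5 times
May  1 23:56:55 radbug /netbsd: nfs server arwen:/usr/home: not responding
May  2 00:47:57 radbug /netbsd: nfs server pid200@radbug:/net: not responding
May  2 01:56:44 radbug /netbsd: nfs server arwen:/usr/home: is alive again
May  2 01:56:44 radbug last message repeated 3 times
May  2 01:56:45 radbug /netbsd: nfs server pid200@radbug:/net: is alive again
"""

MARK = "nfs server "


def outages(log, year=2005):
    """Map each NFS server to the time between its first
    'not responding' and the following 'is alive again'."""
    first_down = {}
    result = {}
    for line in log.splitlines():
        if MARK not in line:
            continue  # skips "last message repeated N times"
        # syslog timestamps carry no year, so one is assumed
        stamp = datetime.strptime(
            f"{year} {' '.join(line.split()[:3])}", "%Y %b %d %H:%M:%S"
        )
        server, _, state = line.split(MARK, 1)[1].rpartition(": ")
        if state == "not responding":
            first_down.setdefault(server, stamp)  # keep the earliest
        elif state == "is alive again" and server in first_down:
            result[server] = stamp - first_down.pop(server)
    return result


if __name__ == "__main__":
    for server, length in outages(LOG).items():
        print(server, length)
```

the repeated-message lines only say that more of the same arrived, so
the sketch measures from the first complaint rather than the last one.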

radbug$ nfsstat -c
Client Info:
[...]
RPC Info:
   timeout         invalid      unexpected         retries     requests
         0               0               6              95     1405375

I'm still scratching my head.  would tech-kern or tech-net be more
appropriate for further discussion?

-- 
  Aaron J. Grier | "Not your ordinary poofy goof." | agrier@poofygoof.com