Subject: nfsd server usage seems unevenly distributed.
To: netbsd-users@netbsd.org, tech-net@netbsd.org
From: Stephen M. Jones <smj@cirr.com>
List: netbsd-users
Date: 01/15/2004 16:58:27
I'm wondering if others with high-volume NFS traffic see numbers similar to
these. There are 8 clients and one server that handles all disk requests
(everything except local root/tmp/swap on the clients). Should the nfsd server
processes be sharing the load more evenly, or is it typical for one of them to
soak up nearly all of it? The clients and the server are all on the same
100Mbit network. I run 20 nfsd server processes, and both the clients and the
server have vfs.nfs.iothreads set to 32 (is that even necessary on the server?)
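For reference, in case anyone wants to compare setups, this is roughly what
I'm running; the rc.conf variable names below are from memory, so treat this
as a sketch and double-check against nfsd(8):

  # on the server: start 20 nfsd server processes, serving both UDP and TCP
  # (rc.conf fragment; variable names from memory)
  nfs_server=YES
  nfsd_flags="-tun 20"

  # on the clients (and, perhaps unnecessarily, the server): more async I/O threads
  sysctl -w vfs.nfs.iothreads=32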
I restarted the nfsd processes about a day and a half ago, and the ps output
now looks like this (columns: UID, PID, PPID, CPU, PRI, NI, VSZ, RSS, WCHAN,
STAT, TT, TIME, COMMAND):
0 28003 1 0 2 0 128 0 netcon IWs ?? 0:00.00 nfsd: mas
0 28004 28003 1 -5 0 88 808 biowait DL ?? 199:01.50 nfsd: ser
0 28005 28003 0 2 0 88 808 nfsd SL ?? 50:20.35 nfsd: ser
0 28006 28003 0 2 0 88 808 nfsd SL ?? 16:03.24 nfsd: ser
0 28007 28003 0 2 0 88 808 nfsd SL ?? 5:22.58 nfsd: ser
0 28008 28003 0 2 0 88 808 nfsd SL ?? 2:07.91 nfsd: ser
0 28009 28003 0 2 0 88 808 nfsd SL ?? 0:57.50 nfsd: ser
0 28010 28003 0 2 0 88 808 nfsd SL ?? 0:28.92 nfsd: ser
0 28011 28003 0 2 0 88 808 nfsd SL ?? 0:16.76 nfsd: ser
0 28012 28003 0 2 0 88 808 nfsd SL ?? 0:00.85 nfsd: ser
0 28013 28003 0 2 0 88 808 nfsd SL ?? 0:10.56 nfsd: ser
0 28014 28003 0 2 0 88 808 nfsd SL ?? 0:07.71 nfsd: ser
0 28015 28003 0 2 0 88 808 nfsd SL ?? 0:05.17 nfsd: ser
0 28016 28003 0 2 0 88 808 nfsd SL ?? 0:03.35 nfsd: ser
0 28017 28003 0 2 0 88 808 nfsd SL ?? 0:02.72 nfsd: ser
0 28018 28003 0 2 0 88 808 nfsd SL ?? 0:02.05 nfsd: ser
0 28019 28003 0 2 0 88 808 nfsd SL ?? 0:01.45 nfsd: ser
0 28020 28003 0 2 0 88 808 nfsd SL ?? 0:01.18 nfsd: ser
0 28021 28003 0 2 0 88 808 nfsd SL ?? 0:00.88 nfsd: ser
0 28022 28003 0 2 0 88 808 nfsd SL ?? 0:01.06 nfsd: ser
0 28023 28003 0 2 0 88 808 nfsd SL ?? 0:00.97 nfsd: ser
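The server-side statistics below are straight from nfsstat on the server; if
I'm remembering the flag right, the server-only view is just:

  # server-side counters only (plain `nfsstat` prints client and server stats)
  nfsstat -s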
Server Info:
RPC Counts: (1349693289 calls)
null getattr setattr lookup access
0 0% 432382982 32% 29862093 2% 156910800 11% 211268520 15%
readlink read write create mkdir
114881 0% 344561479 25% 85716861 6% 8129836 0% 81465 0%
symlink mknod remove rmdir rename
45180 0% 9329 0% 11038683 0% 68804 0% 477316 0%
link readdir readdirplus fsstat fsinfo
3030130 0% 8988705 0% 0 0% 55585926 4% 1248 0%
pathconf commit getlease vacated evicted
0 0% 1419051 0% 0 0% 0 0% 0 0%
noop
0 0%
Server Errors:
RPC errors faults
50855517 0
Server Cache Stats:
inprogress idem non-idem misses
1453354 1241067 84602 1331644494
Server Lease Stats:
leases maxleases getleases
0 0 0
Server Write Gathering:
writes write RPC OPs saved
85715391 85716861 1470 0%
I've noticed that many of the clients hit "not responding"/"is alive again"
episodes at busy times. I've been playing with mount_nfs options, including
the -r/-w read and write sizes, but have gone back to the defaults. I also
tried a TCP mount on one client, which I can't recommend, since it actually
performed much worse.
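For the record, what I was experimenting with looked roughly like this; the
export path, mount point, and sizes are just placeholders, not a recommendation:

  # UDP mount with explicit read/write sizes (the kind of -r/-w tweaking I tried)
  mount_nfs -r 8192 -w 8192 server:/export /mnt/export

  # the TCP mount I tried on one client (it performed much worse for me)
  mount_nfs -T server:/export /mnt/export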
My question is: is it a problem to have one nfsd pegged while all the others
sit nearly idle? How can requests be spread more evenly across the nfsd
processes? And if they were, would I see fewer of the not
responding/responding issues?