Subject: more on nfsio ...
To: None <port-alpha@netbsd.org>
From: Stephen M. Jones <smj@cirr.com>
List: port-alpha
Date: 07/14/2004 20:24:27
I've tried this with 2.0F and it was suggested that I use 2.0G, but I
came up with the same results.

It appears that having iothreads set to 0 gives much better performance
than having any.

I'm starting with a system running no nfsio daemons and then using
sysctl to spawn a few.  Almost immediately the load increases and
the nfsio daemons go from netio/netrcvlk to a vnlock/nfsclock state.  The
only way to clear these is by going back to single user.  Killing them or
using sysctl to set iothreads to 0 doesn't work (though it does if you
can get to it before the state changes).

Also, with nfsio daemons you will more frequently run into delays running
basic commands on NFS mounted filesystems:

$ time ls -la 
load: 5.10  cmd: ls 6680 [vnlock] 0.00u 0.00s 0% 904k
load: 5.10  cmd: ls 6680 [vnlock] 0.00u 0.00s 0% 904k
load: 5.10  cmd: ls 6680 [vnlock] 0.00u 0.00s 0% 904k

   17.73s real     0.00s user     0.00s system

$ time ls -lart 
load: 5.72  cmd: ls 20167 [nfsrcvlk] 0.00u 0.00s 0% 896k
load: 5.72  cmd: ls 20167 [nfsrcvlk] 0.00u 0.00s 0% 896k
load: 5.72  cmd: ls 20167 [vnlock] 0.00u 0.00s 0% 904k
load: 6.31  cmd: ls 20167 [vnlock] 0.00u 0.00s 0% 904k
load: 6.28  cmd: ls 20167 [vnlock] 0.00u 0.00s 0% 904k

   11.30s real     0.00s user     0.00s system
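
To put numbers on these delays across different iothreads settings,
something like the following could be run repeatedly against the mount.
This is only a sketch: time_listing and the "." path are made-up
placeholders, and it uses Python rather than ls, so it measures a
directory read plus one stat per entry, which is roughly what ls -la does.

```python
import os
import time

def time_listing(path):
    """Return (seconds, entry_count) for one listing of path, stat-ing each entry."""
    start = time.monotonic()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # Ask for attributes of every entry, as ls -l would.
            entry.stat(follow_symlinks=False)
            count += 1
    return time.monotonic() - start, count

if __name__ == "__main__":
    # "." is a placeholder; point it at a directory on the NFS mount.
    for _ in range(3):
        secs, n = time_listing(".")
        print(f"{n} entries in {secs:.2f}s")
```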

Before setting iothreads to 8, I had dd writing a file of zeros to
the NFS mounted file system over and over again.  The first result below
is with iothreads set to 0.  You can see how performance got progressively
worse over time until I switched to single user mode and then back,
with iothreads set to 0 once again (the last result):

524288000 bytes transferred in 218.497 secs (2399520 bytes/sec)
500+0 records in
500+0 records out
524288000 bytes transferred in 242.588 secs (2161228 bytes/sec)
500+0 records in
500+0 records out
524288000 bytes transferred in 286.098 secs (1832546 bytes/sec)
500+0 records in
500+0 records out
524288000 bytes transferred in 353.918 secs (1481382 bytes/sec)
500+0 records in
500+0 records out
524288000 bytes transferred in 4392.467 secs (119360 bytes/sec)
Terminated

500+0 records in
500+0 records out
524288000 bytes transferred in 192.641 secs (2721580 bytes/sec)
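
For comparing runs, the dd summary lines above can be reduced to a
slowdown factor relative to the first run.  A minimal sketch, with the
numbers copied from the output above; parse_dd is a name I made up, and
the regex assumes the summary format dd printed here.

```python
import re

# Matches e.g. "524288000 bytes transferred in 218.497 secs (...)"
DD_LINE = re.compile(r"(\d+) bytes transferred in ([\d.]+) secs")

def parse_dd(text):
    """Return a list of (bytes, seconds) tuples, one per dd summary line."""
    return [(int(b), float(s)) for b, s in DD_LINE.findall(text)]

output = """\
524288000 bytes transferred in 218.497 secs (2399520 bytes/sec)
524288000 bytes transferred in 242.588 secs (2161228 bytes/sec)
524288000 bytes transferred in 286.098 secs (1832546 bytes/sec)
524288000 bytes transferred in 353.918 secs (1481382 bytes/sec)
524288000 bytes transferred in 4392.467 secs (119360 bytes/sec)
524288000 bytes transferred in 192.641 secs (2721580 bytes/sec)
"""

runs = parse_dd(output)
base_secs = runs[0][1]  # first run, iothreads set to 0
for nbytes, secs in runs:
    rate = nbytes / secs  # bytes per second, same figure dd reports
    print(f"{secs:9.3f}s  {rate / 1e6:6.2f} MB/s  {secs / base_secs:5.2f}x baseline time")
```

By this measure the fifth run took about 20 times as long as the first.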

FreeBSD has similar directory listing delays and usually runs 4
nfsiods by default (apparently you can't have 0, as sysctl will
correct you and set it to 1).  The FreeBSD delays are usually
in an nfsrcvlk state, and performance does seem to get progressively
worse the more nfsiods are running.  The manual states that
having no nfsiods is entirely possible.  The one that is left
running usually shows a state of sbwait or nfsrcvlk, but having
only 1 running gives less of a delay on user commands than with 4.
And as I stated, you also get a better transfer rate.  But why?
Aren't the I/O daemons supposed to improve performance on both systems?