NetBSD-Users archive


Re: Very poor NFS I/O





On 28/02/2026 10:25 pm, BERTRAND Joël wrote:
MJ wrote:


On 28/02/2026 8:16 pm, BERTRAND Joël wrote:
MJ wrote:


On 28/02/2026 8:21 am, BERTRAND Joël wrote:
Michael van Elst wrote:
joel.bertrand%systella.fr@localhost (BERTRAND Joël) writes:

      CPU is an i7-4770, main memory 16 GB. This server exports /srv and
/home through NFS (V3/TCP, 128 threads, async), and disk I/O from NFS
clients is very slow. Server load can rise to 110 or 120 during heavy
NFS access.

Can you quantify what "slow" means? Any kind of benchmarks?

      Less than 2 MB/s.

The server load just shows that the NFS requests are distributed over
your server threads. But that is not related to any CPU utilization.

      I know. The NFS process never reaches 35% of one core.



As per Michael's email, re: isolating disk and client, can you isolate
which system it is slow serving to? I think that is important.

     Same result with Linux and FreeBSD. If I reduce the number of nfs
threads (currently 128), it seems to run better, but on the client side
I get "nfs server not responding".

     Now I only have two clients (one Linux, one FreeBSD). FreeBSD is idle
and I am running some tests on the Linux workstation.

1/ The Linux rootfs is on a Raid1 disk on the NetBSD server.
2/ iftop shows that the nfs server is idle too (a few Kbps).
3/ apt update && apt dist-upgrade is very slow (-dev packages with a lot
of small files).
4/ make -j1 kicad (9.0), with sources on a Raid5 volume, shows a mean nfs
throughput of around 40 Mbps. The nfs process uses 1 to 2% of server CPU.
5/ Now I start another compilation and the load rises on the server side:

load averages:  10.0,  3.17,  2.06;        up 16+01:28:30       10:10:27
93 processes: 92 sleeping, 1 on CPU
CPU states: 0.1% user, 0.0% nice, 0.1% system, 0.0% interrupt, 99.7% idle
Memory: 7838M Act, 3937M Inact, 49M Wired, 153M Exec, 9617M File, 84M Free
Swap: 16G Total, 16G Free / Pools: 3708M Used / Network: 2647K In, 8700K Out

    PID USERNAME PRI NICE   SIZE   RES STATE       TIME   WCPU    CPU COMMAND
   4326 root      85    0   600M 8248K nfsd/4    823:01  7.13%  7.13% nfsd
   2553 root      85    0    20M 2640K kqueue/3    2:59  4.88%  4.88% syslogd
      0 root     221    0     0K   64M rfnode/3  803:17  4.10%  4.10% [system]
    930 root      85    0    12M 1900K select/0    3:26  1.71%  1.71% rpc.lockd


6/ dd if=/dev/zero of=test.dd count=10 bs=100M
nfs throughput rises to 850 Mbps (iftop),
but the load average on the server side climbs to 56!

hilbert:[~] > dd if=/dev/zero of=test.dd count=10 bs=100M
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 69.9613 s, 15.0 MB/s
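
As a sanity check on the dd figure above, the rate can be recomputed from the byte count and the elapsed time. The sketch below also converts it to Mbit/s so it can be compared with the iftop reading. The function name `rate` is hypothetical, and the elapsed time must be given with a decimal point rather than a comma.

```shell
# rate BYTES SECONDS: print average throughput in MB/s (decimal) and Mbit/s
rate() {
    awk -v b="$1" -v s="$2" \
        'BEGIN { printf "%.1f MB/s, %.0f Mbit/s\n", b / s / 1000000, b * 8 / s / 1000000 }'
}

rate 1048576000 69.9613
```

Averaged over the whole run this is about 120 Mbit/s, well below the 850 Mbps peak seen in iftop.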


If the load is going up but not CPU usage, perhaps something is blocking
a process? Lock contention?

If you list the processes (ps), are there any in D state on the
server when you perform work over NFS?

	Only nfsd and the kernel:

legendre# ps auwx | grep ' D'
root       23258 68.9  0.1  614028   8312 ?      Dsl  12:14PM    0:36.32 /usr/sbin/nfsd -n 128
root           0  0.1  1.1       0 182868 ?      DKl  12Feb26  807:12.02 [system]
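
Note that `grep ' D'` matches any line containing a space followed by D, including command arguments. A stricter filter, sketched below, tests only the state column. The helper name `only_d` is hypothetical, and the `-o pid,stat,wchan,command` column list in the usage comment is an assumption that may need adjusting for a given ps implementation.

```shell
# only_d: keep the header line and the rows whose 2nd column (state)
# starts with D (uninterruptible wait)
only_d() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Assumed usage (column keywords may differ between ps implementations):
#   ps -axo pid,stat,wchan,command | only_d
```

Including the wait channel column may help show what the blocked threads are waiting on.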


Yikes. Have you tried running the server with far fewer threads? Personally, I can't see the benefit of such a high number of threads.
This is probably inviting contention.

My usual rule-of-thumb was (2 * cores ) + 2. Not scientific, just historically, "it worked".

So, have you tried running the server with -n 16 or -n 20 or similar?
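
The rule of thumb above works out as follows. This is only a minimal sketch: the function name `nfsd_threads` is hypothetical, and whether "cores" means physical cores or logical CPUs is left to the caller.

```shell
# nfsd_threads CORES: rule-of-thumb thread count, (2 * cores) + 2
nfsd_threads() {
    echo $(( 2 * $1 + 2 ))
}

nfsd_threads 4   # counting 4 physical cores
nfsd_threads 8   # counting 8 logical CPUs
```

For the i7-4770 (4 cores, 8 hardware threads) this gives 10 or 18, which is in the same range as the -n 16 or -n 20 suggested above.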

...
Max


