Current-Users archive


Severe netbsd-6 NFS server-side performance issues



All,

we have a netbsd-6 i386 NFS file server that, seemingly out of the blue,
decided a few weeks ago to stop performing...

After an upgrade of the RAID controller to a PCI-X MegaRAID 320-4X, some
tweaking

[sysctl.conf]
kern.maxvnodes=1048576
kern.somaxkva=16777216
kern.sbmax=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.udp.sendspace=262144
net.inet.udp.recvspace=1048576

[kernel]
options         NVNODE=524288
options         NKMEMPAGES_MAX=131072   # for 450 MB arena limit
options         SB_MAX=1048576          # maximum socket buffer size
options         SOMAXKVA=16777216       # 16 MB -> 64 MB
options         TCP_SENDSPACE=262144    # default send socket buffer size
options         TCP_RECVSPACE=262144    # default recv socket buffer size
options         NMBCLUSTERS=65536       # maximum number of mbuf clusters

and straightening out a few kinks (kern/46136), we had seen 60+ MBytes/sec
i/o under network load from the machine according to 'systat vmstat', with
ample spare bandwidth left.
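
For completeness' sake: the sysctl.conf entries only take effect at boot
('sysctl -w' applies a value immediately), so the values actually in effect,
and the mbuf situation, can be cross-checked at runtime along these lines:

% sysctl kern.maxvnodes kern.somaxkva kern.sbmax
% sysctl net.inet.tcp.recvbuf_max net.inet.tcp.sendbuf_max
% sysctl net.inet.udp.sendspace net.inet.udp.recvspace
% netstat -m            # mbuf/cluster usage, requests for memory denied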

The machine serves from a 1600 GB RAID-5 array (7x 300 GB + hot spare,
write-back) whose bandwidth is more like (bonnie++)

Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
server          16G 38982  14 38414   7 32969   7 187933  89 288509  32 683.9   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  6392  97 +++++ +++ +++++ +++  6698  99  7841  91 16432  99

-- "good enough". Serving NFS i/o demands (mostly UDP) over gigabit ethernet

% ifconfig wm0
wm0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx,TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=3f80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx,TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx>
        address: 00:30:48:d7:0a:78
        media: Ethernet autoselect (1000baseT full-duplex)
        status: active
        inet 130.83.xx.yy netmask 0xfffffff0 broadcast 130.83.xx.zz
        inet6 fe80::230:48ff:fed7:a78%wm0 prefixlen 64 scopeid 0x1
%

to ~30 clients (Ubuntu 10), the i/o bandwidth according to 'systat vmstat'
currently shows 100% busy at a meagre ~10 MBytes/sec. As a result, the
machine eventually clogs up and has to be rebooted by a cron job that checks
for nfsd sitting in state 'D' for an extended time.
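
Since the clients mostly talk UDP, the server can at least be checked for
datagrams dropped for lack of socket buffer space, plain interface errors,
and fragment reassembly trouble (NFS over UDP fragments heavily), roughly
like this:

% netstat -i            # per-interface input/output errors
% netstat -p udp        # drops due to full socket buffers
% netstat -p ip         # fragments dropped / reassembly timeouts

The reboot watchdog is nothing fancy, by the way; it amounts to a check of
the following shape (a sketch only -- the timings, the log tag and the
reboot step are placeholders):

#!/bin/sh
# Reboot if nfsd stays stuck in disk wait ('D') across two samples
# taken five minutes apart.
if ps -ax -o state,comm | grep -q '^D.*nfsd'; then
        sleep 300
        if ps -ax -o state,comm | grep -q '^D.*nfsd'; then
                logger -t nfswatch "nfsd stuck in state 'D', rebooting"
                shutdown -r now "nfsd wedged"
        fi
fi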

Going back to an old kernel didn't make a difference, which suggests it is
not just a stray kernel parameter. OTOH, there is not much to tweak in
userland for an NFS server.
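
About the only userland knobs I can see are the number of nfsd threads and
the statistics from nfsstat(1), e.g. (the thread count below is just a guess
at something worth trying, not our current setting):

[rc.conf]
nfs_server=YES
nfsd_flags="-tun 16"    # serve TCP and UDP, 16 server threads

% nfsstat -s            # server-side RPC counts and request-cache statistics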

My question to you: what in the system is eating the machine's performance,
and where? What parameters am I missing, and what knobs besides the ones
above could I twist?

Comments much appreciated,

hauke


-- 
     The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
     Respect for open standards              Ruf +49-6151-16-3281

