tech-kern archive

NFS vs jumbograms?



At $JOB, we have two i386 machines with wm interfaces connected
back-to-back with a short patch cable and a /30 subnet of 192.168.  One
is NFS-serving some disk space to the other over this link.  They are
running NetBSD 4.0.1 (with a few tweaks of mine, but nothing touching
sys/nfs); while that release has fallen off the end of official
support, I thought someone might happen to recall enough to be
useful....

I tried turning on jumbograms ("mtu 9000" when configuring the
interfaces).  It makes a modest but noticeable performance difference
(about 15% more throughput), but under circumstances whose boundaries
I haven't fully explored, readdir() operations hang.  That wedges the
operation, possibly the whole NFS subsystem, and in one case all of
userland (though I suspect that last case was a deadlock between NFS
and VM; it happened while userland was writing large amounts of data
over NFS).  The command to provoke it can be as simple as
"echo */*x*".

I used tcpdump to capture network traffic on each machine.  Some
packets are missing from one capture or the other when I compare them;
however, the contents of the missing packets strongly imply that
tcpdump failed to record them, not that the network lost them (one of
them, for example, includes the lookup that gives the file handle used
for the failing readdir).  Here's what the end of the tcpdump output
looks like:

09:01:32.108640 IP 192.168.255.253.3374830016 > 192.168.255.254.2049: 128 lookup fh 18,3/549376 "28"
09:01:32.108736 IP 192.168.255.254.2049 > 192.168.255.253.3374830016: reply ok 236 lookup fh 18,3/550916
09:01:32.108925 IP 192.168.255.253.3374830017 > 192.168.255.254.2049: 124 access fh 18,3/550916 0001
09:01:32.109007 IP 192.168.255.254.2049 > 192.168.255.253.3374830017: reply ok 120 access c 0001
09:01:32.109211 IP 192.168.255.253.3374830018 > 192.168.255.254.2049: 120 getattr fh 18,3/550916
09:01:32.109284 IP 192.168.255.254.2049 > 192.168.255.253.3374830018: reply ok 112 getattr DIR 755 ids 1000/0 sz 6144
09:01:32.109497 IP 192.168.255.253.3374830019 > 192.168.255.254.2049: 140 readdir fh 18,3/550916 8192 bytes @ 0
09:01:32.109700 IP 192.168.255.254.2049 > 192.168.255.253.3374830019: reply ok 8296 readdir
09:01:32.110353 IP 192.168.255.253.3374830020 > 192.168.255.254.2049: 112 readdir fh 18,3/550916 8192 bytes @ 4668
09:01:32.110506 IP 192.168.255.254.2049 > 192.168.255.253.3374830020: reply ok 2004 readdir
09:01:32.194778 IP 192.168.255.253.3374830020 > 192.168.255.254.2049: 112 readdir fh 18,3/550916 8192 bytes @ 4668
09:01:32.194910 IP 192.168.255.254.2049 > 192.168.255.253.3374830020: reply ok 2004 readdir
09:01:32.364768 IP 192.168.255.253.3374830020 > 192.168.255.254.2049: 112 readdir fh 18,3/550916 8192 bytes @ 4668
09:01:32.364917 IP 192.168.255.254.2049 > 192.168.255.253.3374830020: reply ok 2004 readdir
(the last request/reply pair above repeats, with only timestamps
changing, another dozen or so times)
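
For anyone who wants to poke at a trace like the one above, here is a
minimal sketch in Python of one way to pick out retransmitted requests.
It rests on assumptions rather than anything confirmed in this thread:
that the number tcpdump prints where the client's port would be is the
RPC transaction ID (xid), that the server side is always port 2049, and
that a repeated xid means a retransmission.  The names HEAD,
request_times and retransmit_gaps are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the first line of each packet summary. Continuation lines
# (the wrapped "readdir fh ..." parts) simply fail to match and are skipped.
HEAD = re.compile(
    r"^(\d\d):(\d\d):(\d\d\.\d+) IP "
    r"\d+\.\d+\.\d+\.\d+\.(\d+) > \d+\.\d+\.\d+\.\d+\.(\d+):"
)


def request_times(lines, nfs_port="2049"):
    """Map each client xid to the times (seconds since midnight) it was sent."""
    sent = defaultdict(list)
    for line in lines:
        m = HEAD.match(line)
        if m is None:
            continue
        hh, mm, ss, src_field, dst_field = m.groups()
        if dst_field != nfs_port:
            continue  # server-to-client reply, not a request
        sent[src_field].append(int(hh) * 3600 + int(mm) * 60 + float(ss))
    return dict(sent)


def retransmit_gaps(sent):
    """For xids sent more than once, return the gaps between successive sends."""
    return {
        xid: [round(b - a, 6) for a, b in zip(times, times[1:])]
        for xid, times in sent.items()
        if len(times) > 1
    }


# The readdir exchanges from the trace above, as tcpdump printed them.
TRACE = """\
09:01:32.109497 IP 192.168.255.253.3374830019 > 192.168.255.254.2049: 140
readdir fh 18,3/550916 8192 bytes @ 0
09:01:32.109700 IP 192.168.255.254.2049 > 192.168.255.253.3374830019: reply ok
8296 readdir
09:01:32.110353 IP 192.168.255.253.3374830020 > 192.168.255.254.2049: 112
readdir fh 18,3/550916 8192 bytes @ 4668
09:01:32.110506 IP 192.168.255.254.2049 > 192.168.255.253.3374830020: reply ok
2004 readdir
09:01:32.194778 IP 192.168.255.253.3374830020 > 192.168.255.254.2049: 112
readdir fh 18,3/550916 8192 bytes @ 4668
09:01:32.194910 IP 192.168.255.254.2049 > 192.168.255.253.3374830020: reply ok
2004 readdir
09:01:32.364768 IP 192.168.255.253.3374830020 > 192.168.255.254.2049: 112
readdir fh 18,3/550916 8192 bytes @ 4668
09:01:32.364917 IP 192.168.255.254.2049 > 192.168.255.253.3374830020: reply ok
2004 readdir
"""

if __name__ == "__main__":
    for xid, gaps in retransmit_gaps(request_times(TRACE.splitlines())).items():
        print(xid, gaps)
```

On the three transmissions shown, only xid 3374830020 repeats, with
gaps of roughly 0.084 s and 0.170 s.  The gap about doubling would be
consistent with an ordinary retransmit backoff on the client, though
three samples are not much to go on.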

This lockup is entirely repeatable as far as I can tell.  Setting the
MTU on both wm interfaces to 1500 made it go away.  I don't know, of
course, whether there might still be some variant of the condition that
would misbehave even with that MTU, but the stability of
NetBSD-to-NetBSD NFS in my past experience makes it seem unlikely.
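
To see which packets change shape between the two MTU settings, here
is a rough Python sketch of the fragmentation arithmetic.  It is not a
diagnosis, and it leans on assumptions that this post does not
confirm: that the mount runs over UDP on IPv4, that the byte counts
tcpdump prints (8296, 2004, and so on) are UDP payload sizes, and that
the IPv4 header carries no options (20 bytes).  The function name
ipv4_fragments is made up for illustration.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ipv4_fragments(udp_payload, mtu):
    """Number of IPv4 packets needed to carry one UDP datagram at this MTU."""
    to_carry = udp_payload + UDP_HEADER  # what IP has to transport
    if to_carry + IPV4_HEADER <= mtu:
        return 1  # fits in a single unfragmented packet
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(to_carry / per_fragment)


if __name__ == "__main__":
    # Sizes taken from the trace: two readdir replies and one request.
    for size in (8296, 2004, 112):
        for mtu in (1500, 9000):
            print(size, mtu, ipv4_fragments(size, mtu))
```

Under those assumptions, both readdir replies would be fragmented at
MTU 1500 (six fragments for the 8296-byte reply, two for the 2004-byte
one) but would each go out as a single large frame at MTU 9000, while
the requests fit in one packet either way.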

The machines are in production use, but they are idle much of each day,
so I probably can try experiments, if anyone has anything to suggest.
I also still have the pcap capture files, in case they might contain
anything of value.  For now, I'm just using MTU 1500.

Any thoughts?

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

