Re: Serious WAPBL performance problems
On Oct 23, 6:51pm, Edgar Fuß wrote:
} Subject: Serious WAPBL performance problems
} We are facing some very serious file system performance problems on 6.0 which
} we attribute to WAPBL. Comparable 4.0.1 machines with softdep are performing
} much, much better. Having essentially skipped 5, I cannot easily compare log
} to softdep on identical hardware.
} The most prominent way to trigger the problem is running an svn update
} on a certain repository (having a large number of files) with the working
} copy mounted over NFS. This will stall the file server's discs to the point where
} you get "NFS server not responding, still trying" messages.
} Tracing that svn update (both ktrace and tcpdump) reveals that the unusual
} thing it does is creating some 2,500 .lock files scattered around the
} directory tree only to unlink all of them just seconds later.
} If you run that command with the working copy on a local (WAPBL) file system,
} it finishes in under 2 seconds, but running iostat shows that some seconds
} later, the disc (actually a RAID) the fs holding the wc is on is 100% busy
} for 18 seconds.
} If you access the same working copy over NFS, the update takes 20 to 30
} seconds. During that period, the discs are initially silent for 5-10 seconds,
} then 100% busy for 8-15 seconds, then silent for 5-7 seconds, busy for 5-10s,
} silent for 7-9s, busy for 17s. In case you didn't add the times: that too
} extends to after the command has finished.
} Running the same command on a 4.0.1 system with the wc on a (local, I didn't
} try NFS) fs with softdeps, it also takes under 2 seconds, but after that, the
} discs are completely silent save a two-second period some ten seconds later.
} There are similar issues (again, on 6 but not on 4) with svn checkout or a
} rm -rf of the wc.
} How to debug/analyze/tune this? While we can move our svn working copies from
} NFS to local storage, this sounds like a problem that can hit other users.
} Btw, PenguinOS's logging seems also not to have this issue: Having the wc on
} ext3 fs also makes the disc busy for just a second or two.
>-- End of excerpt from Edgar Fuß
Hello. If possible, I suggest trying the latest 5.1 sources, which
contain the namei fixes David Hollan put into NetBSD-6 and would also let
you compare WAPBL and softdep performance directly. Having said that,
is it possible for you to get the output of ps -lax on the NFS server
during the 18-20 second window of complete busyness? Perhaps that will
tell us why it is that NFS processing ceases while all of the logs are
being played and written to disk.
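
To take svn and NFS out of the picture while testing, the create-then-unlink
pattern described above can be approximated with a small script. This is only
a rough sketch: the function name lock_churn, the directory layout and the
default counts (2500 files over 50 directories) are assumptions made for
illustration and are not taken from svn itself.

```python
import os
import tempfile
import time


def lock_churn(root, n_files=2500, n_dirs=50):
    """Create n_files empty .lock files spread over n_dirs subdirectories
    of root, then unlink them all.

    Returns (create_seconds, unlink_seconds). These are wall-clock times
    for the system calls only; any deferred log or disc activity happens
    later and has to be observed separately (for example with iostat).
    """
    # Hypothetical layout: dir000 .. dirNNN directly under root.
    dirs = []
    for i in range(n_dirs):
        d = os.path.join(root, "dir%03d" % i)
        os.makedirs(d, exist_ok=True)
        dirs.append(d)

    # Scatter the lock files round-robin over the directories.
    paths = [os.path.join(dirs[i % n_dirs], "f%05d.lock" % i)
             for i in range(n_files)]

    t0 = time.monotonic()
    for p in paths:
        # O_EXCL gives lock-file semantics: fail if the file already exists.
        fd = os.open(p, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
    t1 = time.monotonic()
    for p in paths:
        os.unlink(p)
    t2 = time.monotonic()
    return t1 - t0, t2 - t1
```

Calling lock_churn() with a directory on the file system under test, once on
the WAPBL machine and once on the softdep one, while iostat runs in another
terminal, should show whether the delayed burst of disc activity follows from
this pattern alone.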