tech-kern archive
Re: Serious WAPL performance problems
On Oct 23, 6:51pm, Edgar Fuß wrote:
} Subject: Serious WAPL performance problems
} We are facing some very serious file system performance problems on 6.0 which
} we attribute to WAPBL. Comparable 4.0.1 machines with softdep are performing
} much, much better. Having essentially skipped 5, I cannot easily compare log
} to softdep on identical hardware.
}
} The most prominent way to trigger the problem is running an svn update
} command on a certain repository (having a large number of files) with the
} working copy mounted over NFS. This will stall the file server's discs to
} the point where you get "NFS server not responding, still trying" messages.
} Tracing that svn update (both ktrace and tcpdump) reveals the unusual thing
} it does is creating some 2,500 .lock files scattered around the directory
} tree only to unlink all of them just seconds later.
} If you run that command with the working copy on a local (WAPBL) file
} system, it finishes in under 2 seconds, but running iostat shows that some
} seconds later, the disc (actually a RAID) the fs holding the wc is on is
} 100% busy for 18 seconds.
} If you access the same working copy over NFS, the update takes 20 to 30
} seconds. During that period, the discs are initially silent for 5-10 seconds,
} then 100% busy for 8-15 seconds, then silent for 5-7 seconds, busy for 5-10s,
} silent for 7-9s, busy for 17s. In case you didn't add the times: that too
} extends to after the command has finished.
} Running the same command on a 4.0.1 system with the wc on a (local, I didn't
} try NFS) fs with softdeps, it also takes under 2 seconds, but after that, the
} discs are completely silent save a two-second period some ten seconds later.
} There are similar issues (again, on 6 but not on 4) with svn checkout or a
} rm -rf of the wc.
}
} How to debug/analyze/tune this? While we can move our svn working copies
} from NFS to local storage, this sounds like a problem that can hit other
} users, too.
}
} Btw, PenguinOS's logging seems also not to have this issue: Having the wc
} on an ext3 fs also makes the disc busy for just a second or two.
>-- End of excerpt from Edgar Fuß
Hello. If possible, I suggest trying the latest 5.1 sources, which
contain the namei fixes David Hollan put into NetBSD-6 and would also let
you compare WAPBL and softdep performance directly. Having said that,
is it possible for you to get the output of ps -lax on the NFS server
during the 18-20 second window of complete busyness? Perhaps that will
tell us why it is that NFS processing ceases while all of the logs are
being played and written to disk.
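
To make the busy window easier to trigger on demand while collecting that
output, the lock-file pattern described in the quoted message can be
approximated without svn. The following is only a rough sketch: the
function name lock_churn, the directory naming, and the choice of one
.lock file per directory are assumptions for illustration, not what svn
actually does internally. It creates a number of empty .lock files spread
over separate directories and then unlinks them all.

```python
import os
import sys
import tempfile
import time


def lock_churn(root, n_locks=2500):
    """Create n_locks empty .lock files, one per subdirectory of root,
    then unlink them all. Returns (count, create_secs, unlink_secs).

    Hypothetical stand-in for the svn behaviour seen in the traces;
    the layout is an assumption, not svn's real working-copy format.
    """
    paths = []
    start = time.monotonic()
    for i in range(n_locks):
        subdir = os.path.join(root, "dir%05d" % i)
        os.makedirs(subdir, exist_ok=True)
        path = os.path.join(subdir, ".lock")
        # O_EXCL makes sure each call is a genuine file creation.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
        paths.append(path)
    created = time.monotonic()

    # Remove every lock file again; the directories are left in place.
    for path in paths:
        os.unlink(path)
    removed = time.monotonic()
    return len(paths), created - start, removed - created


if __name__ == "__main__":
    # Pass a directory on the file system under test (local or NFS);
    # without an argument a fresh temporary directory is used.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    count, t_create, t_unlink = lock_churn(target)
    print("%d locks: create %.2fs, unlink %.2fs" % (count, t_create, t_unlink))
```

Running it against a fresh directory on the exported file system while
watching iostat and ps -lax on the server should show whether the create
and unlink burst alone is enough to reproduce the stall. The timings it
prints only cover the system calls themselves, not the disc activity that
follows later.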
-thanks
-Brian