tech-kern archive


very bad behavior on overquota writes

I've been looking at performance issues on our NFS server, which I tracked
down to overquota writes. The problem is caused by software that does
writes without error checking. When this happens, the nfsd threads become
100% busy, and NFS requests from other clients can be delayed by
several seconds.
To reproduce this, I've used the attached program. Basically it does an
endless write, without error checking. I first ran it on an NFS client against
a test NFS server and could reproduce the problem. Then I ran it
directly on the server against the ffs-exported filesystem, and
could see similar behavior:
when the uid running it is overquota, the process starts using 100% CPU in
system and the number of write syscalls per second drops dramatically (from
about 170 to about 20). I can see there is still some write activity on the
disk (about 55 KB/s, 76 writes/s).

The problem is that by the time we notice we can't do the write, ffs_write()
has already done some things that need to be undone. One of them, which is
time consuming, is truncating the file back to its original size. Most of the
time is spent in genfs_do_putpages().

The problem here seems to be that we always do a page list walk because
endoff is 0. If the file is large enough to have lots of pages in core,
a lot of time is spent here.

The attached patch improves this a bit by not always using a list walk,
but I wonder if this could cause some pages to be lost until the vnode
is recycled. AFAIK neither v_writesize nor v_size can be shrunk without
genfs_do_putpages() being called, but I may have missed something.

I'll also see if we can take shortcuts in ffs_write, at least for some
trivial cases.

Manuel Bouyer <>
     NetBSD: 26 years of experience will always make the difference
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <err.h>
#include <stdint.h>
#include <sys/time.h>

int
main(int argc, const char *argv[])
{
	char buf[65536];
	int fd;
	int saved_errno = -1;
	struct timeval t1, t2;
	uint64_t ms1, ms2;
	int i;

	if (argc != 2)
		errx(1, "usage: %s file", argv[0]);
	fd = open(argv[1], O_WRONLY|O_CREAT, 0600);
	if (fd < 0)
		err(1, "open %s", argv[1]);
	memset(buf, 0, sizeof(buf));
	while (1) {
		if (gettimeofday(&t1, NULL) != 0)
			err(1, "gettimeofday");
		for (i = 0; i < 1000; i++) {
			if (write(fd, buf, sizeof(buf)) >= 0)
				errno = 0;
			if (errno != saved_errno) {
				saved_errno = errno;
				fprintf(stderr, "write: %s\n",
				    strerror(errno));
			}
		}
		if (gettimeofday(&t2, NULL) != 0)
			err(1, "gettimeofday");
		ms1 = (uint64_t)t1.tv_sec * 1000ULL +
		    (uint64_t)t1.tv_usec / 1000;
		ms2 = (uint64_t)t2.tv_sec * 1000ULL +
		    (uint64_t)t2.tv_usec / 1000;
		printf("%f syscalls per second\n",
		    (float)(1000 * 1000) / (float)(ms2 - ms1));
	}
}

Index: genfs_io.c
RCS file: /cvsroot/src/sys/miscfs/genfs/genfs_io.c,v
retrieving revision
diff -u -p -u -r1.53.8.1 genfs_io.c
--- genfs_io.c  7 May 2012 03:01:12 -0000
+++ genfs_io.c  21 Nov 2012 13:59:45 -0000
@@ -892,12 +900,16 @@ retry:
        error = 0;
        wasclean = (vp->v_numoutput == 0);
        off = startoff;
-       if (endoff == 0 || flags & PGO_ALLPAGES) {
-               endoff = trunc_page(LLONG_MAX);
+       if (endoff == 0) {
+               if (flags & PGO_ALLPAGES)
+                       endoff = trunc_page(LLONG_MAX);
+               else
+                       endoff = MAX(vp->v_writesize, vp->v_size);
        }
        by_list = (uobj->uo_npages <=
            ((endoff - startoff) >> PAGE_SHIFT) * UVM_PAGE_TREE_PENALTY);
         * if this vnode is known not to have dirty pages,
         * don't bother to clean it out.
