
kern/46235: beating on ftruncate kills the system



>Number:         46235
>Category:       kern
>Synopsis:       beating on ftruncate kills the system
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 22 00:40:00 +0000 2012
>Originator:     David A. Holland
>Release:        NetBSD 5.99.65
>Organization:
>Environment:
System: NetBSD  5.99.65 NetBSD 5.99.65 (GENERIC) #65: Mon Feb 13 01:38:30 EST 2012 dholland@valkyrie:/disk/disk1/home/dholland/netbsd/quotafix/src/sys/arch/i386/compile/obj.i386/GENERIC i386
Architecture: i386
Machine: i386
>Description:

I wrote a simple test program to beat on ftruncate. It fairly reliably
kills the system in any of a number of ways. I have seen the following
failures, using a 5.99.65 qemu VM that I had lying around from testing
quota patches. I expect testing on real hardware (which is faster)
will make it die harder and faster; I also expect a multiprocessor
machine to have a similar effect.

(1)
fatal double fault in supervisor mode
trap type 13 code 80000000 eip c05656ae cs 8 eflags 202 cr2 c41e4fd0 ilevel 6
panic: trap
cpu0: Begin traceback...
printf_nolog(c0c2be4c,c3576f7c,c3576f7c,c05656ae,8,202,c41e4fd0,6,0,0) at netbsd:printf_nolog
trap_tss() at netbsd:trap_tss
--- trap via task gate ---
netbsd:static_vmems+0xe7c:
cpu0: End traceback...

(2)
system hangs; breaking into ddb reveals all or nearly all trunq
processes tstiled. It's not clear why.

(3)
system hangs; breaking into ddb reveals most trunq processes waiting
on uvn_fp1; a small number each on flt_noram1, tstile, flt_pmfail2.  I
saw one each waiting on vnode and flt_pmfail1. mount_mfs waits on
flt_noram1. cron and syslogd wait on uvn_fp1. (This one was done on
an mfs volume.)

(4)
system hangs; breaking into ddb shows infinite recursion starting
   uvm_fault_internal -> VOP_GETPAGES -> genfs_getpages ->
   uvm_pagermapout -> uvm_unmap_detach -> uvm_mapent_free ->
   pool_cache_put_paddr
and then looping through
   pool_cache_put_paddr -> pool_cache_put_slow -> pool_get ->
   pool_grow -> pool_page_alloc_meta -> vmem_alloc -> vmem_xalloc ->
   uvm_km_kmem_alloc -> pool_cache_put_paddr

(5)
system hangs; breaking into ddb shows many trunq processes waiting on
uvn_fp2, most of them showing no wchan and apparently runnable but not
running. Apparently the scheduler got wedged but it wasn't obvious
what broke.


The thing I was attempting to tickle is the lock release/reacquire
at line 354 of uvm_vnode.c, in uvm_vnp_setsize(). It seems to me that
the window this creates, though small, is dangerous. I'm not sure
whether I've actually managed to hit it yet, though.
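
For illustration, the dangerous pattern is roughly the following.
This is a userspace analogue using pthreads, not the actual uvm code;
the names (size_lock, obj_size, set_size) are all made up for the
sketch:

   --- racesketch.c ---
/*
 * Userspace sketch of a lock release/reacquire window, loosely
 * analogous to the pattern described above. Hypothetical names
 * throughout; this is NOT the uvm code.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static pthread_mutex_t size_lock = PTHREAD_MUTEX_INITIALIZER;
static long obj_size;

static void set_size(long newsize) {
   pthread_mutex_lock(&size_lock);
   if (newsize < obj_size) {
      /*
       * Dropping the lock to call out (the way uvm_vnp_setsize()
       * does around its page flush) opens the window: another
       * thread can change obj_size before we reacquire, so any
       * state computed above is stale by the time we get back.
       */
      pthread_mutex_unlock(&size_lock);
      usleep(1);   /* stand-in for the page flush */
      pthread_mutex_lock(&size_lock);
   }
   obj_size = newsize;   /* may clobber a concurrent update */
   pthread_mutex_unlock(&size_lock);
}

static void *hammer(void *arg) {
   int i;

   (void)arg;
   for (i=0; i<100000; i++) {
      set_size(random() % 1000000);
   }
   return NULL;
}

int main(void) {
   pthread_t t1, t2;

   pthread_create(&t1, NULL, hammer, NULL);
   pthread_create(&t2, NULL, hammer, NULL);
   pthread_join(t1, NULL);
   pthread_join(t2, NULL);
   printf("final size: %ld\n", obj_size);
   return 0;
}
   ------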

In the qemu environment I've been using, the most reliable way to
trigger a crash is to run trunq on an mfs volume of the default 8M
size. A 16M mfs crashes less often, but still crashes. (Note that I'm
using only 32M of RAM with qemu. I don't know if that's important, or
what happens with normal RAM sizes.) It is much less crashable on an
ffs volume; I'm not sure if this is because on-disk ffs is slower, or
because it's hitting mfs-specific wobblies, or what. Note that you
can't currently use tmpfs; tmpfs apparently objects to large sparse
files.
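
(If you want to check whether some other filesystem will take the
sparse truncates trunq issues, a quick probe along these lines should
tell you. The length 0x7fffffff is the largest value random() can
return, i.e. the biggest size trunq will ever ask for; "sparseprobe"
is just a name I made up.)

   --- sparseprobe.c ---
/* Quick probe: does this filesystem accept a large sparse ftruncate? */
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
   int fd;

   if (argc != 2) {
      errx(EXIT_FAILURE, "usage: %s file", argv[0]);
   }
   fd = open(argv[1], O_RDWR|O_CREAT, 0664);
   if (fd < 0) {
      err(EXIT_FAILURE, "%s", argv[1]);
   }
   if (ftruncate(fd, (off_t)0x7fffffff) != 0) {
      /* tmpfs is apparently expected to land here */
      warn("ftruncate to 0x7fffffff");
   }
   else {
      printf("large sparse truncate accepted\n");
   }
   close(fd);
   unlink(argv[1]);
   return EXIT_SUCCESS;
}
   ------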

I'm also somewhat concerned about how slowly trunq runs. ftruncate on
large files (sparse or not) has to do a fair amount of work, but even
so it seems to me that 256 ftruncates (the unit in which trunq prints
progress) should take milliseconds, not seconds. I suspect something
silly is going on, possibly large thundering herds.
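
(If anyone wants to chase that, a wrapper along these lines, dropped
into trunq.c in place of the plain ftruncate call in loop_ftruncate(),
would show where the time goes. This assumes CLOCK_MONOTONIC; the 10ms
threshold is an arbitrary choice.)

   static int timed_ftruncate(int fd, off_t len) {
      /* needs <time.h> in addition to trunq.c's existing headers */
      struct timespec a, b;
      double secs;
      int r;

      clock_gettime(CLOCK_MONOTONIC, &a);
      r = ftruncate(fd, len);
      clock_gettime(CLOCK_MONOTONIC, &b);
      secs = (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
      if (secs > 0.01) {
         fprintf(stderr, "slow ftruncate to %lld: %g secs\n",
                 (long long)len, secs);
      }
      return r;
   }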

>How-To-Repeat:

trunq -f testfile [-p numprocs] [-w]

You can also change the (otherwise fixed) random seed and the number
of loops done by each process, but there doesn't seem to be much need
to.

It works by forking off a bunch of processes, each of which loops
ftruncating the test file to a random length.

Large numbers of procs (400-800) tend to trigger crash (3); smaller
numbers (30-60ish) tend to trigger crash (2), but not always. I've
only seen crashes (1), (4), and (5) once each so far.
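
For example (path hypothetical), something like

   trunq -f /mfs/testfile -p 400

tends to give crash (3) on the 8M mfs setup described above, while

   trunq -f /mfs/testfile -p 48

tends to give crash (2).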

Adding -w, to fork a second batch of processes to write into the test
file at random offsets, doesn't seem to make it noticeably crashier,
at least so far.

   --- trunq.c ---
/*
 * Fork N subprocesses to beat on ftruncate.
 * With -w option, fork N more subprocesses that write into the file.
 */

#include <sys/types.h>
#include <sys/wait.h>
#include <err.h>
#include <fcntl.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void loop_ftruncate(int fd, unsigned numloops, unsigned myid) {
   unsigned i;
   off_t len;

   for (i=0; i<numloops; i++) {
      if (i % 256 == 0) {
         write(STDOUT_FILENO, "t", 1);
      }
      len = random();
      if (ftruncate(fd, len) != 0) {
         warn("truncater %d: ftruncate to %lld", myid, (long long)len);
      }
   }
}

static void loop_write(int fd, unsigned numloops, unsigned myid) {
   char buf[64];
   unsigned i;
   off_t pos;
   size_t len;
   ssize_t result;

   for (i=0; i<sizeof(buf); i++) {
      buf[i] = '@' + myid;
   }

   for (i=0; i<numloops; i++) {
      if (i % 256 == 0) {
         write(STDOUT_FILENO, "w", 1);
      }
      pos = random();
      len = random() % sizeof(buf);
      result = pwrite(fd, buf, len, pos);
      if (result < 0) {
         warn("writer #%d: pwrite %zu at %lld", myid, len, (long long)pos);
      }
      else if ((size_t)result != len) {
         warnx("writer #%d: pwrite %zu at %lld: short count %zd of %zu",
              myid, len, (long long)pos, result, len);
      }
   }
}

static void dowait(pid_t pid, const char *what, unsigned id, bool failed) {
   int ret, status;

   ret = waitpid(pid, &status, 0);
   if (ret < 0) {
      warn("%s %u (pid %d): waitpid", what, id, pid);
      return;
   }
   if (WIFEXITED(status)) {
      if (WEXITSTATUS(status) != 0) {
         warnx("%s %u: exit %d", what, id, WEXITSTATUS(status));
      }
   }
   else if (WIFSIGNALED(status)) {
      if (!failed || WTERMSIG(status) != SIGTERM) {
         warnx("%s %u: signal %d", what, id, WTERMSIG(status));
      }
   }
   else {
      warnx("%s %u: weird exit status %d", what, id, status);
   }
}

static void run_processes(int fd, unsigned numloops,
                          unsigned numprocs, bool dowriters) {
   unsigned i;
   pid_t tpids[numprocs];
   pid_t wpids[numprocs];
   unsigned long tseeds[numprocs];
   unsigned long wseeds[numprocs];
   bool failed;

   failed = false;

   for (i=0; i<numprocs; i++) {
      tpids[i] = -1;
      wpids[i] = -1;
   }

   /*
    * Do it this way so adding writers doesn't change the sequence of
    * sizes used by the truncaters.
    */
   for (i=0; i<numprocs; i++) {
      tseeds[i] = random();
   }
   for (i=0; i<numprocs; i++) {
      wseeds[i] = random();
   }

   for (i=0; i<numprocs; i++) {
      tpids[i] = fork();
      if (tpids[i] < 0) {
         failed = true;
         break;
      }
      if (tpids[i] == 0) {
         /* child */
         srandom(tseeds[i]);
         loop_ftruncate(fd, numloops, i);
         exit(0);
      }

      if (dowriters) {
         wpids[i] = fork();
         if (wpids[i] < 0) {
            failed = true;
            break;
         }
         if (wpids[i] == 0) {
            /* child */
            srandom(wseeds[i]);
            loop_write(fd, numloops, i);
            exit(0);
         }
      }
   }

   if (failed) {
      for (i=0; i<numprocs; i++) {
         if (tpids[i] != -1) {
            kill(tpids[i], SIGTERM);
         }
         if (wpids[i] != -1) {
            kill(wpids[i], SIGTERM);
         }
      }
   }

   for (i=0; i<numprocs; i++) {
      dowait(tpids[i], "truncater", i, failed);
      if (dowriters) {
         dowait(wpids[i], "writer", i, failed);
      }
   }
}

static void dotest(const char *file, unsigned numloops,
                   unsigned numprocs, bool dowriters) {
   int fd;

   fd = open(file, O_RDWR|O_CREAT|O_TRUNC, 0664);
   if (fd < 0) {
      err(EXIT_FAILURE, "%s", file);
   }

   run_processes(fd, numloops, numprocs, dowriters);

   if (close(fd)) {
      warn("%s: close", file);
   }
   if (unlink(file)) {
      warn("%s: unlink", file);
   }
}

static void usage(void) {
   fprintf(stderr, "usage: %s [options]\n", getprogname());
   fprintf(stderr, "   -f file      filename to use for testing [testfile]\n");
   fprintf(stderr, "   -l numloops  number of truncate loops to run [4096]\n");
   fprintf(stderr, "   -p numprocs  number of processes to fork [16]\n");
   fprintf(stderr, "   -r randseed  random seed [0x4159471]\n");
   fprintf(stderr, "   -w           write to file as well [false]\n");
}

int main(int argc, char *argv[]) {
   const char *file = "testfile";
   unsigned numloops = 4096;
   unsigned numprocs = 16;
   unsigned long seed = 0x4159471;
   bool dowriters = false;

   int ch;

   while ((ch = getopt(argc, argv, "f:l:r:p:w")) != -1) {
      switch (ch) {
       case 'f': file = optarg; break;
       case 'l': numloops = atoi(optarg); break;
       case 'p': numprocs = atoi(optarg); break;
       case 'r': seed = atol(optarg); break;
       case 'w': dowriters = true; break;
       default: usage(); return EXIT_FAILURE;
      }
   }
   if (optind != argc) {
      usage();
      return EXIT_FAILURE;
   }

   srandom(seed);
   dotest(file, numloops, numprocs, dowriters);
   return EXIT_SUCCESS;
}
   ------

>Fix:

No clue.

If the thing in uvm_vnode.c turns out to be the problem, fixing the
VOP_PUTPAGES interface to not release locks it didn't take (why is it
doing that?) would take care of it, but would probably be kind of
invasive for the -6 branch.

But, who knows if that's the problem... the observed symptoms could be
anything.


