Current-Users archive

Re: Filesystem tests crashing host



Just a little more info...

The fault comes from the following code at lines 817-821 of src/sys/kern/kern_descrip.c (rev. 1.212, in which christos@ touched the close-on-exec code):

        if (fp->f_ops != NULL) {
                error = (*fp->f_ops->fo_close)(fp);
        } else {
                error = 0;
        }
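
To tie that source to the faulting instruction in the backtrace quoted below (call *0x30(%rax) in closef), here is a minimal sketch, not the real kernel headers. It ASSUMES that six function pointers precede fo_close in struct fileops and it uses one simplified function-pointer type for every slot; the names fileops_sketch, fo_close_offset and f_ops_from_cr2 are made up for illustration. If that assumption holds, then on an LP64 machine fo_close sits at offset 0x30, %rax holds fp->f_ops, and the reported cr2 would be f_ops + 0x30. That would mean f_ops was non-NULL (so the check passed) but pointed at unmapped memory.

```c
#include <stddef.h>
#include <stdint.h>

struct file;                                /* opaque; only pointers are used */
typedef int (*fileop_fn)(struct file *);    /* simplified: real ops differ in signature */

/*
 * Hypothetical stand-in for struct fileops.
 * ASSUMPTION: six entries come before fo_close.
 */
struct fileops_sketch {
	fileop_fn fo_read;
	fileop_fn fo_write;
	fileop_fn fo_ioctl;
	fileop_fn fo_fcntl;
	fileop_fn fo_poll;
	fileop_fn fo_stat;
	fileop_fn fo_close;
};

/* Byte offset used when loading fp->f_ops->fo_close (0x30 with 8-byte pointers). */
static size_t
fo_close_offset(void)
{
	return offsetof(struct fileops_sketch, fo_close);
}

/* Given the faulting address of that load, recover the f_ops value in %rax. */
static uint64_t
f_ops_from_cr2(uint64_t cr2)
{
	return cr2 - (uint64_t)fo_close_offset();
}
```

Under the same assumption, f_ops_from_cr2(0xffffffff80effaf0) on amd64 gives the value %rax would have held at the time of the fault, which can be compared against fp->f_ops in the stopped gdb session.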



On Thu, 14 Apr 2011, Paul Goyette wrote:

Please note that I currently have a qemu system stopped in gdb after hitting this bug. I posted the following backtrace yesterday, but it may have been missed:

        uvm_fault(0xffffffff80ca65e0, 0xffffffff80eff000, 1) -> e
        fatal page fault in supervisor mode
        trap type 6 code 0 rip ffffffff80464902 cs 8 rflags 286 cr2 ffffffff80effaf0 cpl 0 rsp ffff80000b72b9e0
        kernel: page fault trap, code=0
        Stopped in pid 12792.19 (rumpnfsd) at   netbsd:closef+0x5d:     call    *0x30(%rax)
        ?
        db{0}> bt
        closef() at netbsd:closef+0x5d
        fd_free() at netbsd:fd_free+0xb5
        exit1() at netbsd:exit1+0x10e
        sigexit() at netbsd:sigexit+0x182
        postsig() at netbsd:postsig+0xc5
        lwp_userret() at netbsd:lwp_userret+0x15c
        syscall() at netbsd:syscall+0x13a
        db{0}>

I can keep this environment available indefinitely, so if there's any information you'd like me to collect, just tell me!


On Thu, 14 Apr 2011, Andreas Gustafsson wrote:

Jukka Ruohonen wrote:
I was thinking about the VFS changes too. These panics appear to be related
to NFS somehow. Andreas, are you sure the time frame is correct?

Yes.  Before rmind's changes of 2011.04.11.22.31.43, there hadn't been a
single uvm_fault crash in my test runs since January, and after those
changes, every single test run has failed that way, though not always
in the same test:

 $ zgrep ' uvm_fault' 2011.0[2-4].*/test.log.gz
2011.04.11.22.31.43/test.log.gz: mountdhup: uvm_fault(0xc0b0f3c0, 0xc4ba1000, 1) -> 0xe
2011.04.11.22.37.10/test.log.gz: mountdhup: uvm_fault(0xc0b0f3c0, 0xc4bbf000, 1) -> 0xe
2011.04.12.00.21.10/test.log.gz: nfs_overwrite64k: uvm_fault(0xc0b0f3c0, 0xc4c10000, 1) -> 0xe
2011.04.12.07.54.16/test.log.gz: nfs_renamerace_dirs: uvm_fault(0xc0b0f3c0, 0xc40b5000, 1) -> 0xe
2011.04.12.08.39.26/test.log.gz: nfs_renamerace_dirs: uvm_fault(0xc0b0f3c0, 0xc4c04000, 1) -> 0xe
2011.04.12.08.40.34/test.log.gz: mountdhup: uvm_fault(0xc0b0f3c0, 0xc4b9f000, 1) -> 0xe
2011.04.13.12.40.54/test.log.gz: nfs_fillfs: uvm_fault(0xc0b0f3c0, 0xff29e000, 1) -> 0xe
2011.04.13.19.17.00/test.log.gz: nfs_fillfs: uvm_fault(0xc0b0f3c0, 0xc40b9000, 1) -> 0xe
2011.04.14.01.03.23/test.log.gz: mountdhup: uvm_fault(0xc0b0f3c0, 0xc4c05000, 1) -> 0xe
2011.04.14.07.06.52/test.log.gz: mountdhup: uvm_fault(0xc0b0f3c0, 0xc4b4d000, 1) -> 0xe

If you still doubt me, there's an easy way to prove me wrong: revert
the changes and see if the tests continue failing.
--
Andreas Gustafsson, gson%gson.org@localhost






-------------------------------------------------------------------------
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |                          | pgoyette at netbsd.org  |
-------------------------------------------------------------------------