NetBSD-Bugs archive


Re: kern/46463: netbsd-6: panic and filesystem corruption running tmux



The following reply was made to PR kern/46463; it has been noted by GNATS.

From: Greg Oster <oster%cs.usask.ca@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: kern/46463: netbsd-6:  panic and filesystem corruption running
 tmux
Date: Fri, 18 May 2012 09:06:35 -0600

 On Thu, 17 May 2012 23:20:00 +0000 (UTC)
 rhansen%bbn.com@localhost wrote:
 
 > >Number:         46463
 > >Category:       kern
 > >Synopsis:       netbsd-6:  panic and filesystem corruption running tmux
 > >Confidential:   no
 > >Severity:       critical
 > >Priority:       high
 > >Responsible:    kern-bug-people
 > >State:          open
 > >Class:          sw-bug
 > >Submitter-Id:   net
 > >Arrival-Date:   Thu May 17 23:20:00 +0000 2012
 > >Originator:     Richard Hansen
 > >Release:        6.0_BETA
 > >Organization:
 > >Environment:
 > NetBSD 6.0_BETA NetBSD 6.0_BETA (GENERIC) i386
 > >Description:
 > On i386 GENERIC netbsd-6 (nightly build from around 2012-04-28), I
 > get a panic and significant filesystem corruption (ffs with log,
 > noatime) if I do the following:
 > 
 >   1. ssh to the netbsd-6 machine
 >   2. install the tmux-1.4nb1 pkgsrc package
 >   3. run 'tmux new-session'
 >   4. ssh to the same machine from a different terminal
 >   5. run 'tmux list-sessions 0<&-'
 > 
 > For some reason, closing stdin in step #5 above is required to
 > trigger the panic.
 > 
 > Using a lightly modified and older (2012-03-28) snapshot, I get the
 > following backtrace in the close() syscall:
 > 
 > #0  maybe_dump (howto=260)
 > at /usr/src/sys/arch/i386/i386/machdep.c:880 s = 0
 > #1  0xc071c8ca in cpu_reboot (howto=260, bootstr=0x0)
 >     at /usr/src/sys/arch/i386/i386/machdep.c:899
 >         syncdone = false
 >         s = 0
 > #2  0xc09a79c4 in vpanic (fmt=0xc0deb79c "kernel %sassertion \"%s\"
 > failed: file \"%s\", line %d ", ap=0xddcd8b58
 > "&#65533;\267300\024\270300 \267300v\005")
 > at /usr/src/sys/kern/subr_prf.c:308 cii = 0
 >         ci = 0x0
 >         oci = 0x0
 >         bootopt = 260
 >         scratchstr = "kernel diagnostic assertion
 > \"mutex_owned(&fdp->fd_lock)\" failed: file
 > \"/usr/src/sys/kern/kern_event.c\", line 1398 ", '\000' <repeats 97
 > times> #3  0xc0bbb8c3 in kern_assert (fmt=0xc0deb79c "kernel
 > times> %sassertion \"%s\" failed: file \"%s\", line %d ")
 > times> at /usr/src/sys/lib/libkern/kern_assert.c:50 ap = 0xddcd8b58
 > times> "&#65533;\267300\024\270300 \267300v\005"
 > #4  0xc0679e3c in kqueue_doclose (kq=0xc6487a58, list=0xc63b194c,
 > fd=0) at /usr/src/sys/kern/kern_event.c:1398
 >         kn = 0xc0c0
 >         fdp = 0xc64b1cc0
 > #5  0xc0679f00 in kqueue_close (fp=0xc5ecc200)
 > at /usr/src/sys/kern/kern_event.c:1430 kq = 0xc6487a58
 >         fdp = 0xc63b1940
 >         ff = 0xc63b1940
 >         i = 0
 > #6  0xc06737fb in closef (fp=0xc5ecc200)
 > at /usr/src/sys/kern/kern_descrip.c:824 lf = {l_start =
 > -4184506505861690440, l_len = -4160666348713672704, l_pid = 0, l_type
 > = 0, l_whence = 0} error = 0 #7  0xc0673411 in fd_close (fd=13)
 > at /usr/src/sys/kern/kern_descrip.c:709 lf = {l_start =
 > -4629759700944188308, l_len = 28, l_pid = -573731704, l_type =
 > -29420, l_whence = -8755} fdp = 0xc63b1940 ff = 0xc5eda580
 >         fp = 0xc5ecc200
 >         p = 0xc5dea370
 >         l = 0xc6425800
 >         refcnt = 0
 > #8  0xc09c6e5d in sys_close (l=0xc6425800, uap=0xddcd8cec,
 > retval=0xddcd8d14) at /usr/src/sys/kern/sys_descrip.c:486
 > No locals.
 > #9  0xc09d687b in sy_call (sy=0xc0f35608, l=0xc6425800,
 > uap=0xddcd8cec, rval=0xddcd8d14) at /usr/src/sys/sys/syscallvar.h:61
 >         error = 0
 > #10 0xc09d6c1e in syscall (frame=0xddcd8d48)
 > at /usr/src/sys/arch/x86/x86/syscall.c:179 callp = 0xc0f35608
 >         p = 0xc5dea370
 >         l = 0xc6425800
 >         error = 0
 >         code = 6
 >         rval = {0, 0}
 >         rip_call = -1146338251
 >         args = {13, 11, -1077945480, -1145308313, 64, -1077945516,
 > -1077945524, -1077945524, -1147969380, -1077945492} #11 0xc01005d6
 > in ?? () No symbol table info available.
 > Backtrace stopped: previous frame inner to this frame (corrupt stack?)
 > >How-To-Repeat:
 >   1. ssh to the netbsd-6 machine
 >   2. install the tmux-1.4nb1 pkgsrc package
 >   3. run 'tmux new-session'
 >   4. ssh to the same machine from a different terminal
 >   5. run 'tmux list-sessions 0<&-'
 > >Fix:
 
 The following patch from Martin Husemann gets rid of the panic for me.
 
 Does anyone else have comments on this patch?  (We think it's right, but
 hopefully someone who understands this code better will chime in...)
 
 Later...
 
 Greg Oster
 
 
 Index: kern_event.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/kern_event.c,v
 retrieving revision 1.75
 diff -c -r1.75 kern_event.c
 *** kern_event.c        25 Jan 2012 00:28:35 -0000      1.75
 --- kern_event.c        18 May 2012 15:03:42 -0000
 ***************
 *** 1421,1427 ****
         int i;
   
         kq = fp->f_data;
 !       fdp = curlwp->l_fd;
   
         mutex_enter(&fdp->fd_lock);
         for (i = 0; i <= fdp->fd_lastkqfile; i++) {
 --- 1421,1427 ----
         int i;
   
         kq = fp->f_data;
 !       fdp = kq->kq_fdp;
   
         mutex_enter(&fdp->fd_lock);
         for (i = 0; i <= fdp->fd_lastkqfile; i++) {
 

