NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/46463: netbsd-6: panic and filesystem corruption running tmux



The following reply was made to PR kern/46463; it has been noted by GNATS.

From: christos%zoulas.com@localhost (Christos Zoulas)
To: gnats-bugs%NetBSD.org@localhost, martin%NetBSD.org@localhost, 
gnats-admin%netbsd.org@localhost, 
        netbsd-bugs%netbsd.org@localhost, rhansen%bbn.com@localhost
Cc: 
Subject: Re: kern/46463: netbsd-6:  panic and filesystem corruption running tmux
Date: Thu, 31 May 2012 09:18:53 -0400

 On May 31, 10:10am, martin%duskware.de@localhost (Martin Husemann) wrote:
 -- Subject: Re: kern/46463: netbsd-6:  panic and filesystem corruption runnin
 
 | The following reply was made to PR kern/46463; it has been noted by GNATS.
 | 
 | From: Martin Husemann <martin%duskware.de@localhost>
 | To: gnats-bugs%NetBSD.org@localhost
 | Cc: 
 | Subject: Re: kern/46463: netbsd-6:  panic and filesystem corruption running 
tmux
 | Date: Thu, 31 May 2012 12:08:22 +0200
 | 
 |  So it turns out (subject to interpretation in detail and to be discussed
 |  on tech-kern) that we see a side effect of two bugs here.
 |  
 |  First, at the application level, in tmux, there is something wrong.
 |  The client attaching to a already established session does:
 |   (1) call kqueue() to be able to receive kevents
 |   (2) connect via a local socket to the already running tmux instance
 |   (3) pass stdin/stdout/stderr vis sendmsg with SCM_RIGHTS to the session mux
 |  
 |  Now in the case that triggered this PR, tmux was invoked with stdin closed.
 |  Not suprising, kqueue() in (1) returns 0. Later in (3), this file descriptor
 |  is passed to the mux (which expects to receive stdin).
 |  
 |  So, adding something like this:
 |  
 |    while ((fd = open("/dev/null", O_RDWR)) < 3)
 |       ;
 |    close(fd);
 |  
 |  early in tmux would make sure this does not happen, replacing the missing
 |  descriptors with /dev/null.
 |  
 |  Now for the kernel part. A kqueue was intended to be a local-proc-only 
thing.
 |  There is no sense passing it from one process to another. That is why they
 |  are not inherited during fork() for example. However, and I think this is a
 |  bug, they can be passed via SCM_RIGHTS messages.
 |  
 |  I can reproduce the crash with this test program:
 |  
 |  #include <stdio.h>
 |  #include <stdlib.h>
 |  #include <string.h>
 |  #include <unistd.h>
 |  #include <err.h>
 |  #include <sys/event.h>
 |  #include <sys/time.h>
 |  #include <sys/socket.h>
 |  #include <sys/wait.h>
 |  
 |  static int
 |  child_proc(int fd)
 |  {
 |      int kq, storage = -1;
 |      struct cmsghdr *msg;
 |      struct iovec iov;
 |      struct msghdr m;
 |  
 |      msg = malloc(CMSG_SPACE(sizeof(int)));
 |      iov.iov_base = &storage;
 |      iov.iov_len = sizeof(int);
 |      m.msg_iov = &iov;
 |      m.msg_iovlen = 1;
 |  
 |      if (recvmsg(fd, &m, 0) == -1)
 |              err(1, "child: could not recvmsg");
 |  
 |      kq = *(int *)CMSG_DATA(msg);
 |      printf("child (pid %d): received kq fd %d\n", getpid(), kq);
 |  
 |      return 0;
 |  }
 |  
 |  int main(void)
 |  {
 |      pid_t child;
 |      int s[2], storage, status, kq;
 |      struct cmsghdr *msg;
 |      struct iovec iov;
 |      struct msghdr m;
 |      struct kevent ev;
 |  
 |      kq = kqueue();
 |      if (kq < 0)
 |              err(1, "could not create kqueue");
 |  
 |      if (socketpair(AF_LOCAL, SOCK_STREAM, 0, s) == -1)
 |              err(1, "could not create a socket pair");
 |  
 |  
 |      child = fork();
 |      if (child == 0) {
 |              close(s[0]);
 |              return child_proc(s[1]);
 |      }
 |      close(s[1]);
 |  
 |      msg = malloc(CMSG_SPACE(sizeof(int)));
 |      iov.iov_base = &storage;
 |      iov.iov_len = sizeof(int);
 |  
 |      msg->cmsg_level = SOL_SOCKET;
 |      msg->cmsg_type = SCM_RIGHTS;
 |      msg->cmsg_len = CMSG_LEN(sizeof(int));
 |  
 |      m.msg_iov = &iov;
 |      m.msg_iovlen = 1;
 |      m.msg_name = NULL;
 |      m.msg_namelen = 0;
 |      m.msg_control = msg;
 |      m.msg_controllen = CMSG_SPACE(sizeof(int));
 |      *(int *)CMSG_DATA(msg) = kq;
 |  
 |      EV_SET(&ev, 1, EVFILT_TIMER, EV_ADD|EV_ENABLE, 0, 1, 0);
 |      if (kevent(kq, &ev, 1, NULL, 0, NULL) == -1)
 |              err(1, "could not add timer event");
 |  
 |      printf("parent (pid %d): sending kq fd %d\n", getpid(), kq);
 |      if (sendmsg(s[0], &m, 0) == -1)
 |              err(1, "could not sendmsg");
 |  
 |      close(kq);
 |  
 |      waitpid(child, &status, 0);
 |  
 |      return 0;
 |  }
 |  
 |  
 |  IMHO the correct fix would be to filter kqueue descriptors out of SCM_RIGHTS
 |  at the sending side and document this (in unp_internalize break out and
 |  return EPERM if there is a file_t with f_type == DTYPE_KQUEUE).
 
 What's the harm in letting them through and making them work properly in
 the new process?
 
 christos
 


Home | Main Index | Thread Index | Old Index