NetBSD-Bugs archive


kern/41566: pty(4) handling under NetBSD-5 is broken



>Number:         41566
>Category:       kern
>Synopsis:       It is possible for pty(4) master and slave processes to deadlock, causing the processes to get stuck forever.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jun 10 07:50:00 +0000 2009
>Originator:     Brian Buhrow
>Release:        NetBSD 5.0_STABLE
>Organization:
        NFB of California
>Environment:
        
        
System: NetBSD arathorn.via.net 5.0 NetBSD 5.0 (GENERIC) #0: Sun Apr 26 18:50:08 UTC 2009 builds%b6.netbsd.org@localhost:/home/builds/ab/netbsd-5-0-RELEASE/i386/200904260229Z-obj/home/builds/ab/netbsd-5-0-RELEASE/src/sys/arch/i386/compile/GENERIC i386

Architecture: i386
Machine: i386
>Description:
        
        The problem seems to be that if the master process writes a lot of
data to a pty, the slave process on the corresponding tty can fall behind,
at which point something in the kernel stops processing either the master
or the slave.
The following script shows the problem and how to repeat it using the test
program provided below.  Note that if this program is run under NetBSD-4.x
or earlier, it runs forever, printing input and output lines until it is
manually terminated, which is what it should do.
        Under NetBSD-5, however, both processes end up sleeping on the
ttyraw wait channel, according to ps -l output, as shown.
        I do not know how to look further into the kernel to see what is
going wrong, but I'm hoping that this PR, along with the test program,
which fails reliably under NetBSD-5 and works reliably under all other
versions of NetBSD, will inspire some assistance in this effort.
        This appears to me to be a serious bug which should be fixed and
then pulled up to the NetBSD-5 branch as soon as possible.

-thanks
-Brian

Script started on Wed Jun 10 00:25:44 2009
%./ptytest&
[1] 1756
%./ptytest: Master process(1756) is writing to slave process (3339)
./ptytest: Using pty /dev/ttyp2
3339: Read 26 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 28 bytes to master pty
3339: Read 26 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 52 bytes to master pty
3339: Read 50 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 76 bytes to master pty
3339: Read 74 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 100 bytes to master pty
3339: Read 98 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 124 bytes to master pty
3339: Read 122 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 148 bytes to master pty
3339: Read 146 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 172 bytes to master pty
3339: Read 170 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 196 bytes to master pty
3339: Read 194 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 220 bytes to master pty
3339: Read 218 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 244 bytes to master pty
3339: Read 242 bytes from pty
3339: Read 2 bytes from pty
1756: Wrote 268 bytes to master pty
3339: Read 266 bytes from pty
1756: Wrote 28 bytes to master pty
1756: Wrote 52 bytes to master pty
1756: Wrote 76 bytes to master pty
1756: Wrote 100 bytes to master pty
1756: Wrote 124 bytes to master pty
1756: Wrote 148 bytes to master pty
1756: Wrote 172 bytes to master pty
1756: Wrote 196 bytes to master pty
3339: Read 2 bytes from pty
3339: Read 26 bytes from pty
3339: Read 2 bytes from pty
3339: Read 50 bytes from pty
3339: Read 2 bytes from pty
3339: Read 74 bytes from pty
3339: Read 2 bytes from pty
3339: Read 98 bytes from pty
3339: Read 2 bytes from pty
3339: Read 122 bytes from pty
3339: Read 2 bytes from pty
3339: Read 146 bytes from pty
3339: Read 2 bytes from pty
3339: Read 170 bytes from pty
3339: Read 2 bytes from pty
3339: Read 194 bytes from pty
3339: Read 2 bytes from pty

[processes hang at this point]

ps -l1756
UID  PID PPID CPU PRI NI  VSZ RSS WCHAN  STAT TTY      TIME COMMAND
100 1756 3460   0  85  0 2896 776 ttyraw S    ttyp1 0:00.01 ./ptytest 
%ps -l3339
UID  PID PPID CPU PRI NI  VSZ RSS WCHAN  STAT TTY      TIME COMMAND
100 3339 1756   0  85  0 2896 644 ttyraw S    ttyp1 0:00.00 ./ptytest 
%
%pstat -t |grep 'ttyp2'
ttyp2   124  0 1024 1248 256     82 OC            0     0 termios
%fg
./ptytest
%exit
%exit

Script done on Wed Jun 10 00:27:23 2009
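
The read sizes in the script above follow from simple arithmetic if one
assumes the slave tty is in canonical mode with ICRNL set, so that each '\r'
completes a line.  The report does not state the tty mode, so treat that as
an assumption.  Each write is 24 bytes longer than the previous one, and the
extra bytes form an unterminated tail that is delivered together with the
first line of the next write (24 + 26 = 50, 48 + 26 = 74, and so on).  The
following is a minimal sketch of that model; model_round() is a hypothetical
helper and is not part of ptytest.c.

```c
#include <stddef.h>
#include <string.h>

/* The 28-byte string the master starts each round with, as in ptytest.c. */
#define BASE "q {Subject: June Monitor}\rd\r"

/*
 * Hypothetical model, not kernel code: feed one round of 11 master writes
 * into a line buffer that already holds `pending` unterminated bytes.
 * Every '\r' is assumed to end a line (canonical mode with ICRNL); the
 * length of each completed line is stored in reads[].  Returns the number
 * of bytes still pending after the round.
 */
size_t model_round(size_t pending, size_t *reads, size_t max_reads,
                   size_t *nreads)
{
        char buf[512];
        size_t len, i, j;

        *nreads = 0;
        strcpy(buf, BASE);
        for (i = 0; i < 11; i++) {
                len = strlen(buf);
                /* One write(2): every byte joins the current line. */
                for (j = 0; j < len; j++) {
                        pending++;
                        if (buf[j] == '\r') {
                                if (*nreads < max_reads)
                                        reads[(*nreads)++] = pending;
                                pending = 0;
                        }
                }
                /* Same growth step as ptytest.c: append the first 24 bytes. */
                memcpy(buf + len, buf, 24);
                buf[len + 24] = '\0';
        }
        return pending;
}
```

Starting from an empty buffer, the model yields reads of 26, 2, 26, 2, 50, 2,
and so on up to 242, 2, and leaves 240 bytes pending.  A second round that
starts with 240 bytes pending begins with a 266-byte read, which matches the
266-byte read in the transcript.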

>How-To-Repeat:
        

        I don't know how to fix the problem, but the following test program,
whose output is shown above, reliably reproduces the problem for me on
every NetBSD-5 system I've tried.

To compile:
cc -O -o ptytest ptytest.c -lutil

/**************************************************************************
NAME: Brian Buhrow
DATE: June 9, 2009
PURPOSE: The purpose of this test program is to see if we can figure out
why ptys don't seem to work right under NetBSD-5.x.  There seems to be some
sort of deadlock issue between the pty master and the slave under certain
conditions, where the pty gets data between the master and the slave, and
each is waiting for something to happen.
The master is waiting for a write(2) to complete, and the slave is waiting
for read(2) to complete.
This works fine under NetBSD-4.x and earlier, but NetBSD-5 seems to have a
problem.
This is a test program which should easily reproduce the problem.
**************************************************************************/

#ifndef LINT
static char rcsid[] = "$Id$";
#endif /*LINT*/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <util.h>
#include <sys/types.h>

/*Slave reading process*/

int slave(int slavefd)
{
        char buf[512], *ptr;
        int bytesread;
        pid_t slpid;

        slpid = getpid();
        while(1) {
                bytesread = read(slavefd, buf, sizeof(buf));
                printf("%d: Read %d bytes from pty\n",slpid,bytesread);
                if (bytesread < 0) {
                        perror("Error reading from pty");
                        exit(1);
                }
        }

        exit(0); /*not reached*/
}

/*Master writing process*/
int master(int masterfd)
{
        char buf[512], *ptr;
        int outbytes,i;
        pid_t curpid;

        curpid = getpid();

        while(1) {
                sprintf(buf, "q {Subject: June Monitor}\rd\r");
                for (i = 0;i < 11;i ++) {
                        outbytes = write(masterfd, buf, strlen(buf));
                        ptr = buf;
                        strncat(buf, ptr, 24);
                        printf("%d: Wrote %d bytes to master pty\n",curpid,outbytes);
                        if (outbytes < 0) {
                                perror("write");
                        }
                }
                bzero(buf, sizeof(buf));
                sleep(5);
        }

        exit(0); /*not reached*/
}
                

int main(int argc, char **argv)
{
        pid_t child;
        int masterfd, slavefd, status;
        char ptyname[256];

        status = openpty(&masterfd, &slavefd, ptyname, NULL, NULL);
        if (status < 0) {
                perror("Openpty");
                exit(1);
        }
        child = fork();
        if (child < 0) {
                perror("fork");
                exit(1);
        }
        if (child) {
                printf("%s: Master process(%d) is writing to slave process (%d)\n",
                argv[0],getpid(),child);
                printf("%s: Using pty %s\n",argv[0],ptyname);
                master(masterfd);
        } else {
                slave(slavefd);
        }

        /*not reached*/
        exit(0);
}
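
When chasing a hang like this, it can help for the master to report a stall
instead of sleeping in write(2) indefinitely.  The sketch below is one way
to do that with poll(2).  write_with_timeout() is a hypothetical helper that
is not part of the test program above, and whether poll(2) reports the
master side as unwritable in the failing state is not established by this
report.  Note also that on a blocking descriptor the write(2) itself can
still sleep if the data does not fit in the space poll(2) reported.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to become
 * writable, then issue a single write(2).  Returns the write(2) result, or
 * -1 with errno set to ETIMEDOUT if fd never became writable in time.
 */
ssize_t write_with_timeout(int fd, const void *buf, size_t len,
                           int timeout_ms)
{
        struct pollfd pfd;
        int ready;

        assert(buf != NULL && timeout_ms >= 0);

        pfd.fd = fd;
        pfd.events = POLLOUT;
        pfd.revents = 0;

        /* Retry if a signal interrupts the wait. */
        do {
                ready = poll(&pfd, 1, timeout_ms);
        } while (ready < 0 && errno == EINTR);

        if (ready < 0)
                return -1;              /* poll(2) failed; errno is set */
        if (ready == 0) {
                errno = ETIMEDOUT;      /* nothing became writable */
                return -1;
        }
        return write(fd, buf, len);
}
```

In master(), the call write(masterfd, buf, strlen(buf)) could be replaced
with write_with_timeout(masterfd, buf, strlen(buf), 5000), so that a stall
shows up as a timeout in the perror() output rather than as a silent hang.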

>Fix:
        
        Don't know at this time.  Suggestions welcome.

>Unformatted:
        
        

