Subject: Re: kern/32682: netbsd-3 ptyfs intermittent failure with Matlab
To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
From: Christos Zoulas <christos@zoulas.com>
List: netbsd-bugs
Date: 01/31/2006 17:55:02
The following reply was made to PR kern/32682; it has been noted by GNATS.
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/32682: netbsd-3 ptyfs intermittent failure with Matlab
Date: Tue, 31 Jan 2006 12:52:19 -0500
On Jan 31, 5:25pm, hf@spg.tu-darmstadt.de (Hauke Fath) wrote:
-- Subject: kern/32682: netbsd-3 ptyfs intermittent failure with Matlab
| >Number: 32682
| >Category: kern
| >Synopsis: netbsd-3 ptyfs intermittent failure with Matlab
| >Confidential: no
| >Severity: serious
| >Priority: medium
| >Responsible: kern-bug-people
| >State: open
| >Class: sw-bug
| >Submitter-Id: net
| >Arrival-Date: Tue Jan 31 17:25:00 +0000 2006
| >Originator: Hauke Fath <hf@spg.tu-darmstadt.de>
| >Release: NetBSD 3.0_STABLE
| >Organization:
| --
| /~\ The ASCII Ribbon Campaign Hauke Fath
| \ / No HTML/RTF in email Institut für Nachrichtentechnik
| X No Word docs in email TU Darmstadt
| / \ Respect for open standards Ruf +49-6151-16-3281
| >Environment:
|
|
| System: NetBSD Wintersberg 3.0_STABLE NetBSD 3.0_STABLE (SPG_PIII) #1: Mon Jan 23 18:52:48 CET 2006 hf@Heiligenberg:/var/obj/netbsd-builds/3_0/i386/sys/arch/i386/compile/SPG_PIII i386
| Architecture: i386
| Machine: i386
| >Description:
|
| With the pty subsystem that comes with NetBSD 3, Matlab
| expects to find its ptys in /dev/pts. Every once in a while,
| the required pty cannot be created, which results in Matlab 13
| issuing dire warnings ("...no background processes/job
| control/blah"), and Matlab 14 simply aborting.
|
| Sometimes the problem "goes away" after some tens of minutes,
| at other times it needs a reboot to "fix". It is more likely
| to appear with several users logged in on the machine.
|
| The end of a Matlab 14 ktrace looks like
|
| [...]
|
| 883 MATLAB NAMI "/dev/ptmx"
| 883 MATLAB RET open 7
| 883 MATLAB CALL ioctl(7,_IO('T',0x1,0),0xbfbf535c)
| 883 MATLAB RET ioctl 0
| 883 MATLAB CALL ioctl(7,_IOW('T',0x30,0x4),0xbfbf541c)
| 883 MATLAB GIO fd 7 read 40 bytes
| "\^D\0\0\0\^D\0\0\0/dev/null\0\0\0\0\0\0\0/dev/pts/4\0\0\0\0\0\0"
| 883 MATLAB RET ioctl 0
| 883 MATLAB CALL stat64(0xbfbf54f0,0xbfbf5440)
| 883 MATLAB NAMI "/emul/linux/dev/pts/4"
| 883 MATLAB NAMI "/dev/pts/4"
| 883 MATLAB RET stat64 0
| 883 MATLAB CALL statfs(0xbfbf54f0,0xbfbf64f0)
| 883 MATLAB NAMI "/emul/linux/dev/pts/4"
| 883 MATLAB NAMI "/dev/pts/4"
| 883 MATLAB RET statfs 0
| 883 MATLAB CALL ioctl(7,_IOR('T',0x31,0x4),0xbfbf6528)
| 883 MATLAB RET ioctl -1 errno -22 Invalid argument
| 883 MATLAB CALL ioctl(7,_IO('T',0x1,0),0xbfbf63cc)
| 883 MATLAB RET ioctl 0
| 883 MATLAB CALL ioctl(7,_IOW('T',0x30,0x4),0xbfbf648c)
| 883 MATLAB GIO fd 7 read 40 bytes
| "\^D\0\0\0\^D\0\0\0/dev/null\0\0\0\0\0\0\0/dev/pts/4\0\0\0\0\0\0"
| 883 MATLAB RET ioctl 0
| 883 MATLAB CALL stat64(0xbd3dd888,0xbfbf64b0)
| 883 MATLAB NAMI "/emul/linux/dev/pts/4"
| 883 MATLAB NAMI "/dev/pts/4"
| 883 MATLAB RET stat64 0
| 883 MATLAB CALL rt_sigaction(0x11,0xbfbf6200,0xbfbf6170,8)
| 883 MATLAB RET rt_sigaction 0
| 883 MATLAB CALL rt_sigprocmask(1,0xbfbf6380,0,8)
| 883 MATLAB RET rt_sigprocmask 0
| 883 MATLAB CALL open(0xbac662e0,0x8002,0)
| 883 MATLAB NAMI "/emul/linux/dev/pts/4"
| 883 MATLAB NAMI "/dev/pts/4"
| 883 MATLAB RET open -1 errno -13 Permission denied
| 883 MATLAB CALL rt_sigprocmask(1,0xbfbf02b0,0,8)
| 883 MATLAB RET rt_sigprocmask 0
| 883 MATLAB CALL kill(0x373, SIGABRT)
| 883 MATLAB RET kill 0
| 883 MATLAB PSIG SIGABRT SIG_DFL
| 883 MATLAB NAMI "MATLAB.core"
| 27966 MATLAB RET poll 0
| 27966 MATLAB CALL getppid
| 27966 MATLAB RET getppid 1
| 27966 MATLAB CALL kill(0x6db7, SIGKILL)
| 27966 MATLAB RET kill -1 errno -3 No such process
| 27966 MATLAB CALL kill(0x1518, SIGKILL)
| 27966 MATLAB RET kill 0
| 5400 MATLAB RET nanosleep -1 errno -4 Interrupted system call
| 5400 MATLAB PSIG SIGKILL SIG_DFL
| 27966 MATLAB PSIG SIGRT1 caught handler=0xbd4c2eb0 mask=(1,2,3,4,6,8,10,11,12,13,14,15,16,18,19,20,21,22,23,24,25,26,27,28,30,31,32,33))
| 27966 MATLAB CALL sigreturn(0x80a14b4)
| 27966 MATLAB RET sigreturn -1 errno -2 No such file or directory
| 27966 MATLAB CALL exit_group(0)
|
| where
|
| [hf@Wintersberg] /var/tmp > ll /dev/pts
| total 0
| 0 crw-rw-rw- 1 root wheel 5, 0 Jan 29 22:56 0
| 0 crw-rw-rw- 1 root wheel 5, 1 Jan 31 03:15 1
| 0 crw-rw-rw- 1 root wheel 5, 2 Jan 31 00:17 2
| 0 crw--w---- 1 cbrown tty 5, 3 Jan 20 16:48 3
| 0 crw--w---- 1 hf tty 5, 5 Jan 31 18:04 5
| [hf@Wintersberg] /var/tmp >
|
| The Matlab core and ktrace.out are at
| http://www.spg.tu-darmstadt.de/~hf/netbsd/matlab-ptyfs-pr.tar.bz2
| (4.3 MB).
|
| >How-To-Repeat:
|
| Start Matlab 13/14 on a NetBSD/i386 3 machine. Try a few
| times, from different user accounts.
Please show what w(1) prints, along with the "interesting" ptys in
/dev/[pt]ty??. I suspect that a rogue program is opening old-style ptys
behind the pty subsystem's back, so that when ptyfs tries to open the
same pty, the open fails. So when it fails for pts/4, for example, what
does lsof say for /dev/{t,p}typ4?
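A rough sketch of that check, in case it helps to script it. The helper
names here are hypothetical, and the sketch assumes that pts/N corresponds
to /dev/{p,t}typN only for the first 16 ptys (N = 0..15, hex suffix), which
is all the pts/4 example above implies. It also assumes lsof is installed.

```python
def old_style_names(index):
    """Return (master, slave) old-style BSD device names for pty `index`.

    Assumption: only indices 0..15 are handled, following the
    pts/4 <-> /dev/{p,t}typ4 correspondence; the naming of higher
    indices is not covered by this sketch.
    """
    if not 0 <= index < 16:
        raise ValueError("only indices 0..15 are handled in this sketch")
    suffix = "p" + format(index, "x")  # 4 -> "p4", 10 -> "pa"
    return ("/dev/pty" + suffix, "/dev/tty" + suffix)


def lsof_command(index):
    """Build the lsof argument list that inspects both old-style nodes."""
    return ["lsof"] + list(old_style_names(index))


# Print the command to run by hand (as root) when pts/4 is failing.
print(" ".join(lsof_command(4)))
```

Any process that lsof lists for those two nodes while the open of
/dev/pts/4 is failing would be a candidate for the rogue program.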
christos