NetBSD-Users archive


Re: cannot start detached sessions (with -m -d) back to back



On Thu, Dec 30, 2021 at 06:55:24 +0300, Valery Ushakov wrote:

> Building screen with debugging shows that a successful session start
> has for the first read from the window:
> 
>      + hit ev fd 5 type 1!
>     going to read from window fd 5
>      -> 5 bytes
> 
> but failed attempt has
> 
>      + hit ev fd 5 type 1!
>     going to read from window fd 5
>     Window 0: EOF - killing window
> 
> where fd 5 is obtained from the cloning pty device /dev/ptmx (ptm(4))
> 
> The comment in ptcread says:
> 
> 	/*
> 	 * We want to block until the slave
> 	 * is open, and there's something to read;
> 	 * but if we lost the slave or we're NBIO,
> 	 * then return the appropriate error instead.
> 	 */
> ...
> 		if (!ISSET(tp->t_state, TS_CARR_ON)) {
> 			error = 0;   /* EOF */
> 			goto out;
> 		}

I think screen is racing against the child process in MakeWindow.
From a quick look it seems that the parent opens the master side and
saves the slave name.  Then it calls ForkWindow and adds the master fd
to the list of descriptors to poll.  Now the race is on, because it
takes some time for the child to open the slave side, and if the
parent wins the race, it will get EOF from the master, as the slave is
not open yet.  E.g. a failed attempt:

$ kdump | sed -n -e '/select/p' -e '/EOF/p' \
                 -e '/\/dev\/pts/{' -e 'N' -e '/open/p' -e '}'
  4831   4831 screen   CALL  __select50(0x100,0x7f7fff3739f0,0x7f7fff373a10,0,0)
  4831   4831 screen   RET   __select50 1
       "Window 0: EOF (errno 0) - killing window\n"
  5218   5218 screen   NAMI  "/dev/pts/7"
  5218   5218 screen   RET   open 0


vs. a successful one:


 19165  19165 screen   CALL  __select50(0x100,0x7f7fff30a910,0x7f7fff30a930,0,0)
 26839  26839 screen   NAMI  "/dev/pts/7"
 26839  26839 screen   RET   open 0
 19165  19165 screen   RET   __select50 1
       "serv_select_fn called\n"
 19165  19165 screen   CALL  __select50(0x100,0x7f7fff30a910,0x7f7fff30a930,0,0)
 19165  19165 screen   RET   __select50 1


Using brute force to make screen pre-open the slave in the parent
process should take care of the race:

--- pty.c~	2021-12-29 23:50:37.231129335 +0300
+++ pty.c	2021-12-31 03:11:48.652558852 +0300
@@ -288,6 +288,7 @@ char **ttyn;
     }
   initmaster(f);
   *ttyn = TtyName;
+  pty_preopen = 1;		/* XXX: uwe */
   return f;
 }
 #endif


This is the HAVE_SVR4_PTYS version of OpenPTY (yes, screen still uses
K&R function definitions :)
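
For illustration only, here is the pre-open idea sketched outside of
screen.  This is not screen's code: open_pty_preopened is a
hypothetical helper, and it assumes the POSIX posix_openpt / grantpt /
unlockpt / ptsname interface rather than opening /dev/ptmx directly.
The point is just that the parent holds an open slave descriptor
before it forks or starts polling the master, so a read from the
master cannot see "slave not open yet".

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600	/* for posix_openpt() and friends */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of screen: open a pty master and
 * immediately open the matching slave in the same (parent) process.
 * Returns the master fd and stores the slave fd in *slave_fd,
 * or returns -1 on error.
 */
int
open_pty_preopened(int *slave_fd)
{
	const char *name;
	int master, slave;

	master = posix_openpt(O_RDWR | O_NOCTTY);
	if (master == -1)
		return -1;

	if (grantpt(master) == -1 || unlockpt(master) == -1) {
		close(master);
		return -1;
	}

	name = ptsname(master);
	if (name == NULL) {
		close(master);
		return -1;
	}

	/*
	 * Pre-open the slave here, before any fork(), so the slave
	 * side is already open by the time the master is polled.
	 */
	slave = open(name, O_RDWR | O_NOCTTY);
	if (slave == -1) {
		close(master);
		return -1;
	}

	*slave_fd = slave;
	return master;
}
```

A caller would fork after this returns, let the child open the slave
by name as before, and have the parent close its own slave descriptor
only once the child has the slave open.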

-uwe

