Subject: standards/28493: tcgetpgrp/TIOCGPGRP shouldn't use -1 for "an invalid process ID"
List: netbsd-bugs
Date: 12/01/2004 04:18:01
>Number:         28493
>Category:       standards
>Synopsis:       tcgetpgrp/TIOCGPGRP shouldn't use -1 for "an invalid process ID"
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    standards-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 01 04:18:00 +0000 2004
>Originator:     Jed Davis
>Release:        NetBSD 2.0_RC4
System: NetBSD 2.0_RC4 NetBSD 2.0_RC4 (PANIX-USER) #0: Sat Nov 6 19:39:39 EST 2004 i386
Architecture: i386
Machine: i386
>Description:

Revision 1.162 of src/sys/sys/proc.h, in removing the fixed process
limit, changed the value of NO_PID from being one more than PID_MAX
to being -1 (and also renamed it NO_PGID).  This is used in the
implementation of tcgetpgrp/TIOCGPGRP, in the case that the terminal's
foreground process group no longer exists; e.g., after the last process
in it has exited.

However, SUSv3/POSIX-2004/etc. (I don't have an earlier version on hand)
says that "[i]f there is no foreground process group, tcgetpgrp() shall
return a value greater than 1 that does not match the process group ID
of any existing process group."  And there exists at least one piece of
software (zsh; see below) that misbehaves when the "greater than 1" part
of that is violated.

Standards notwithstanding, the return value of -1 is of course also
used for errors; distinguishing this from that requires clearing errno
before the call and checking it afterwards.
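
To illustrate the errno workaround, here is a minimal sketch of what a
caller would have to do while -1 carries both meanings.  The enum and
the function name get_fg_pgrp are hypothetical, made up for this
example; only tcgetpgrp() and errno are real interfaces.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
enum fg_status { FG_OK, FG_NONE, FG_ERROR };

/*
 * Query the foreground process group of the terminal on fd.
 * Assumes the behaviour described in this report: a return of -1
 * means either "error" (errno set) or "no foreground process group"
 * (errno left untouched).
 */
static enum fg_status
get_fg_pgrp(int fd, pid_t *out)
{
	pid_t pgrp;

	errno = 0;		/* clear so a stale value is not misread */
	pgrp = tcgetpgrp(fd);
	if (pgrp == (pid_t)-1)
		return errno != 0 ? FG_ERROR : FG_NONE;
	*out = pgrp;
	return FG_OK;
}
```

With a conforming sentinel (a value greater than 1), the errno dance
would be unnecessary: -1 would always mean failure.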

>How-To-Repeat:
Or, rather, how I came to notice this: zsh assumes the correct behavior
when handling dead children.  The case that causes a problem is where an
external command is followed by a (non-redirected) use of the read
builtin: zsh doesn't reset the foreground process group when the command
exits, then tries to read from the tty with SIGTTIN ignored, and gets
EIO; the read builtin thus returns failure instead of accepting input.

The shell does, however, reset the pgrp before prompting for commands,
and job control is disabled by default in noninteractive shells (e.g.
scripts), so this is not an easy failure mode to trip over.

For example: "while read n; do grep $n /etc/passwd; done" accepts one
line of input and then stops.  More minimally, "/bin/test x; read n".

>Fix:
Change NO_PGID to something positive, and ensure that it can't be used
for an actual pid/pgid.  INT32_MAX?
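
A rough sketch of the idea, under the assumption that pid_t is a
32-bit signed integer.  The SKETCH_ names and the helper are
hypothetical and are not the definitions in src/sys/sys/proc.h; they
only show a positive sentinel that the allocator can never hand out.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical sentinel: positive, greater than 1, never a real ID. */
#define SKETCH_NO_PGID	((pid_t)INT32_MAX)

/*
 * Hypothetical allocator-side check: a candidate pid/pgid is usable
 * only if it is positive and strictly below the sentinel.
 */
static int
sketch_pid_is_allocatable(pid_t pid)
{
	return pid > 0 && pid < SKETCH_NO_PGID;
}
```

Any real change would also need to confirm that nothing else in the
tree depends on NO_PGID being negative.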