PR/57053 CVS commit: src/bin/sh

To: gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,Thomas Klausner <wiz%NetBSD.org@localhost>
Subject: PR/57053 CVS commit: src/bin/sh
From: "Robert Elz" <kre%netbsd.org@localhost>
Date: Sun, 30 Oct 2022 01:50:01 +0000 (UTC)

The following reply was made to PR bin/57053; it has been noted by GNATS.

From: "Robert Elz" <kre%netbsd.org@localhost>
To: gnats-bugs%gnats.NetBSD.org@localhost
Cc: 
Subject: PR/57053 CVS commit: src/bin/sh
Date: Sun, 30 Oct 2022 01:46:17 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Sun Oct 30 01:46:17 UTC 2022

 Modified Files:
 	src/bin/sh: jobs.c

 Log Message:
 PR bin/57053 is (peripherally) related to this change.

 sh has been remembering the process group of a job for a while now, but
 using that for almost nothing.

 The old way to resume a job was to try each pid in the job with a
 SIGCONT (using it as the process group identifier via killpg()) until
 one worked (or none did, in which case resuming would be impossible,
 but that never actually happened).   This wasn't as bad as it seems,
 as in practice the first process attempted was *always* the correct
 one.  Why the loop was considered necessary I am not sure.  Nothing
 but the first could possibly work.
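
 As a rough illustration only (a minimal sketch, not the code from
 jobs.c; the struct and function names are hypothetical), the old
 resume logic amounted to something like this:

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified job record: just the pids of its processes. */
struct job_sketch {
	size_t nprocs;
	pid_t  pids[8];
};

/*
 * Old approach: treat each pid in turn as if it were the process group
 * identifier, and stop at the first killpg() that succeeds.
 * Returns 0 if some killpg() worked, -1 if none did.
 */
int
resume_by_guessing(const struct job_sketch *jp)
{
	for (size_t i = 0; i < jp->nprocs; i++) {
		if (killpg(jp->pids[i], SIGCONT) == 0)
			return 0;
	}
	return -1;
}
```

 Only the pid of the process group leader (the first process created
 for the job) can succeed here, so any later iteration is wasted work.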

 This worked until a fix for an obscure possible bug was added a
 while ago.  Now a process which has already finished, and had its
 zombie collected via wait*(), is no longer ever considered to have
 a pid which is a candidate for use in any system call.  That's
 because the kernel might have reassigned that pid to some newly
 created process (we have no idea how much time might have passed
 since the pid was returned to the kernel for reuse; it might have
 happened weeks ago).

 This is where the example in bin/57053 revealed a problem.

 That PR is really about a quite different problem in zsh (from pkgsrc)
 and should be pkg/57053, but as the test case also hit the problem
 here, it was assumed (by some) they were the same issue.

 The example is (in a small directory)
 	ls | less
 which is then suspended (^Z), and resumed (fg).   Since the directory
 is small, ls will be finished, and reaped by sh - so the code would
 now refuse to use its pid for the killpg() call to send the SIGCONT.
 The (useless) loop would attempt to use less's pid for this purpose
 (it is still alive at this point) but that would fail, as that pid
 is not the process group identifier of anything.   Hence the job
 could not be resumed.
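
 The situation can be reproduced outside the shell.  The following is
 a hedged sketch, not code from sh: demo_lost_leader() is a
 hypothetical helper, and the two children simply pause() as stand-ins
 for ls and less.  It puts both children in one process group, reaps
 the leader, and then records what killpg() does with the survivor's
 pid and with the group identifier itself:

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 0 if the scenario was set up, -1 otherwise.  On success,
 * *by_pid_errno holds the errno from killpg(survivor pid) (0 if the
 * call unexpectedly succeeded) and *by_pgrp_result holds the return
 * value of killpg(process group id).
 */
int
demo_lost_leader(int *by_pid_errno, int *by_pgrp_result)
{
	pid_t leader, second;

	if ((leader = fork()) == -1)
		return -1;
	if (leader == 0) {		/* stand-in for "ls" */
		pause();
		_exit(0);
	}
	if ((second = fork()) == -1) {
		kill(leader, SIGKILL);
		waitpid(leader, NULL, 0);
		return -1;
	}
	if (second == 0) {		/* stand-in for "less" */
		pause();
		_exit(0);
	}

	/* One process group for the job, named after the first pid. */
	if (setpgid(leader, leader) == -1 || setpgid(second, leader) == -1) {
		kill(leader, SIGKILL);
		kill(second, SIGKILL);
		waitpid(leader, NULL, 0);
		waitpid(second, NULL, 0);
		return -1;
	}

	/* The first process finishes and is reaped. */
	kill(leader, SIGKILL);
	waitpid(leader, NULL, 0);

	/* The survivor's pid is not a process group identifier. */
	errno = 0;
	*by_pid_errno = (killpg(second, SIGCONT) == -1) ? errno : 0;

	/* The group itself still exists while it has a member. */
	*by_pgrp_result = killpg(leader, SIGCONT);

	kill(second, SIGKILL);
	waitpid(second, NULL, 0);
	return 0;
}
```

 The first killpg() is expected to fail with ESRCH, while the second
 should succeed, since the process group outlives its leader for as
 long as any member remains.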

 Before the PR (or preceding mailing list discussion) the change here
 had already been made (part of a much bigger set of changes, some of
 which might follow - sometime).   We now actually use the job's
 remembered process group identifier when we want the process group
 identifier, instead of trying to guess which pid it happens to be
 (which actually never took any guessing, it was, and is, always the
 pid of the first process created for the job).   A couple of minor
 fixes to how the pgrp is obtained, and used, accompany the changes
 to use it when appropriate.
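
 In outline, and again only as a sketch under assumptions (the struct,
 its field, and the function name are hypothetical, not the ones in
 jobs.c), resuming a job by its remembered process group looks like:

```c
#include <signal.h>
#include <sys/types.h>

/* Hypothetical job record that remembers its process group. */
struct job_with_pgrp {
	pid_t pgrp;	/* set when the job's first process is created */
};

/*
 * Send SIGCONT to the job's remembered process group.
 * Returns 0 on success, -1 on failure or if no usable group was
 * recorded.  The "<= 1" guard is an assumption of this sketch: POSIX
 * leaves killpg() undefined for such values.
 */
int
resume_by_pgrp(const struct job_with_pgrp *jp)
{
	if (jp->pgrp <= 1)
		return -1;
	return killpg(jp->pgrp, SIGCONT);
}
```

 No loop and no per-process pid is involved, so it does not matter
 which of the job's processes have already been reaped.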

 To generate a diff of this commit:
 cvs rdiff -u -r1.116 -r1.117 src/bin/sh/jobs.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
