NetBSD-Bugs archive


Re: standards/51603: WIFCONTINUED()==true always implies WIFSTOPPED()==true in the current implementation



The following reply was made to PR standards/51603; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: standards/51603: WIFCONTINUED()==true always implies WIFSTOPPED()==true in the current implementation
Date: Sun, 06 Nov 2016 06:52:35 +0700

     Date:        Sat,  5 Nov 2016 22:30:01 +0000 (UTC)
     From:        David Holland <dholland-bugs%netbsd.org@localhost>
     Message-ID:  <20161105223001.6675D7A285%mollari.NetBSD.org@localhost>
 
   |  It seems that when WIFCONTINUED was added it was meant to be indicated
   |  by having the whole value equal to 0xffff. In the above encoding this
   |  value indicates STOPPED on signal 255 with a core dump. This isn't
   |  what we want.
 
 No, but it is also what linux does, and if handled correctly (the values
 are only supposed to be tested using the WIFxxx() macros, not by manually
 decoding the bits) it is possible to make it work that way.
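
 As a minimal sketch of what "tested using the WIFxxx() macros" can look
 like in an application: the enum and function names below are
 hypothetical, chosen only for illustration. WIFCONTINUED() is tested
 first, on the assumption that a continued status might also satisfy
 WIFSTOPPED() on an affected system.

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical names, for illustration only. */
enum wait_kind { WK_EXITED, WK_SIGNALED, WK_STOPPED, WK_CONTINUED, WK_UNKNOWN };

/*
 * Classify a status returned by waitpid() using only the WIFxxx() macros,
 * never by looking at the bits directly.  WIFCONTINUED() is checked first
 * so the answer stays right even where a continued status also satisfies
 * WIFSTOPPED().
 */
static enum wait_kind
classify_status(int status)
{
#ifdef WIFCONTINUED
	if (WIFCONTINUED(status))
		return WK_CONTINUED;
#endif
	if (WIFEXITED(status))
		return WK_EXITED;
	if (WIFSIGNALED(status))
		return WK_SIGNALED;
	if (WIFSTOPPED(status))
		return WK_STOPPED;
	return WK_UNKNOWN;
}
```

 Code written this way does not care which bit pattern the kernel picks
 for "continued", as long as the macros it was compiled with agree with
 that kernel.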
 
   |  ISTM that the best way to fix this is to add this rule after the rule
   |  for STOPPED:
   |  
   |    - if A is 0x7e, it means CONTINUED.
 
 That was my original suggestion (Kamil, Christos, and I have been discussing
 this problem for the past few days - off list).
 
   |  Naive code will then interpret the new WIFSTOPPED as SIGNALED with
   |  signal 126 but naive code that didn't ask for them shouldn't be
   |  getting WIFSTOPPED notices. With this change one won't get the wrong
   |  answer by testing WIFSIGNALED before WIFSTOPPED, and that seems like
   |  the important part.
 
 s/WIFSTOPPED/WIFCONTINUED/ in that paragraph for it to make any sense...
 
 And yes, this way would work.    Unfortunately, it also breaks existing
 code (apps) that have been compiled on -current in the past 6 months, and
 which use WCONTINUED and WIFCONTINUED, as their tests for WIFCONTINUED
 would never succeed after the fix.
 
 I have no idea how much code there is like that - (shells/bash) is one.
 
 There is no code in the NetBSD source tree that tests WIFCONTINUED()
 so the problem applies only to external code (pkgsrc and other.) 
 
 There is, interestingly, some in-tree NetBSD code that sets WCONTINUED on
 a wait call - I think based upon the "if the flag exists, we should use it".
 The one example I am thinking of actually wants
 	waitpid(pid, &status, 0);
 but does
 	waitpid(pid, &status, WUNTRACED|WCONTINUED);
 and then proceeds to ignore any processes that are returned that are
 not WIFEXITED() or WIFSIGNALED().   Bizarre.   [Aside: this is
 externally imported code, not written by the NetBSD project; there is
 no project code at all that references the CONTINUED stuff,
 if we ignore the ATF tests which explicitly test it, and the kernel which
 sets it, of course.]
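
 For code that only cares about termination, a sketch of the simpler
 call might look like this. The function name is hypothetical, and the
 EINTR retry loop is an addition of this sketch, not something taken
 from the imported code mentioned above.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Block until the child `pid` has terminated.  With options == 0 an
 * untraced child is never reported as stopped or continued, so the
 * status can only be WIFEXITED() or WIFSIGNALED().
 * Returns 0 on success, -1 on error (errno as set by waitpid()).
 */
static int
wait_for_termination(pid_t pid, int *status)
{
	pid_t r;

	do {
		r = waitpid(pid, status, 0);
	} while (r == -1 && errno == EINTR);

	return r == pid ? 0 : -1;
}
```

 Passing WUNTRACED|WCONTINUED and then discarding everything that is
 neither exited nor signalled gets to the same place, just with extra
 wakeups along the way.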
 
 
   |  However, if that's not good enough we need to use more bits in the
   |  upper half of the word, because every possible combination of the
   |  lower bits already means something.
 
 Actually, I think 0x80 doesn't - exited normally with a core dump is
 a fairly insane combination, and 0xFF doesn't either, stopped with a
 core dump is just as meaningless (we could change the test for WIFSTOPPED
 to check that the low 8 bits are 0177 rather than just the low 7 bits as
 has been traditionally done.)
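
 To make the 7 bit versus 8 bit difference concrete, here is a small
 model of the encoding as described in this thread. The M_ names are
 hypothetical and are not the real <sys/wait.h> macros; the stopped
 status layout (signal << 8 | 0177) is the traditional one.

```c
/* Model values only; not the system's own definitions. */
#define M_STOPPED	0177	/* low bits all set means "stopped" */
#define M_CONTINUED	0xffff	/* whole value used for "continued" */

/* Traditional test: compare only the low 7 bits. */
static int
m_wifstopped_7bit(int s)
{
	return (s & 0177) == M_STOPPED;
}

/* Suggested stricter test: compare the low 8 bits with 0177. */
static int
m_wifstopped_8bit(int s)
{
	return (s & 0377) == M_STOPPED;
}

static int
m_wifcontinued(int s)
{
	return s == M_CONTINUED;
}
```

 With the 7 bit test 0xffff looks stopped; with the 8 bit test it does
 not, while an ordinary stop status such as (17 << 8 | 0177) passes both.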
 
   |  A second way is to make another field D that's the next 4 or 8 bits,
   |  and use that to indicate what happened and then also set the other
   |  fields according to the traditional encoding, for compatibility.
 
 Anything like this also needs to consider what happens with waitid()
 which has a 32 bit exit code.
 
   |  Because POSIX insists that the exit status be truncated to 8 bits,
 
 except for waitid()  (that is actually a bit less clear, but there are
 significant arguments in the posix community that that is what was intended
 when the exit status field was made 32 bits in the struct that waitid()
 returns.)
 
 It is possible to make a fix for this by just altering the WIFxxx() macros
 (WIFSTOPPED can just add "&& !WIFCONTINUED(...)" for example, which
 guarantees that they can't both be true together).   That satisfies posix.
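
 A sketch of that macro-only approach, written as a model with
 hypothetical M2_ names rather than as the actual <sys/wait.h>
 definitions (the real macros have their own internal helpers, which
 are not reproduced here):

```c
/* Model values only; the continued value stays 0xffff. */
#define M2_WSTOPPED	0177
#define M2_WCONTINUED	0xffff

#define M2_WIFCONTINUED(s)	((s) == M2_WCONTINUED)

/*
 * Traditional low-7-bit test, plus the extra "&& !continued" clause so
 * that stopped and continued can never both be true.
 * Note: the argument is evaluated twice in this simple form.
 */
#define M2_WIFSTOPPED(s) \
	((((s) & 0177) == M2_WSTOPPED) && !M2_WIFCONTINUED(s))
```

 Since the value the kernel returns is unchanged, a binary built with
 the older macros still sees WIFCONTINUED() succeed exactly as before.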
 
 It also has the advantage that existing compiled code keeps on working
 as well as it now does, which any change to the _WCONTINUED value does not.
 We have had zero bug reports about code not working, and linux
 also uses 0xFFFF as the _WCONTINUED value (FreeBSD does not; their
 WIFCONTINUED() is "stopped with a SIGCONT", and that case is excluded from
 WIFSIGNALED()).
 
 kre
 

