Subject: Re: SA_SIGINFO notes
To: None <M.Drochner@fz-juelich.de>
From: Jason Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 10/06/2003 21:54:30
On Monday, October 6, 2003, at 02:55  AM, Matthias Drochner wrote:

>
> pk@cs.few.eur.nl said:
>> 2. In kern_sig.c:kpsendsig(), the `ksi_trap' is used to decide which
>>    arguments to pass to sa_upcall(). I don't what it is used for,
>>    but blindly using `ksi_trap' without further inspection of the
>>    siginfo context seems like a bad idea.
>
> Yes, I had this impression too a while ago. I'm running a fixed system
> (patch appended) for weeks now without problems. (I believe I've sent 
> the
> patch somewhere for review, but it got lost somehow.)
> I didn't see problems with asynchronous signals initially, but also
> for traps there is the possibility that the md T_XXX is zero
> (eg T_PRIVINFLT on i386). I'll append a test program which tries
> this. On an unpatched system, the libpthread signal delivery code gets
> confused by it -- the signal gets delivered twice before the handler
> is called, as one can see with ktrace.

I don't think this patch is sufficient, either.

Even ksi_code has to be interpreted in the context of which signal has 
been generated.  In the old code, "code" being non-zero meant that it 
was a signal caused by a trap.  In the new code, we don't have an easy 
way to determine this (note the new ksi_code is not the same as the old 
"code" .. the old "code" is more like ksi_trap, which is presumably why 
Christos wrote the new code the way he did, but it has the problems 
that Paul points out, since it shares storage with other siginfo 
fields).

Maybe what we need here is a flag in the ksiginfo_t that says "hey, 
this is the result of a trap".  Otherwise, we'd have to make a decision 
based on the *signal*, which isn't even right, since you could e.g. 
pass SIGSEGV to kill(2).

There's another use of ksi_code that I think is sketchy.  Take a look 
at kern_ktrace.c:ktrpsig().  It uses ksi_code > 0 to determine if 
ksi_trap is valid, but it has the same problem I mention above.  (We 
need a new ktrace record type that spits out siginfo.)

Actually, a great example of how using ksi_code in this way falls over 
is the case of a SIGCHLD being generated because a child exits.  
ksi_code will be set to CLD_EXITED (1), which will cause ktrpsig() to 
use ksi_trap, but ksi_trap isn't actually valid, and would overlap with 
the uid of the exiting process, not the pid.  In any case, 
historically, the "code" seen in the ktrace record would be 0 in this 
case, since SIGCHLD isn't generated as the result of a trap.  (I think 
part of the problem is that we're getting hung up on similar names for 
things that have very different semantics.)  Matthias's patch would 
have a similar problem.

         -- Jason R. Thorpe <thorpej@wasabisystems.com>