tech-kern: Re: Return addresses from trap signals

Subject: Re: Return addresses from trap signals
To: None <bjh21@netbsd.org, tech-kern@netbsd.org>
From: Ross Harvey <ross@ghs.com>
List: tech-kern
Date: 03/05/2001 17:18:37
> From tech-kern-owner@netbsd.org Mon Mar  5 16:35:04 2001
> Date: Tue, 6 Mar 2001 00:33:21 +0000 (GMT)
> From: Ben Harris <bjh21@netbsd.org>
> To: tech-kern@netbsd.org
> Subject: Return addresses from trap signals
>
> What should happen if a handler for a trap signal (SIGILL, SIGSYS,
> SIGSEGV, SIGTRAP) returns, or if the signal's ignored?  Should the failed
> instruction be retried, or should the machine skip on to the next
> instruction?  Does this depend on which signal it is?  I'm trying to work
> out what the right value to put in trapframes on ARM systems is, and
> following what a sigcontext will need might be as good an approach as any.
>
> -- 
> Ben Harris                                                   <bjh21@netbsd.org>
> Portmaster, NetBSD/arm26               <URL:http://www.netbsd.org/Ports/arm26/>
>
>

If the handler neither fixed the problem nor longjmp(3)ed over it, it is
acceptable for you to let the program continue in an infinite loop of
handler/retry/handler/retry. You don't have to make broken programs work,
and you can't anyway.
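
To make the "longjmp over it" option concrete, here is a minimal user-space
sketch, not anything from the NetBSD tree. The names probe_read, probe_env
and on_fault are made up for illustration, and it assumes a POSIX system
where a bad read raises SIGSEGV or SIGBUS. The handler never returns, so the
faulting load is never retried and there is no handler/retry loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names, chosen for this sketch only. */
static sigjmp_buf probe_env;

static void on_fault(int sig)
{
    (void)sig;
    /* Leave the handler without returning to the faulting instruction. */
    siglongjmp(probe_env, 1);
}

/* Returns 1 if one byte at addr could be read, 0 if the read faulted. */
int probe_read(const volatile char *addr)
{
    struct sigaction sa, old_segv, old_bus;
    int ok;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* savemask = 1, so siglongjmp also restores the signal mask. */
    if (sigsetjmp(probe_env, 1) == 0) {
        char c = *addr;         /* may fault */
        (void)c;
        ok = 1;
    } else {
        ok = 0;                 /* arrived here via siglongjmp */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}
```

Calling probe_read on the address of an ordinary variable should return 1,
and calling it on an unmapped address should return 0 instead of spinning.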

You certainly want to provide debuggers and fault handlers with accurate
information, so it's generally wise to back up to the faulting instruction
in the frame. This is especially true if the instruction is in fact
restartable; perhaps the fault handler will actually fix the problem.
(Imagine a read-only mmap region that gets remapped writable, with more
expensive locking, the first time it is written.)
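
The restartable case can be sketched from user space roughly as follows.
This is an illustration under assumptions, not a kernel implementation: the
names guarded_page, fix_and_retry and write_through_fault are hypothetical,
the page is an anonymous mapping rather than a file, and calling mprotect(2)
from a signal handler is common practice but not on the POSIX
async-signal-safe list. The point is that the handler simply returns, and
the store succeeds only because the saved PC still points at the faulting
instruction.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names, chosen for this sketch only. */
static volatile sig_atomic_t fault_count;
static char *guarded_page;
static size_t guarded_len;

static void fix_and_retry(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)guarded_page;

    (void)sig;
    (void)ctx;
    if (addr < base || addr >= base + guarded_len)
        _exit(1);               /* not our page: do not loop forever */
    fault_count++;
    /* Fix the cause; returning then restarts the faulting store. */
    if (mprotect(guarded_page, guarded_len, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
}

/* Returns the number of faults taken by one store, or -1 on error. */
int write_through_fault(void)
{
    struct sigaction sa, old_segv, old_bus;
    int faults, stored;

    guarded_len = (size_t)sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, guarded_len, PROT_READ,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fix_and_retry;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);   /* some systems report SIGBUS */

    fault_count = 0;
    *(volatile char *)guarded_page = 'x';       /* faults, then is retried */

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);

    faults = (int)fault_count;
    stored = (*(volatile char *)guarded_page == 'x');
    munmap(guarded_page, guarded_len);
    return stored ? faults : -1;
}
```

On a system that backs the PC up to the faulting store, write_through_fault
is expected to report exactly one fault. If the frame instead pointed past
the store, the byte would never be written.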

Now, if registers or memory that could be source ops have been modified,
and the instruction has "completed", but simply raised a signal, then it's
wrong to back up the PC unless you care to undo its effects.

Divide your cases into

	faults:		the instruction did not complete and did not
			modify any source operands, so it is restartable.
			At least in theory, this kind of event is
			reported on the boundary before the faulting
			instruction.

	traps:		the instruction finished but raised a signal
			like overflow. At least in theory, this kind of
			event is reported on the boundary after the
			instruction.

	aborts:		you may not know exactly where the bad things
			happened and restart is not possible

So, for things that are faults, the PC should already be at the boundary
before the instruction, but if it isn't you need to fix it.  For traps, It
Depends.  For aborts, it doesn't matter.
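
As a rough sketch of that rule, the choice of resume PC might be written
like this. Everything here is hypothetical (the enum, the struct, and the
field names are not taken from any real trapframe), and the trap case
hard-codes the "instruction completed, don't back up" choice, which is only
one of the possibilities covered by "It Depends".

```c
#include <stdint.h>

/* Hypothetical classification, for illustration only. */
enum event_class { EV_FAULT, EV_TRAP, EV_ABORT };

struct trap_event {
    enum event_class cls;
    uintptr_t reported_pc;      /* PC as handed over by the hardware */
    uintptr_t insn_pc;          /* address of the offending instruction */
    unsigned insn_len;          /* its length in bytes */
};

/* Pick the PC to store in the frame that the signal handler will see. */
uintptr_t resume_pc(const struct trap_event *ev)
{
    switch (ev->cls) {
    case EV_FAULT:
        /* Restartable: point at the instruction itself, fixing the
         * PC up if the hardware reported something else. */
        return ev->insn_pc;
    case EV_TRAP:
        /* Assumed here: the instruction completed and its effects
         * stand, so resume on the boundary after it. */
        return ev->insn_pc + ev->insn_len;
    case EV_ABORT:
    default:
        /* Restart is not possible; just pass on what was reported. */
        return ev->reported_pc;
    }
}
```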

Some instructions like breakpoint are occasionally traps in HW, but they
modify nothing, so it's actually your choice. In this case, if you (or the
HW) haven't already backed up the PC, then the debugger has to, so it will
save you time and effort to just do what the debugger is already expecting
you to do.

Believe it or not, these are the easy cases.  There are vastly uglier ones
out there. Besides the earlier m68k barf-up-all-internal-state (gross beyond
belief) traps, Motorola also gave the world the 88k, where magic registers
told the trap handler what it needed to know to finish the trapped
instructions and store their results to memory. (What _do_ you tell a user
mode handler?) And then there was the Intel 860, where the trap handler
got to clean out work in progress inside the floating point pipeline, but
could only do so by putting more stuff into the pipeline, and then it needed
to reconstruct a compatible state on return. And then there was the 860's
trick of sometimes being in a mode where it executed paired instructions,
but it needed to take a running start at that mode... Where exactly did it
stop?

Alpha, now, throws its hands up on certain floating point ops, and traps
*after a few more instructions* to a handler that needs to scan backwards
to find an instruction that could have produced such an abort and then
execute it to IEEE standards in SW. There are rules for the compiler as to
what kind of floating point ops it can string together without barriers
when it wants IEEE conformance. That's what I'm working on now, with what's
left of my mind after those other CPUs.

	--Ross