Subject: Re: Code on stack (Re: exploit with memcpy())
To: None <tech-userlevel@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-userlevel
Date: 07/05/2002 04:29:52
>> The compiler emits code to sync the I-cache after the trampoline is
>> spit out onto the stack.  We could [change the code to mprotect()].
> ISTM that someone's 'little trick' of generating an on-stack
> trampoline has got rather out of control!  The cost of the I-cache
> sync must surely overwhelm any instruction count benefit of the
> trampoline?

Trampolines aren't generated to gain an instruction count benefit.

As far as I can tell, stack trampolines are used for only two things at
present: (1) signal delivery and (2) gcc's nested function support.
(1) is being worked on; there's no particular reason that the signal
trampoline _has_ to be on the stack, and indeed it could be given a
dedicated page of its own that's RO and executable.  (2) is what the
comment you quoted was designed to address; compiler-generated code is
not responsible for generating signal delivery trampolines.  And for
(2) there's little choice.  As far as I can tell, there are only two
ways to support nested functions in a language with function pointers:
with stack trampolines and by fattening function pointers.  gcc chose
to use stack trampolines; AIUI this was for the sake of compatibility
with existing code on systems that don't use gcc exclusively.
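
To make the fat-pointer alternative concrete, here is a rough sketch in
plain C of what it amounts to: the pointer carries the environment
(static chain) alongside the code address, so nothing executable needs
to live on the stack.  All the names (fat_fn, outer_frame, add_base,
call_fat, outer) are made up for illustration; this is not gcc's ABI or
anything gcc actually emits.

```c
#include <stddef.h>

/* A "fat" function pointer: code address plus the environment
 * (static chain) that the code needs.  Hypothetical layout. */
struct fat_fn {
    int (*code)(void *env, int arg);
    void *env;
};

/* The part of the enclosing function's frame the nested function uses. */
struct outer_frame {
    int base;
};

/* What a nested function looks like once lifted out: the enclosing
 * frame arrives as an explicit first argument instead of being loaded
 * by a trampoline. */
static int add_base(void *env, int arg)
{
    struct outer_frame *f = env;
    return f->base + arg;
}

/* Any caller that receives a fat pointer has to know to pass env. */
static int call_fat(struct fat_fn fn, int arg)
{
    return fn.code(fn.env, arg);
}

/* Stands in for the enclosing function that "takes the address" of its
 * nested function and hands it to someone else. */
static int outer(int base, int arg)
{
    struct outer_frame frame = { base };
    struct fat_fn fn = { add_base, &frame };
    return call_fat(fn, arg);
}
```

With this, outer(10, 5) returns 15.  The compatibility problem is
visible in call_fat(): a callee compiled to expect an ordinary
one-word function pointer (qsort(), say) has no idea it should pass
env along, which is the problem the stack trampoline sidesteps.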

Which, actually, brings up a point: has anyone looked at making gcc use
fat function pointers instead?

> Since code is required in libc, it might as well be the stack tidy
> code.

Um, what code would be required in libc?

It does occur to me that with a little kernel support, the stack
trampoline for nested functions could be reworked to not require an
executable stack: if a process takes a no-execute trap due to trying to
execute in the stack segment, and the word pointed to is some magic
value which is an instruction otherwise unusable by userland, have the
kernel do what the trampoline would have done (load the static chain
and jump to the nested function).  It would mean a trap to the
kernel every time such a function was called, but such code would at
least basically work, and the stack could be completely non-executable.
And, it would mean comparatively minor changes to gcc.
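
Roughly, the trap-side logic would look something like the sketch
below.  This is a userland simulation only, not NetBSD kernel code; the
magic value, the descriptor layout, and every name in it (TRAMP_MAGIC,
tramp_desc, regs, handle_noexec_fault) are assumptions invented for the
example, and a real version would be machine-dependent and would have
to copy the words in from user space safely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical magic word; stands in for an instruction encoding that
 * userland could not otherwise execute. */
#define TRAMP_MAGIC 0xdeadc0deu

/* Hypothetical data the compiler would write on the stack in place of
 * real trampoline instructions. */
struct tramp_desc {
    uintptr_t magic;
    uintptr_t target;        /* address of the nested function's code */
    uintptr_t static_chain;  /* frame of the enclosing function */
};

/* Drastically simplified register state at the time of the trap. */
struct regs {
    uintptr_t pc;
    uintptr_t static_chain;
};

/* Called on a no-execute fault whose pc lies in the stack segment.
 * Returns true if the fault was a trampoline call and has been fixed
 * up; false if it should be delivered as an ordinary fault. */
static bool handle_noexec_fault(struct regs *r)
{
    /* In this simulation the faulting pc is a valid pointer we can
     * read directly; a kernel would have to fetch it from user space. */
    const struct tramp_desc *d = (const struct tramp_desc *)r->pc;

    if (d->magic != TRAMP_MAGIC)
        return false;        /* genuine attempt to execute stack data */

    /* Do what the trampoline code would have done. */
    r->static_chain = d->static_chain;
    r->pc = d->target;
    return true;
}
```

The stack then holds only data, and anything that doesn't start with
the magic word still faults as before.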

Not that I'd want to use it, mind you; I'd personally rather have an
executable stack than bad performance from nested functions.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B