Subject: Re: non-exec stack problems with multithreaded programs
To: Chuck Silvers <chuq@chuq.com>
From: Matthias Drochner <M.Drochner@fz-juelich.de>
List: port-i386
Date: 12/08/2003 16:48:58
chuq@chuq.com said:
> I'm a little fuzzy on the *DT stuff. all the other segment registers
> are set up using the GDT, why would CS be different? 

Currently, on exec(), segments from the LDT are used to initialize
the registers for the new process, see machdep.c:setregs().
For signal delivery/upcalls, GDT selectors are used (see machdep.c:
buildcontext()/cpu_upcall()). That's the inconsistency I was
referring to.
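
To make the GDT/LDT distinction concrete: an x86 segment selector
carries a table-indicator bit that says which descriptor table it
refers to. The following is a minimal sketch of that encoding; the
helper names (make_selector() etc.) are made up for illustration and
are not taken from machdep.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * x86 segment selector layout:
 *   bits 15..3  descriptor index
 *   bit  2      table indicator (0 = GDT, 1 = LDT)
 *   bits 1..0   requested privilege level (3 = user)
 * All names below are hypothetical helpers, not NetBSD macros.
 */
#define SEL_TI_LDT   0x4u
#define SEL_RPL_MASK 0x3u

static uint16_t make_selector(unsigned index, int in_ldt, unsigned rpl)
{
    assert(index < 8192 && rpl <= 3);
    return (uint16_t)((index << 3) | (in_ldt ? SEL_TI_LDT : 0u) | rpl);
}

/* Nonzero if the selector refers to the LDT rather than the GDT. */
static int selector_is_ldt(uint16_t sel)
{
    return (sel & SEL_TI_LDT) != 0;
}

/* Descriptor index within whichever table the selector names. */
static unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

/* Requested privilege level encoded in the selector. */
static unsigned selector_rpl(uint16_t sel)
{
    return sel & SEL_RPL_MASK;
}
```

So the same index yields two different selector values, depending on
whether exec() or the signal path picked the table.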

> > -pmap_exec_fixup() will never revoke anything, there is dead code
> I'm not sure what you mean by this, the current code will reset CS
> to the non-exec-stack version (GUCODE_SEL) if it can.

pmap_exec_fixup() is only called by the trap handler. That means it
only runs in the case where the code segment is too small for the
running process. In the opposite case (code segment too large) no
trap happens at all, so the revoking branch is never reached.
As I see it, revocation only happens through the ...account()
function after pmap modifications (and then only for the thread
executing it).
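
The asymmetry can be shown with a toy model. This is not the NetBSD
code; struct model_proc and both functions are invented here, and the
only assumption is that an instruction fetch faults exactly when its
address lies above the CS limit.

```c
#include <stdint.h>

/* Toy model, not kernel code: one user code segment limit and the
 * end of the highest mapping that is currently executable. */
struct model_proc {
    uint32_t cs_limit;      /* highest address CS allows fetching from */
    uint32_t highest_exec;  /* end of highest executable mapping */
};

/* An instruction fetch traps only if it lies above the CS limit. */
static int fetch_faults(const struct model_proc *p, uint32_t addr)
{
    return addr > p->cs_limit;
}

/*
 * Trap-time fixup: can only run after a fault, i.e. when the limit
 * was too small. Returns 1 if the limit was raised and the fetch
 * should be retried, 0 if the fault is genuine.
 */
static int model_exec_fixup(struct model_proc *p)
{
    if (p->cs_limit < p->highest_exec) {
        p->cs_limit = p->highest_exec;
        return 1;
    }
    return 0;
}
```

If highest_exec later drops below cs_limit, fetch_faults() stays 0 for
the stale range, so nothing ever calls the fixup to shrink the limit.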

> isn't the CS in the TSS used?

Afaict, no. There are some fields in the TSS used to save context
information, but this is always done manually. The interrupt
stack is taken from the TSS as defined by Intel; the rest is
arbitrary, more or less. (I might be missing some details here...)
(The doublefault exception handler and the IO permission bitmaps
are some exceptions.)
Anyway, even if the CS were reloaded from the TSS on each return
from userland, switching just the CS to enforce the stack permissions
would not be perfect. Agreed, it would mean that the CS eventually
gets updated on secondary processors too. But the
"big" CS is still easily available (the LDT and/or the GDT one);
it can come into play after revocation, either intentionally (just
loading the well-known segment index) or unintentionally (doing
some *longjmp() thing).

What I'm proposing is to use only LDT selectors for user segments.
Then have two LDTs -- one with the small CS, one with the large
one, both at the same index -- and switch LDTs in the TSS as
appropriate. LDTs are reloaded (manually) in pmap_activate(), so
secondary CPUs will pick up the new one.
Emulations might need some consideration (they might strictly
require a CS from the GDT), and the interaction with USER_LDT
might get a bit complex. Otherwise, it should be straightforward.
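
A rough sketch of the two-LDT idea, as a userland model only. All
names (model_ldt, MODEL_UCODE_SLOT, model_select_ldt() ...) are
hypothetical; the real change would live in the pmap and machdep
code and would load the LDT in pmap_activate().

```c
#include <stdint.h>

/* Toy model of the proposal, not kernel code. */
struct model_seg {
    uint32_t base;
    uint32_t limit;
};

enum { MODEL_LDT_SLOTS = 4, MODEL_UCODE_SLOT = 2 };

struct model_ldt {
    struct model_seg slot[MODEL_LDT_SLOTS];
};

struct model_pmap {
    const struct model_ldt *ldt;    /* LDT to load on activation */
};

/* Build both tables; they differ only in the limit of the code slot. */
static void model_init_ldts(struct model_ldt *small, struct model_ldt *large,
                            uint32_t small_limit, uint32_t large_limit)
{
    for (int i = 0; i < MODEL_LDT_SLOTS; i++) {
        small->slot[i].base = large->slot[i].base = 0;
        small->slot[i].limit = large->slot[i].limit = large_limit;
    }
    small->slot[MODEL_UCODE_SLOT].limit = small_limit;
}

/* Pick the table; the user-visible CS selector stays the same. */
static void model_select_ldt(struct model_pmap *pm,
                             const struct model_ldt *small,
                             const struct model_ldt *large,
                             int needs_large_cs)
{
    pm->ldt = needs_large_cs ? large : small;
}

/* CS limit a thread sees once the pmap's LDT has been activated. */
static uint32_t model_cs_limit(const struct model_pmap *pm)
{
    return pm->ldt->slot[MODEL_UCODE_SLOT].limit;
}
```

Because both tables use the same slot, a thread cannot get the large
CS back just by loading a well-known selector after revocation.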

best regards
Matthias