Subject: changes for hppa pthreads
To: None <firstname.lastname@example.org>
From: Chuck Silvers <email@example.com>
Date: 07/14/2004 09:26:14
we need a few changes to the MI libpthread and kernel code to allow for
a number of unique properties of the PA hardware and ABI:
(1) function pointers are PLABELs
under this convention (as on AIX), C function pointers are indirect.
the actual value put into a register is a pointer to 2 values:
the real code address and a pointer to per-shared-object global data.
this shows up in pthread__resolve_locks(), where we are comparing
a saved PC value to a particular function pointer. the current code
does this with an integer comparison, but for PA we need it to be
a comparison of function pointers. gcc emits calls to a millicode
function called "__canonicalize_funcptr_for_compare" which deals with
either of the function pointers being PLABELs.
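
a minimal sketch of the two styles of comparison, in plain C. the names
ex_ras_start, ex_pc_is_integer_equal and ex_pc_is_funcptr_equal are
hypothetical and are not the libpthread code. on most platforms both forms
give the same answer; the point is only that the second form is a
comparison of function pointers, which is where the compiler gets the
chance to canonicalize the operands on PA.

```c
#include <stdint.h>

typedef void (*ex_func_t)(void);

/* hypothetical stand-in for the function whose address is compared */
static void ex_ras_start(void)
{
}

/* integer comparison: compares the raw bits of the two values */
static int ex_pc_is_integer_equal(uintptr_t saved_pc, ex_func_t fn)
{
	return saved_pc == (uintptr_t)fn;
}

/* function pointer comparison: both operands have function pointer type */
static int ex_pc_is_funcptr_equal(uintptr_t saved_pc, ex_func_t fn)
{
	return (ex_func_t)saved_pc == fn;
}
```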
(2) stack grows up
on all other platforms the stack grows toward smaller addresses, but
on the PA the stack grows toward larger addresses. I added a hook
to expose the kernel STACK_* macros to userland so I could use these
in the libpthread functions that muck with stacks.
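
to illustrate what direction-independent stack code has to abstract, here
is a small sketch. it does not use the kernel STACK_* macros; the ex_*
names and the run-time direction argument are assumptions made so the
example is self-contained (the real macros select the direction at
compile time).

```c
#include <stddef.h>
#include <stdint.h>

enum ex_stack_dir { EX_STACK_DOWN, EX_STACK_UP };

/* first usable stack pointer for a region [base, base + size) */
static uintptr_t ex_stack_start(uintptr_t base, size_t size,
    enum ex_stack_dir dir)
{
	return dir == EX_STACK_UP ? base : base + size;
}

/* move sp by n bytes in the direction of stack growth */
static uintptr_t ex_stack_grow(uintptr_t sp, size_t n, enum ex_stack_dir dir)
{
	return dir == EX_STACK_UP ? sp + n : sp - n;
}

/* bytes in use between the start of the stack and sp */
static size_t ex_stack_used(uintptr_t base, size_t size, uintptr_t sp,
    enum ex_stack_dir dir)
{
	uintptr_t start = ex_stack_start(base, size, dir);

	return dir == EX_STACK_UP ? (size_t)(sp - start) : (size_t)(start - sp);
}
```

code that hardcodes "base + size" as the initial stack pointer, or that
subtracts to allocate, is the kind of thing that breaks when the stack
grows up.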
(3) spinlock values are reversed (0 is locked, non-zero is unlocked)
the atomic instruction on the PA puts a 0 into a memory address
while returning the old value. this means that an uninitialized
(zero-filled) spinlock is held instead of free, so we must be sure that
*all* locks
are initialized before being used (there were several global locks
in libpthread that were not initialized).
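
a sketch of why the reversed sense matters. ex_load_and_clear below is a
plain, non-atomic C model of the hardware operation, and all ex_* names
are hypothetical; this is not the __cpu_simple_lock code.

```c
#define EX_LOCK_LOCKED   0	/* what the load-and-clear operation leaves */
#define EX_LOCK_UNLOCKED 1	/* any non-zero value means "free" */

typedef volatile int ex_lock_t;

/* non-atomic model: store 0 and return the previous value */
static int ex_load_and_clear(ex_lock_t *p)
{
	int old = *p;

	*p = 0;
	return old;
}

/* locks must be explicitly set to the non-zero "free" value */
static void ex_lock_init(ex_lock_t *l)
{
	*l = EX_LOCK_UNLOCKED;
}

/* returns 1 if the lock was acquired, 0 if it was already held */
static int ex_lock_try(ex_lock_t *l)
{
	return ex_load_and_clear(l) != EX_LOCK_LOCKED;
}

static void ex_lock_release(ex_lock_t *l)
{
	*l = EX_LOCK_UNLOCKED;
}
```

with this model, a lock that is simply zero-filled (for example a global
with no initializer) can never be acquired until ex_lock_init() has been
called on it.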
(4) spinlocks must be 16-byte aligned
the atomic instruction on the PA additionally only works when the
address it's used on is on a 16-byte boundary. I haven't done
anything about this yet, since RAS is sufficient for the moment.
the current definition of __cpu_simple_lock_t for PA tries to
use the GCC "aligned" attribute in a typedef, but that doesn't
actually have any effect. to fix this, we'll eventually need to
define this as an array of 4 ints (or a structure containing such
an array) and then just use the element that's on the 16-byte boundary.
I was thinking this would require adding some more macros like
and using those instead of having assumptions in MI code that
__cpu_simple_lock_t is an integral type. but that'll come later.
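
a rough sketch of the "array of 4 ints" idea, assuming 4-byte ints and
that the structure itself is at least 4-byte aligned. ex_padded_lock_t
and ex_lock_word are hypothetical names, not a proposed definition of
__cpu_simple_lock_t.

```c
#include <stdint.h>

/* 16 bytes of storage: exactly one word starts on a 16-byte boundary */
typedef struct {
	volatile int ex_words[4];
} ex_padded_lock_t;

/* return the word inside the lock that sits on a 16-byte boundary */
static volatile int *ex_lock_word(ex_padded_lock_t *l)
{
	uintptr_t addr = (uintptr_t)&l->ex_words[0];
	uintptr_t aligned = (addr + 15) & ~(uintptr_t)15;

	return (volatile int *)aligned;
}
```

any accessor macros would then go through something like ex_lock_word()
instead of treating the lock as a single integer.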
I've put the changes in ftp://ftp.netbsd.org/pub/NetBSD/misc/chs/hp700/ :
extract the tar file and apply both patches and it should build.
the diffs still contain a little debug code (which will be removed)
and some other fixes for floating-point on PA-7300LC CPUs, which will
be split out and committed separately.
one thing I'd like to get more input on is what should go into
struct mcontext. right now I've got it as:
31 general registers (r0 is always 0)
32 floating-point registers
PSW (process status word)
SAR (shift amount register, aka cr11)
pcsqh, pcsqt, pcoqh, pcoqt (PC regs: space and offset, current and next)
sr0 to sr4
cr26 and cr27 (cr27 is the ABI's thread-local-storage register,
cr26 is also visible from user code but I don't know
if there's any convention for its use)
currently we ignore attempts to change the space registers (which works for
now since we only give each process permission to access its one space anyway),
but it seems good to put these in here anyway in case we want to support
multiple spaces per process someday.
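
for discussion, the list above could be written down roughly as follows.
this is only a sketch: the type and field names are made up, and the
widths (32-bit general, space and control registers, 64-bit
floating-point registers) are my assumption rather than something taken
from the diffs.

```c
#include <stdint.h>

typedef struct {
	uint32_t ex_gr[31];	/* r1 to r31; r0 is always 0, so not saved */
	uint64_t ex_fr[32];	/* floating-point registers */
	uint32_t ex_psw;	/* processor status word */
	uint32_t ex_sar;	/* shift amount register (cr11) */
	uint32_t ex_pcsqh;	/* PC space, current instruction */
	uint32_t ex_pcsqt;	/* PC space, next instruction */
	uint32_t ex_pcoqh;	/* PC offset, current instruction */
	uint32_t ex_pcoqt;	/* PC offset, next instruction */
	uint32_t ex_sr[5];	/* sr0 to sr4 */
	uint32_t ex_cr26;	/* user-visible, no known convention */
	uint32_t ex_cr27;	/* thread-local-storage register */
} ex_mcontext_t;
```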
is there anything that should be added to or removed from mcontext?
other comments? questions? complaints? if there are no objections
I'll commit this stuff this weekend.