Port-amd64 archive


Saving %gs and %fs over interrupts and syscalls

Having fixed the i386 'trap during return to user' I looked at the
amd64 code - I shouldn't have!

First some background reading:

In 64bit mode, amd64 ignores the offset and size of selectors %ds, %es,
%ss and %cs. The offset from %fs and %gs is added into the address
calculations (the size is not used).

However, when any of the segment registers are loaded, the full segment
descriptor is read (as for i386) and the instruction will trap/fault
if the segment number is invalid.
The descriptor contents will be used if the cpu returns to a 32bit app.

Since the descriptor format hasn't been changed, only a 32bit value
can be loaded from the %fs and %gs descriptors into the FS.Base and
GS.Base registers - so the segment overrides can only add in a 32bit offset.
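To make the 32bit limit concrete, here is a minimal sketch of how the base is packed into a legacy 8-byte descriptor. The function names are made up for illustration and this is not NetBSD code; only the bit positions come from the descriptor format.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the base from a legacy 8-byte segment descriptor.
 * Base bits 23:0 live in descriptor bits 39:16 and base bits 31:24
 * in descriptor bits 63:56, so the result can never exceed 32 bits.
 * (descriptor_base is a hypothetical name for this sketch.)
 */
static uint32_t descriptor_base(uint64_t desc)
{
    uint32_t low24 = (uint32_t)((desc >> 16) & 0xffffffu);
    uint32_t high8 = (uint32_t)((desc >> 56) & 0xffu);

    return low24 | (high8 << 24);
}

/* Inverse helper (also hypothetical): a descriptor with only the base set. */
static uint64_t descriptor_with_base(uint32_t base)
{
    uint64_t desc = ((uint64_t)(base & 0x00ffffffu) << 16) |
                    ((uint64_t)(base >> 24) << 56);

    assert(descriptor_base(desc) == base);  /* round-trip sanity check */
    return desc;
}
```

There is simply nowhere in the descriptor to put bits 63:32 of a base, which is why the MSRs are needed for a full 64bit value.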

The full 64bit FS.Base/GS.Base are accessible as MSRs c0000100 and c0000101.
MSR c0000102 is KernelGSbase and can be swapped with GS.Base with the
'swapgs' instruction. This is used on system call entry (via SYSCALL
instruction) in order to get a register which can point to the kernel
stack address (etc).
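As a rough model of the above (a sketch only: the macro and struct names are chosen for this example, and real code would use rdmsr/wrmsr and the swapgs instruction rather than a C struct), swapgs just exchanges two 64bit values:

```c
#include <assert.h>
#include <stdint.h>

/* MSR numbers as given above; the macro names are assumptions of this sketch. */
#define MSR_FSBASE        0xc0000100u
#define MSR_GSBASE        0xc0000101u
#define MSR_KERNELGSBASE  0xc0000102u

/* Hypothetical model of the two GS base values held by the cpu. */
struct gs_state {
    uint64_t gs_base;         /* what a %gs override currently adds in */
    uint64_t kernel_gs_base;  /* the value parked in KernelGSbase */
};

/* Model of 'swapgs': exchange GS.Base and KernelGSbase. */
static void model_swapgs(struct gs_state *s)
{
    uint64_t tmp = s->gs_base;

    s->gs_base = s->kernel_gs_base;
    s->kernel_gs_base = tmp;
}
```

The point to notice is that swapgs is its own inverse: it has to be executed exactly once on entry from user mode and once on exit, and never on a kernel-to-kernel trap.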

The NetBSD kernel only ever saves the %fs and %gs segment registers.
It doesn't save either of the FS.Base or GS.Base registers that might
need to be set by userspace.  I don't think anything 'normal' in NetBSD
tries to set these values, but they are probably used by Linux for
thread specific data - and NetBSD will probably need to do something similar.

It is possible that things like the JVM are trying to use Linux syscalls
to set these values - the fact that NetBSD fails to save/restore them
may be relevant to the failure of the JVM in NetBSD amd64.

I think it is necessary to save and restore the values of %fs, %gs, FS.Base
and GS.Base on system calls and interrupts.  This is rather problematic,
but it might work to restore FS.Base after %fs, provided that whenever the
kernel changes the %fs value that a process will restore, it also updates
the saved FS.Base!
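The ordering idea can be sketched with a toy model. Everything here (the struct and function names, and passing the descriptor base in as a parameter instead of reading it from a descriptor table) is an assumption made for illustration, not existing kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the cpu's %fs selector and hidden FS.Base. */
struct cpu_model {
    uint16_t fs;
    uint64_t fs_base;
};

/* Hypothetical per-process save area holding both values. */
struct saved_user_fs {
    uint16_t fs;
    uint64_t fs_base;
};

/* Model of loading %fs: FS.Base is replaced by the 32bit descriptor base. */
static void load_fs(struct cpu_model *cpu, uint16_t sel, uint32_t desc_base)
{
    cpu->fs = sel;
    cpu->fs_base = desc_base;
}

/* Return to user: selector first, then the full 64bit base on top of it. */
static void restore_user_fs(struct cpu_model *cpu,
                            const struct saved_user_fs *s, uint32_t desc_base)
{
    load_fs(cpu, s->fs, desc_base);
    cpu->fs_base = s->fs_base;  /* stands in for a write to the FS.Base MSR */
}

/* Kernel changes the %fs a process will restore: keep the saved base in step. */
static void kernel_set_user_fs(struct saved_user_fs *s, uint16_t sel,
                               uint32_t desc_base)
{
    s->fs = sel;
    s->fs_base = desc_base;
}
```

If the two writes in restore_user_fs were the other way round, the selector load would overwrite the 64bit base with the descriptor's 32bit one.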

Alternatively perhaps FS.Base should only be saved/restored when %fs is zero.
There may be some info in the Linux kernel or OpenSolaris.
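A minimal sketch of that alternative policy, with hypothetical names and no claim that this is how Linux or OpenSolaris do it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical save area: the base is only meaningful when base_valid is set. */
struct saved_fs {
    uint16_t sel;
    uint64_t base;
    bool     base_valid;
};

/* Save path: record FS.Base only when the %fs selector is zero. */
static struct saved_fs save_fs(uint16_t sel, uint64_t base)
{
    struct saved_fs s = { sel, 0, false };

    if (sel == 0) {
        s.base = base;
        s.base_valid = true;
    }
    return s;
}

/* Restore path: only rewrite FS.Base if one was saved. */
static bool must_rewrite_base(const struct saved_fs *s)
{
    return s->base_valid;
}
```

With a non-zero selector the base is left to come from the descriptor, so nothing extra needs saving in that case.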

The next problem arises if/when the processor traps loading the userspace
registers (the code I fixed for i386).

Faults loading the segment registers are relatively easy to recover from.
The register frame containing all the user registers still exists - so
can be used for the SIGSEGV handler and/or a reattempt of the return to
user after fixing something.

Faults on the 'iret' are seriously more problematic.
I have no reason to believe that they are impossible; I'd certainly
expect it to be easy to cause one when returning to 32bit mode.
(There is currently code that attempts to handle 'iret' trapping.)

Firstly, on any interrupt the cpu stacks %ss, %rsp, %rflags, %cs, %rip
(in that order). In kernel mode, %ss is always 0, so a trap from kernel
space will stack a 0 for %ss.
The NetBSD interrupt code doesn't bother saving or restoring %ds, %es, %fs
or %gs when the saved %ss is zero - on the assumption that the current
values must be the kernel ones, and so don't need to be saved/restored.

So when we fault on the iret, two things go wrong:
1) The user values for the segment registers aren't saved (or later restored)
2) swapgs isn't used, but %gs (and GS.Base) will have the user values
   not the kernel ones.
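The failure can be modelled in a few lines. This is a sketch under the assumptions stated in this mail (the entry code keys off the saved %ss being zero); the struct and function names are invented for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The five words stacked by the cpu, lowest address first. */
struct hw_frame_model {
    uint64_t rip, cs, rflags, rsp, ss;
};

/* What the cpu actually holds when the trap is taken (hypothetical model). */
struct cpu_seg_model {
    bool user_segs_loaded;     /* %ds/%es/%fs/%gs already hold user values */
    bool user_gs_base_active;  /* swapgs already done for return to user */
};

/* Entry-code decision as described above: saved %ss == 0 means 'from kernel'. */
static bool entry_treats_as_user(const struct hw_frame_model *f)
{
    return f->ss != 0;
}

/* True when user state is live but the entry path will neither save it nor swapgs. */
static bool entry_mishandles(const struct hw_frame_model *f,
                             const struct cpu_seg_model *c)
{
    return (c->user_segs_loaded || c->user_gs_base_active) &&
           !entry_treats_as_user(f);
}
```

A fault on the iret is the case where the frame has %ss of zero (the trap came from kernel code) while both flags in the cpu model are already true.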

I'm not sure sysret/sysexit can fault, but the prior segment register
restores can.  This is especially so if/when we support LDTs, since a
different thread can invalidate the segment, which makes the current
checks done when restoring mcontext all pointless.
The kernel stacks for sysenter and interrupts/traps are rather
different, but the trap handler code treats them the same - which
is probably wrong.


David Laight: david%l8s.co.uk@localhost
