tech-kern archive
amd64: smap support
Here is a patch [1] that implements SMAP (Supervisor Mode Access Prevention)
on amd64. SMAP prevents the kernel from accessing (reading or writing) a
userland page unless it explicitly asks to, and it's a great exploit
mitigation feature.
To function, it relies on two bits: CR4_SMAP in %cr4 and PSL_AC in %rflags.
With CR4_SMAP set and AC cleared, any kernel access to a userland page
generates a page fault, which is caught as fatal. When AC is set, such an
access succeeds without fault. The logic is that when the kernel wants to
touch a userland page (in copyin for example), it needs to set AC, and then
clear it once it's done.
Userland can set/clear AC as it wishes, because in usermode AC stands for
Alignment Check, which has nothing to do with SMAP. The main implication
of this design is that PSL_AC needs to be saved/restored when entering/leaving
the kernel; it becomes part of the kernel context.
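To make the rule above concrete, here is a small user-space model of it in C. It is only a sketch: struct cpu_model and smap_faults are names made up for this illustration, not part of the patch, and the model ignores details such as implicit supervisor accesses. The bit values are the architectural ones (CR4 bit 21, RFLAGS bit 18).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR4_SMAP 0x00200000ULL	/* CR4 bit 21 */
#define PSL_AC   0x00040000ULL	/* RFLAGS bit 18 */

/* Hypothetical model of the CPU state that matters for SMAP. */
struct cpu_model {
	uint64_t cr4;
	uint64_t rflags;
	int cpl;		/* 0 = kernel, 3 = userland */
};

/*
 * Returns true if a data access to the page would be refused by SMAP.
 * Simplified: only explicit data accesses are modeled.
 */
static bool
smap_faults(const struct cpu_model *cpu, bool page_is_user)
{
	assert(cpu != NULL);

	if (cpu->cpl == 3)
		return false;	/* in usermode AC only means alignment check */
	if ((cpu->cr4 & CR4_SMAP) == 0)
		return false;	/* SMAP not enabled */
	if (!page_is_user)
		return false;	/* kernel pages are unaffected */
	return (cpu->rflags & PSL_AC) == 0;
}
```

In this model, a kernel context with AC clear faults on a userland page, and the same context with AC set does not; userland is never affected.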
The patch works as follows:
* two functions are added, smap_enable and smap_disable. The former clears
PSL_AC, the latter sets it.
* these two functions use the clac and stac instructions, which do not
exist on CPUs that don't support SMAP. Therefore, each function is built
with a ret+int3+int3 placeholder where the instruction belongs, and the
placeholder is hot-patched with the real instruction at boot time if SMAP
is supported.
* smap_enable is called from INTRENTRY. Therefore, whenever we enter the
kernel, we cannot access a userland page - which is the point here.
* when leaving the kernel, %rflags is already restored entirely, whether
through sysretq or iretq. So the AC bit of the previous context is put back
properly.
* in the copy* functions, smap_disable and smap_enable are called to open
a window where the kernel can touch userland pages. Such a window looks
like:
    callq smap_disable
    /* touch userland page */
    callq smap_enable
If an interrupt or exception is received in this window, a new context is
pushed by INTRENTRY, and it won't have PSL_AC set. The trap handler will
return into the original context but will jump to a recovery function. In
this recovery function, we are back with PSL_AC set, so we call smap_enable
to clear it and return an error.
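The copy window and its recovery path can be sketched in user-space C as follows. This is a model under loud assumptions, not the code from the patch: model_rflags stands in for %rflags, setjmp/longjmp stand in for the trap handler redirecting execution to the recovery function, a NULL user address stands in for a faulting page, and model_trap, model_copyin, handler_saw_ac and recovery_saw_ac are names invented for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PSL_AC 0x00040000ULL

static uint64_t model_rflags;	/* stands in for %rflags */
static jmp_buf onfault;		/* stands in for the recovery address */
static bool handler_saw_ac;	/* was AC set while the handler ran? */
static bool recovery_saw_ac;	/* was AC set on entry to recovery? */

static void smap_enable(void)  { model_rflags &= ~PSL_AC; }	/* clac */
static void smap_disable(void) { model_rflags |= PSL_AC; }	/* stac */

/* Models a fault taken inside the copy window. */
static void
model_trap(void)
{
	uint64_t saved = model_rflags;	/* interrupted context is saved */

	smap_enable();			/* as INTRENTRY does on kernel entry */
	handler_saw_ac = (model_rflags & PSL_AC) != 0;
	model_rflags = saved;		/* iretq restores the old %rflags */
	longjmp(onfault, 1);		/* resume in the recovery function */
}

static int
model_copyin(const void *uaddr, void *kaddr, size_t len)
{
	if (setjmp(onfault) != 0) {
		/* Recovery path: back in the old context, AC still set. */
		recovery_saw_ac = (model_rflags & PSL_AC) != 0;
		smap_enable();
		return EFAULT;
	}
	smap_disable();			/* open the window */
	if (uaddr == NULL)
		model_trap();		/* pretend the access faulted */
	memcpy(kaddr, uaddr, len);	/* touch userland page */
	smap_enable();			/* close the window */
	return 0;
}
```

In both the success path and the error path of this model, AC is clear again when model_copyin returns, and the handler itself never runs with AC set.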
This way, PSL_AC is set exclusively in the copy windows. On CPUs that don't
support SMAP, smap_enable and smap_disable return directly. The performance
cost in this case is a call+ret, so one stack write (the return address),
one stack read and potentially an icache line load.
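The hot-patching step mentioned in the list above, and the unpatched case where the stubs return directly, can be modeled on plain byte arrays. This is a sketch only: the arrays, model_hotpatch and model_smap_init are hypothetical names, and the trailing ret after the three patched bytes is an assumption about the stub layout. The opcode bytes themselves are the architectural encodings (ret = C3, int3 = CC, clac = 0F 01 CA, stac = 0F 01 CB).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_LEN 3	/* clac and stac are both three bytes long */

static const uint8_t clac_bytes[PATCH_LEN] = { 0x0F, 0x01, 0xCA };
static const uint8_t stac_bytes[PATCH_LEN] = { 0x0F, 0x01, 0xCB };

/*
 * Model of the stub text: ret; int3; int3, then a final ret (assumed).
 * Unpatched, the first ret returns immediately.
 */
static uint8_t smap_enable_text[4]  = { 0xC3, 0xCC, 0xCC, 0xC3 };
static uint8_t smap_disable_text[4] = { 0xC3, 0xCC, 0xCC, 0xC3 };

/* Overwrite the placeholder with the real instruction. */
static void
model_hotpatch(uint8_t *text, const uint8_t *insn, size_t len)
{
	assert(len == PATCH_LEN);
	memcpy(text, insn, len);
}

/* Boot-time step: patch only if the CPU reports SMAP. */
static void
model_smap_init(bool cpu_has_smap)
{
	if (!cpu_has_smap)
		return;		/* stubs keep returning directly */
	model_hotpatch(smap_enable_text, clac_bytes, PATCH_LEN);
	model_hotpatch(smap_disable_text, stac_bytes, PATCH_LEN);
}
```

Since the placeholder and the real instruction have the same length, the patch never has to move the final ret, which is presumably why a three-byte placeholder was chosen.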
Notes:
* there are a few places where smap_* is called twice. This could be
optimized, but the patch is kept simple for now.
* on Xen, SMAP (and SMEP) are not enabled, because they are used by the
hypervisor to protect itself from dom kernels (us).
* i386 requires a little more work, so I'm not adding SMAP there yet.
I've tested this patch mostly on QEMU - my most recent CPU only has SMEP.
Feel free to test it too, and I'll commit it in a few days, perhaps with a
few modifications.
Maxime
[1] http://m00nbsd.net/garbage/smap/amd64.diff