tech-kern archive


Re: amd64: smap support

On 23.08.2017 16:24, Maxime Villard wrote:
> Here is a patch that implements SMAP on amd64. SMAP is basically a feature
> that prevents the kernel from reading or writing a userland page, and it's
> a great exploit mitigation feature.
> To function, it relies on two bits: CR4_SMAP in %cr4 and PSL_AC in %rflags.
> When AC is cleared, any access to a userland page will generate a page fault,
> which is caught as fatal. When AC is set, such an access will succeed without
> fault. The logic is that when the kernel wants to touch a userland page (in
> copyin for example), it needs to set AC, and then clear it once it's done.
> Userland can set/clear AC as it wishes, because in usermode AC stands for
> the Alignment Check, which has nothing to do with smap. The main implication
> of this design is that PSL_AC needs to be saved/restored when entering/leaving
> the kernel; it becomes part of the kernel context.
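
To make the rule above concrete, here is a minimal C model of the check, not kernel code. The function name smap_would_fault is made up for illustration. The bit positions are the architectural ones (CR4 bit 21, RFLAGS bit 18). The model only covers explicit supervisor-mode data accesses and ignores everything else (implicit accesses, CPL 3, page permissions).

```c
#include <stdbool.h>
#include <stdint.h>

#define CR4_SMAP 0x00200000u	/* CR4 bit 21: SMAP enable */
#define PSL_AC   0x00040000u	/* RFLAGS bit 18: AC */

/*
 * Hypothetical helper: returns true if an explicit supervisor-mode data
 * access to a page would fault because of SMAP.
 */
static bool
smap_would_fault(uint64_t cr4, uint64_t rflags, bool page_is_user)
{
	if ((cr4 & CR4_SMAP) == 0)
		return false;		/* SMAP not enabled */
	if (!page_is_user)
		return false;		/* kernel pages are unaffected */
	return (rflags & PSL_AC) == 0;	/* AC clear: access is blocked */
}
```

With SMAP enabled, a user page faults only while AC is clear, which is why AC has to be treated as part of the kernel context.
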
> The patch works as follows:
>  * two functions are added, smap_enable and smap_disable. The former clears
>    PSL_AC, the latter sets it.
>  * these two functions use the clac and stac instructions, which do not
>    exist on CPUs that don't support smap. Therefore, a ret+int3+int3 opcode
>    is crafted, and it is hot-patched at boot time if smap is supported.
>  * smap_enable is called from INTRENTRY. Therefore, whenever we enter the
>    kernel, we cannot access a userland page - which is the point here.
>  * when leaving the kernel %rflags is already restored entirely, whether it
>    is in sysretq or iretq. So the AC bit in the previous context is put back
>    properly.
>  * in the copy* functions, smap_disable and smap_enable are called to open
>    a window where the kernel can touch userland pages. Such a window looks
>    like:
>        callq   smap_disable
>        /* touch userland page */
>        callq   smap_enable
>    If an interrupt or exception is received in this window, a new context is
>    pushed by INTRENTRY, and it won't have PSL_AC set. The trap handler will
>    return into the original context but will jump into a recovery function.
>    In this recovery function, we are back with PSL_AC set, and call
>    smap_enable to clear it and return an error.
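
The copy window and its recovery path can be sketched in userland C as follows. This is only a model under stated assumptions: model_rflags stands in for %rflags, the model_* names are hypothetical, and the inject_fault flag replaces a real page fault and the trap handler's jump to the recovery code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PSL_AC 0x00040000u	/* RFLAGS bit 18 */

static uint64_t model_rflags;	/* stand-in for %rflags in this model */

/* Models stac: open the window, user pages become accessible. */
static void model_smap_disable(void) { model_rflags |= PSL_AC; }
/* Models clac: close the window, user pages are protected again. */
static void model_smap_enable(void) { model_rflags &= ~(uint64_t)PSL_AC; }

/* Hypothetical model of a copyin-like function. */
static int
model_copyin(const void *uaddr, void *kaddr, size_t len, bool inject_fault)
{
	model_smap_disable();
	if (inject_fault) {
		/*
		 * Recovery path: we come back into the original context,
		 * so AC is still set and must be cleared before returning.
		 */
		model_smap_enable();
		return EFAULT;
	}
	memcpy(kaddr, uaddr, len);	/* touch the "userland" page */
	model_smap_enable();
	return 0;
}
```

Both exits leave AC clear, which is the invariant the description relies on: PSL_AC is set only inside the window.
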
> This way, PSL_AC is set exclusively in the copy windows. On CPUs that don't
> support SMAP, smap_enable and smap_disable return directly. The performance
> cost in this case is a call+ret, so one write, one read and potentially an
> icache line load.
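
The hot-patching idea can be illustrated with a byte-level model in C. The opcode bytes are the architectural encodings (ret = C3, int3 = CC, clac = 0F 01 CA, stac = 0F 01 CB). Everything else is an assumption for illustration: model_hotpatch is a made-up name, and operating on a plain byte buffer stands in for patching kernel text, which in reality needs more care than a memcpy.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define STUB_LEN 3	/* clac and stac are both 3 bytes long */

/* ret; int3; int3: the function returns at once on CPUs without SMAP */
static const uint8_t stub_nosmap[STUB_LEN] = { 0xC3, 0xCC, 0xCC };
static const uint8_t insn_clac[STUB_LEN] = { 0x0F, 0x01, 0xCA };
static const uint8_t insn_stac[STUB_LEN] = { 0x0F, 0x01, 0xCB };

/*
 * Hypothetical model: overwrite the 3-byte stub with the real
 * instruction only when the CPU advertises SMAP.
 */
static void
model_hotpatch(uint8_t *text, const uint8_t *insn, bool cpu_has_smap)
{
	if (cpu_has_smap)
		memcpy(text, insn, STUB_LEN);
}
```

The stub and the real instruction have the same length, so in this model the patch never shifts the bytes that follow it. Assuming a trailing ret sits right after the stub, the patched function runs clac or stac and then returns normally.
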
> Notes:
>  * there are a few places where smap_* is called twice. This could be
>    optimized, but the patch is kept simple for now.
>  * on Xen, smap (and smep) are not enabled, because they are used by the
>    hypervisor to protect itself from dom kernels (us).
>  * i386 requires a little more work, so I'm not adding smap there yet.
> I've tested this patch mostly on Qemu - my most recent CPU only has smep.
> Feel free to test it too, and I'll commit it in a few days, perhaps with a
> few modifications.
> Maxime
> [1]

For reference, there was unfinished work by Mateusz Kocielski (shm):

