On 23.08.2017 16:24, Maxime Villard wrote:
> Here is a patch that implements SMAP on amd64. SMAP is basically a feature
> that prevents the kernel from reading a userland page, and it's a great
> exploit mitigation feature.
>
> To function, it relies on two bits: CR4_SMAP in %cr4 and PSL_AC in %rflags.
> When AC is cleared, any access to a userland page will generate a page
> fault, which is caught as fatal. When AC is set, such an access will
> succeed without a fault. The logic is that when the kernel wants to touch
> a userland page (in copyin for example), it needs to set AC, and then
> clear it once it's done.
>
> Userland can set/clear AC as it wishes, because in usermode AC stands for
> the Alignment Check, which has nothing to do with SMAP. The main
> implication of this design is that PSL_AC needs to be saved/restored when
> entering/leaving the kernel; it becomes part of the kernel context.
>
> The patch works as follows:
> * Two functions are added, smap_enable and smap_disable. The former
>   clears PSL_AC, the latter sets it.
> * These two functions use the clac and stac instructions, which do not
>   exist on CPUs that don't support SMAP. Therefore, a ret+int3+int3
>   opcode sequence is crafted, and it is hot-patched at boot time if SMAP
>   is supported.
> * smap_enable is called from INTRENTRY. Therefore, whenever we enter the
>   kernel, we cannot access a userland page - which is the point here.
> * When leaving the kernel, %rflags is restored entirely, whether via
>   sysretq or iretq, so the AC bit of the previous context is put back
>   properly.
> * In the copy* functions, smap_disable and smap_enable are called to open
>   a window where the kernel can touch userland pages. Such a window looks
>   like:
>
>       callq   smap_disable
>       /* touch userland page */
>       callq   smap_enable
>
>   If an interrupt or exception is received in this window, a new context
>   is pushed by INTRENTRY, and it won't have PSL_AC set. The trap handler
>   will return into the original context but will jump to a recover
>   function. In this recover function, we are back with PSL_AC set, and we
>   call smap_enable to clear it and return an error.
>
> This way, PSL_AC is set exclusively in the copy windows. On CPUs that
> don't support SMAP, smap_enable and smap_disable return directly. The
> performance cost in this case is a call+ret, so one write, one read and
> potentially an icache line load.
>
> Notes:
> * There are a few places where smap_* is called twice. This could be
>   optimized, but the patch is kept simple for now.
> * On Xen, SMAP (and SMEP) are not enabled, because they are used by the
>   hypervisor to protect itself from dom kernels (us).
> * i386 requires a little more work, so I'm not adding SMAP there yet.
>
> I've tested this patch mostly on QEMU - my most recent CPU only has SMEP.
> Feel free to test it too, and I'll commit it in a few days, perhaps with
> a few modifications.
>
> Maxime
>
> [1] http://m00nbsd.net/garbage/smap/amd64.diff

For reference, there was unfinished earlier work by Mateusz Kocielski (shm):

http://netbsd.org/~shm/smap.diff
http://wiki.netbsd.org/projects/project/x86_smap_smep/
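To make the mechanism a bit more concrete for anyone reviewing the diff,
here is my reading of the three main pieces as sketches - these are
inferred from the description above, not copied from the patch, so refer
to [1] for the real code.

First, the boot-time side. SMAP is advertised in CPUID leaf 7 (EBX bit
20), and enforcing it is just a matter of setting CR4_SMAP (bit 21 of
%cr4) somewhere in the CPU init path:

	/* After CPUID leaf 7 has reported SMAP (EBX bit 20): */
	movq	%cr4,%rax
	orq	$CR4_SMAP,%rax		/* bit 21: enforce SMAP from here on */
	movq	%rax,%cr4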
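Second, the hot-patched stubs. clac (0f 01 ca) and stac (0f 01 cb) are
exactly three bytes, so the default ret+int3+int3 body can be overwritten
in place at boot, after which execution falls through to a trailing ret.
Presumably something like:

ENTRY(smap_enable)
	ret			/* hot-patched to clac (0f 01 ca) */
	int3
	int3
	ret			/* reached once the patch is applied */
END(smap_enable)

ENTRY(smap_disable)
	ret			/* hot-patched to stac (0f 01 cb) */
	int3
	int3
	ret
END(smap_disable)

On a CPU without SMAP the first ret is hit immediately, which is where
the call+ret cost quoted above comes from; the int3s presumably guarantee
a trap rather than garbage execution if the patch site were ever
half-applied.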
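Third, the copy window and its recover function. The layout implied by
the last bullet is the usual onfault dance plus one extra smap_enable
before returning the error (the label and error value here are
illustrative, not taken from the patch):

	callq	smap_disable
	/* touch userland page; a fault here redirects to copy_fault */
	callq	smap_enable
	...

copy_fault:
	callq	smap_enable	/* back with PSL_AC set: clear it */
	movq	$EFAULT,%rax
	ret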