Source-Changes-D archive


Re: CVS commit: src or a tale on NetBSD/usermode



On 22.12.2011 13:32, Reinoud Zandijk wrote:
> I understand your confusion on this point. It's due to the way NetBSD/usermode
> is built and why it is built that way. The main goals/features, for me at
> least, and even though some were formulated along the way, come back to:
> 
> - it should behave like a separate (though virtual) machine.
> - there should be no difference between operating and developing in a
>   NetBSD/usermode and a normal NetBSD kernel as much as possible.
> - it should be usable for kernel development for as many subsystems as
>   possible.
> - it should be portable to, or just run on, every POSIX machine.
>[snip]
> After the memory has been set up it then attaches devices, like a virtual cpu
> and a ld(4) driver for a disk image. After the attachments, NetBSD/usermode
> loads and starts init(8) from *within* its own memory space.

Okay, as a design decision this is questionable. If the intent is to
behave like a separate virtual machine and be as consistent as possible
with a native kernel, the usermode kernel ought to run in its own
address space:

- catching userland address dereferences and missing/incomplete
copyin/copyout is easier (otherwise you have to get all your
mprotect(2)/mmap(2) calls right)

- with the kernel not mapped in the userland address space, there is no
risk of kernel memory being readable from userland (otherwise, again,
you have to get your mprotect(2) calls right...)

- there is prior work on this: Antti's rumphijack, and IIRC the latest
versions of UML have separate userland/kernel spaces. Perhaps
vkernel from Dragonfly too. Dunno if these are really MI though.

Please note that adding non-POSIX flags to mmap(2) makes the whole thing
less portable. If usermode expects to run properly in any POSIX
environment, you have to remain largely POSIX compliant. The
MMAP_NOSYSCALLS part is not.

> Externalizing the userland processes would not only violate some of the goals
> but would also create a potential logistical nightmare. This would also create
> a distributed system rather than a NetBSD usermode kernel. A whole new project
> that would be fun to do, but out of scope. It could include process migration
> between machines, network transport, caching and proxies etc. etc.

How so? Having separate kernel/userland spaces has prior art. See above.

> To manage running native binaries, it needed help from the kernel and thus
> this patch arose. With it regions of memory could be designated as
> `not-for-systemcalls'. It could be argued that a single virtual memory
> range setting function for this purpose could be used but that would make it a
> very tailored solution and not the general purpose one it is now.

It needed help from what kernel?

Your patch + explanations make me think that the NOSYSCALLS regions
have to be set by the usermode kernel. Which means that NetBSD/usermode
relies on a mechanism not offered by other POSIX kernels out there;
there's approx. 0 chance that this will ever be accepted on other
operating systems.

>>> On the enhancing security argument, malicious source code could trigger
>>> compiler bugs that allow for code to be modified or otherwise manipulated
>>> to issue system calls where they shouldn't. Although it wouldn't necessarily
>>> pose a system security issue, it could be used for extracting info or for
>>> malicious behaviour where with the patch it would simply bomb out.
>>
>> That's the part I have trouble with. It looks like a weaker form of W^X (or
>> PaX's mprotect), and I can't see the "additional" security benefits.
> 
> I've looked into that too, well mprotect() in particular. Even though the
> manpage tells it can explicitly allow for execution, lots of pmap
> implementations warn that their architectures can't distinguish between
> reading and executing permissions since their memory management modules simply
> don't distinguish between the two. More importantly, the code DOES need to
> execute in the mappings; only system calls are to be prohibited. Elaborate
> single-stepping and/or code analysing and replacing to find those instructions
> could be used but at what costs? Code might be interrupted with some
> constants, code might not start at byte 0, etc. etc. Heuristics are then to be
> used at best.

Again, I do believe that a correct approach would be separate address
spaces + per-process "raise SIGILL on syscall" (ptrace maybe?), instead
of implementing non-portable logic inside the NetBSD kernel.

I know that there are multiple ports that have separate address spaces
for kernel and userland, but they rely on MD machinery to work properly
(the amd64 Xen port being one of them).

>> Malicious code is free to trigger compiler bugs that can make calls to valid
>> memory areas. If you manage to plant an "int 0x80" in a MMAP_NOSYSCALLS
>> executable region, just make it a "call __syscall" instead. At the expense
>> of a few more arguments, you will get the same result.
> 
> It depends on the implementation. Do you f.e. allow the linkage of this code
> to functions outside a designated list, or outside a designated area? If it
> manages to find __syscall by itself in its host program and patch up a direct
> call to that constant then yes it could call the OS. Static linking and
> strip(1) is your friend then.
>
> In NetBSD/usermode it would then still only be
> able to call the NetBSD/usermode kernel and not the host kernel.

I think that this needs clarification. Please correct my mistakes:
- the NetBSD/usermode kernel is started as a userland process, and uses
the POSIX API to set up its environment.
- it then proceeds to set up the userland memory regions with the
MMAP_NOSYSCALLS flag, so userland cannot make direct syscalls to the
host kernel.
- it passes execution to init(8) and userland.
- whenever userland code makes a direct syscall, this raises a SIGILL,
which gives the usermode kernel a chance to handle the userland
syscall.

Right? So how can you implement the MMAP_NOSYSCALLS step on other POSIXy
systems?

Merry Christmas to you (and everyone else too) :)

-- 
Jean-Yves Migeon
jeanyves.migeon%free.fr@localhost

