Re: CVS commit: src or a tale on NetBSD/usermode

To: jean-Yves Migeon <jym%NetBSD.org@localhost>
Subject: Re: CVS commit: src or a tale on NetBSD/usermode
From: Reinoud Zandijk <reinoud%NetBSD.org@localhost>
Date: Thu, 22 Dec 2011 13:32:59 +0100

Hi Jean-Yves,

On Wed, Dec 21, 2011 at 07:55:45PM +0100, jean-Yves Migeon wrote:
> On Wed, 21 Dec 2011 16:47:49 +0100, Reinoud Zandijk wrote:
> >The patch is written to allow for multiple non-UVM flags to be attached to
> >mappings and allow the kernel to react on them. NetBSD/usermode uses this
> >to disallow system calls to be made from within mapped regions and get them
> >returned as illegal instructions so it can analyse and emulate the system
> >calls. To prevent every process to be scrutinized this way a process flag
> >has been introduced to mark if a process needs this check since the
> >detection involves acquiring a lock to walk the uvm map.
> 
> Why make this a memory-level property, and not a process-level property? If
> you want to proxy syscalls between host and usermode kernel, why make it
> exclusive to certain mem regions? I am probably missing something with the
> way usermode processes, usermode kernel and host kernel interact.

I understand your confusion on this point. It's due to the way NetBSD/usermode
is built and why it is built that way. The main goals/features, for me at
least, and even though some were formulated along the way, come down to:

- it should behave like a separate (though virtual) machine.
- as far as possible, there should be no difference between operating and
  developing in a NetBSD/usermode kernel and a normal NetBSD kernel.
- it should be usable for kernel development for as many subsystems as
  possible.
- it should be portable to, or just run on, every POSIX machine.

The NetBSD/usermode kernel is built to run and behave like a normal program,
but it is built just like a normal NetBSD port. On startup, it sets up memory
areas just like normal MD kernel code does and initialises a pmap, UVM and
other subsystems just like normal kernels do. It uses mmap(2) to provide an as
`real' as possible virtual memory system without needing to know a thing about
its target architecture or memory peculiarities. Compare pkgsrc's wine, which
relies on user LDTs and is thus only suitable for i386/amd64. NetBSD/usermode
should also be able to run on ARM, Sparc, PowerPC, HPPA, SH4 etc.

After the memory has been set up it then attaches devices, like a virtual cpu
and a ld(4) driver for a disk image. After the attachments, NetBSD/usermode
loads and starts init(8) from *within* its own memory space.

At this point the confusion starts, when the loader of init(8) begins to issue
system calls. Without intervention, those system calls go to the host OS that
runs the NetBSD/usermode kernel, resulting in all kinds of mayhem.

Externalizing the userland processes would not only violate some of the goals
but would also create a potential logistical nightmare. This would also create
a distributed system rather than a NetBSD usermode kernel. A whole new project
that would be fun to do, but out of scope. It could include process migration
between machines, network transport, caching and proxies etc. etc.

Internalizing the userland processes is closer to the goals. The main problem
with internal userland processes is determining where a system call comes
from: is it the NetBSD/usermode kernel itself or a userland process running
inside it?

To distinguish the two, we first tried to use ptrace(2) to intercept them. This
ptrace solution turned out to be quite a hack and never worked, since we
stumbled on lots of NetBSD bugs involving signals and on the fact that ptrace
was never designed to be a gateway between the kernel and a userland process,
but more a snooper.

A feasible solution turned out to be a tailored usermode userland.
Recognizing that the kernel is only called through two macros in libc, I
patched the macros to not emit the system call instructions but to generate
dedicated and detectable illegal instructions. The userland code would thus
not call the kernel but raise a SIGILL that the NetBSD/usermode kernel can
catch, detect and process as if it got a system call from the userland.

So far, the usermode code could well run on every POSIX system (with some
porting of course), but could not run stock NetBSD binaries, only the tailored
ones.

To manage running native binaries, it needed help from the kernel, and thus
this patch arose. With it, regions of memory can be designated as
`not-for-systemcalls'. It could be argued that a single function that sets
this property on a virtual memory range would do, but that would make it a
very tailored solution and not the general purpose one it is now.
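
As a rough model of the check (not the patch itself), assume each mapping
carries a flag word and the process carries a flag saying whether the check is
needed at all. All names below (REGION_NOSYSCALLS, PROC_CHECK_SYSCALLS, struct
region, struct proc_model, syscall_allowed) are hypothetical, and a plain
array stands in for the locked uvm map walk:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REGION_NOSYSCALLS   0x1u /* hypothetical per-mapping flag */
#define PROC_CHECK_SYSCALLS 0x1u /* hypothetical per-process flag */

/* One mapping, covering the half-open range [start, end). */
struct region {
    uintptr_t start;
    uintptr_t end;
    unsigned  flags;
};

/* Simplified process: its flags plus a flat list of mappings. */
struct proc_model {
    unsigned             flags;
    const struct region *map;
    size_t               nregions;
};

/* Decide whether a system call issued at address pc may proceed. */
static bool syscall_allowed(const struct proc_model *p, uintptr_t pc)
{
    assert(p != NULL);

    /* Fast path: processes without the flag never pay for the walk. */
    if ((p->flags & PROC_CHECK_SYSCALLS) == 0)
        return true;

    for (size_t i = 0; i < p->nregions; i++) {
        const struct region *r = &p->map[i];
        if (pc >= r->start && pc < r->end)
            return (r->flags & REGION_NOSYSCALLS) == 0;
    }

    /* Assumption: an address outside every mapping is not this check's
     * business and is left to normal fault handling. */
    return true;
}
```

A refused call would then be reported back as an illegal instruction, so the
SIGILL path used for the tailored userland also works for native binaries.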

> >On the enhancing security argument, malicious source code could trigger
> >compiler bugs that allow for code to be modified or otherwise manipulated
> >to issue system calls where they shouldn't. Although it wouldn't necessarily
> >pose a system security issue, it could be used for extracting info or for
> >malicious behaviour where with the patch it would simply bomb out.
> 
> That's the part I have trouble with. It looks like a weaker form of W^X (or
> PaX's mprotect), and I can't see the "additional" security benefits.

I've looked into that too, well, mprotect() in particular. Even though the
manpage says it can explicitly allow execution, lots of pmap implementations
warn that their architectures can't distinguish between read and execute
permissions, since their memory management units simply don't distinguish
between the two. More importantly, the code DOES need to execute in the
mappings; only system calls are to be prohibited. Elaborate single-stepping
and/or code analysis and replacement to find those instructions could be used,
but at what cost? Code might be interspersed with constants, code might not
start at byte 0, etc. etc. Heuristics are then the best one can do.

> Malicious code is free to trigger compiler bugs that can make calls to valid
> memory areas. If you manage to plant a "int 0x80" in a MMAP_NOSYSCALLS
> executable region, just make it to a "call __syscall".  At the expense of a
> few more arguments, you will get the same result.

It depends on the implementation. Do you, for example, allow this code to be
linked to functions outside a designated list, or outside a designated area?
If it manages to find __syscall by itself in its host program and patch up a
direct call to that constant, then yes, it could call the OS. Static linking
and strip(1) are your friends then. In NetBSD/usermode it would then still
only be able to call the NetBSD/usermode kernel and not the host kernel.

> >Hope this answers most of your questions.
> 
> Waiting for mines :)

Hope this clarifies some :)

With regards,
Reinoud
