tech-kern archive


Re: NetBSD/usermode (Was: CVS commit: src)



On Thu, Dec 22, 2011 at 08:22:17PM +0400, Valeriy E. Ushakov wrote:
> On Wed, Dec 21, 2011 at 16:47:49 +0100, Reinoud Zandijk wrote:
> 
> > From the beginning of the usermode project, we struggled with the
> > fact that system calls in usermode's userland will go to the wrong
> > kernel [...]
> 
> Because you chose to run userland code in the same process with the
> usermode kernel code, which is a rather controversial design choice,
> IMO.  E.g. how do you prevent userland code from accessing pages it's
> not meant to access, i.e. how do you emulate MMU?

I have wondered about this.  The approach Jared and Reinoud described to
me by chat about a week ago sounded like it would probably work, but I
was left wondering why a more explicit use of the Unix process model to
provide VM system contexts was not chosen -- it seems to me both safer
and simpler.

But there are some nasty corner cases.  Consider a seemingly simple
approach like:

        1) The usermode kernel is in its own process on the outer kernel.

        2) The usermode userland processes are in their own Unix processes
           on the outer kernel.

        3) The usermode userland processes run under a restrictive "emulation"
           on the outer kernel which prohibits all system calls.
           Producer/consumer rings in shared memory are used for system
           calls to the usermode kernel, much as Xen does (see below).

           This *forces* all I/O through the usermode kernel
           and prohibits all system calls to the outer kernel.

        4) When a new process is created on the usermode kernel, the usermode
           kernel process does this by forking a new process on the outer
           kernel and adjusting its memory mappings appropriately,
           including MAP_SHARED space for system call rings.  Then the
           child executes a new system call that irrevocably prohibits it
           from making any further system calls, and jumps to the usermode
           process's code.
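
To make steps 3 and 4 concrete, here is a minimal sketch in C of a
single-producer/single-consumer request ring in MAP_SHARED memory, with a
forked child posting requests and the parent draining them.  Everything
named here (struct sc_req, struct sc_ring, ring_put, ring_get, run_guest)
is made up for illustration; it is not the usermode port's code, and it is
not Xen's ring layout.  The "prohibit all system calls" call does not
exist, so the sketch only marks where it would go.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* glibc: expose MAP_ANON in strict modes */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RING_SLOTS 8u
static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0,
    "RING_SLOTS must be a power of two so indices survive wraparound");

/* Hypothetical request format: a syscall number and one argument. */
struct sc_req {
        uint32_t number;
        uint64_t arg;
};

/* One ring per guest process; lives in MAP_SHARED memory. */
struct sc_ring {
        _Atomic uint32_t head;  /* advanced only by the producer (guest) */
        _Atomic uint32_t tail;  /* advanced only by the consumer (kernel) */
        struct sc_req slot[RING_SLOTS];
};

/* Producer side.  Returns 0 on success, -1 if the ring is full. */
static int
ring_put(struct sc_ring *r, struct sc_req req)
{
        uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
        uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

        if (head - tail == RING_SLOTS)
                return -1;
        r->slot[head % RING_SLOTS] = req;
        /* Publish the slot contents before the new head becomes visible. */
        atomic_store_explicit(&r->head, head + 1, memory_order_release);
        return 0;
}

/* Consumer side.  Returns 0 on success, -1 if the ring is empty. */
static int
ring_get(struct sc_ring *r, struct sc_req *out)
{
        uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
        uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

        if (head == tail)
                return -1;
        *out = r->slot[tail % RING_SLOTS];
        /* Release the slot back to the producer. */
        atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
        return 0;
}

/*
 * Fork a "guest" that posts nreq requests with args 1..nreq; the parent
 * plays the usermode kernel and drains them.  Returns the sum of the
 * args it received, or -1 if mmap() or fork() fails.
 */
long
run_guest(uint32_t nreq)
{
        struct sc_ring *r = mmap(NULL, sizeof(*r), PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_ANON, -1, 0);
        if (r == MAP_FAILED)
                return -1;
        atomic_init(&r->head, 0);
        atomic_init(&r->tail, 0);

        pid_t pid = fork();
        if (pid < 0) {
                munmap(r, sizeof(*r));
                return -1;
        }
        if (pid == 0) {
                /* The hypothetical "no more system calls" call goes here. */
                for (uint32_t i = 0; i < nreq; i++) {
                        struct sc_req req = { .number = 4, .arg = i + 1 };
                        while (ring_put(r, req) != 0)
                                ;       /* ring full: spin, no syscalls */
                }
                /* The demo still exits via a system call; a fully
                 * locked-down guest could not do even this. */
                _exit(0);
        }

        long sum = 0;
        uint32_t seen = 0;
        while (seen < nreq) {
                struct sc_req req;
                if (ring_get(r, &req) == 0) {
                        sum += (long)req.arg;
                        seen++;
                }
        }
        waitpid(pid, NULL, 0);
        munmap(r, sizeof(*r));
        return sum;
}
```

Both sides busy-wait here only to keep the sketch short; a real design
would need some way for the guest to block and for the kernel to be
notified without the guest making a host system call.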

This all seems simple and elegant enough, but it does not (quite) work:

        A) It still requires a new system call on the outer kernel.
           *Perhaps* this could be avoided by using ptrace, which might
           be simpler with this approach because the rule is simple: just
           say no to all system calls.

        B) There is no way for the usermode userspace process to allocate
           memory.  I don't really see a clean way to fix this:

                1) You can't naively let the child processes call sbrk()
                   and mmap() themselves -- the usermode kernel needs to
                   know what they're doing!  This would imply callbacks of
                   some kind from the host kernel to the usermode kernel;
                   not so portable, to say the least.

                2) Using ptrace to allow, but validate, sbrk and mmap
                   arguments seems questionable at best.  How would this
                   interact with the NetBSD VM system in the usermode kernel?

                3) Shared memory between usermode processes (particularly
                   unrelated ones) seems difficult to handle, to say the
                   least.

                Perhaps 1-3 could be addressed by adding one system call to
                the outer kernel, to allow process A to manipulate memory
                mappings in process B.  Still, not portable by a long shot.

                4) How exactly does the usermode kernel _end_ the usermode
                   userspace processes in a clean way?

Anyway, that's just one possible alternate approach.  Working through it
makes me really wonder whether there's _any_ portable way to do this stuff.
But I wish we could have a public discussion about it before hacking up
public interfaces in NetBSD to support one particular nonportable way.

What does usermode Linux do?

Thor

