Re: Lightweight virtualization - the rump approach
On Sat May 15 2010 at 22:41:29 +0200, Jean-Yves Migeon wrote:
> Let's take an example; suppose that $somefs_support is integrated in
> NetBSD (like ZFS, NILFS, ...), what kind of additional work is needed to
> go from mount_somefs to rump_somefs (thinking about all _KERNEL
> functions that are missing).
Well, first of all you're falling into the microkernel two-faceted trap
again with your example ;)
rump_somefs is the *userspace* part. Actually, all the rump_somefs are
somewhat misnamed. I think p2k_somefs would be a better name, but it's
not really worth changing anymore.
Anyway, for _KERNEL in the best case you'll have to do absolutely zero
work, since you can use the kernel module (on x86). In the worst case
you're on non-x86 and need to build a separate rump lib (i.e. create a
Makefile) and add some functionality. However, it's getting more and more
unlikely that you'll run into any missing functionality. Pretty much
everything from _KERNEL that I want supported in rump, apart from some
code dealing with e.g. struct proc, is there.
IOW, I don't expect any problems. But if you run into some, feel free
to contact me on or off-list.
The other part is of course the rump_somefs userland utility.
Rather than me explaining it here poorly, I encourage you to look at the
code in src/usr.sbin/puffs. The utilities are really short (unless you
look at lfs, in which case it's like 50 lines). When the "generic fs mounting"
project that Arnaud did for his gsoc last year is integrated, this will
be even further simplified.
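To give a sense of just how short those utilities are, here is a hedged sketch in roughly the same shape. The p2k_run_fs name does exist in libp2k, but the signature below is simplified and the function is stubbed so the sketch compiles stand-alone; rump_somefs itself and the argument handling are hypothetical, so treat this as the idea rather than the real code in src/usr.sbin/puffs:

```c
/*
 * Sketch of a rump_somefs-style utility, after the pattern of the
 * sources in src/usr.sbin/puffs.  The real p2k_run_fs() lives in
 * libp2k and also takes mount arguments and puffs flags; it is
 * stubbed here only so the sketch is self-contained.
 */
#include <stdio.h>
#include <stdlib.h>

static int
p2k_run_fs(const char *vfsname, const char *devpath, const char *mountpath)
{
	/*
	 * The real version mounts the file system inside a rump kernel
	 * and serves it to the host through puffs until unmount.
	 */
	printf("p2k: mount -t %s %s %s\n", vfsname, devpath, mountpath);
	return 0;
}

int
rump_somefs_main(int argc, char *argv[])
{
	if (argc != 3) {
		fprintf(stderr, "usage: rump_somefs device mountpoint\n");
		return EXIT_FAILURE;
	}
	/* The real utility would parse -o options into mount args here. */
	return p2k_run_fs("somefs", argv[1], argv[2]);
}
```

The point is that the utility is little more than argument parsing plus one library call; all the file system logic stays in the (rump) kernel code.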
> ZFS is known for being invasive, and reproduce functionalities you could
> find in other layers, like raid(4). Do such things affect rump?
I haven't looked at the last word (*), but I can't imagine it being
fundamentally different than for example lfs, smbfs or puffs, which all
are supported. (yes, that's not a typo. you can run puffs in userspace
on top of puffs with a puffs (or fuse) server running on top of that.
now top that).
But, going on a slight tangent, raid (and cgd etc.) is an interesting
case, since configuring it requires a somewhat complex utility which
alters the state of the kernel. So, if you have a very short-lived and
simple kernel, you still need to do the same state-alteration if you
want it to access data behind e.g. raid or cgd. I've been thinking for a
while about adding something like rc(8) support to rump, but haven't done
anything about it so far. And by "for a while" I mean about a year now,
since that's approximately when I added raidframe support.
And flying off even further on the tangent's tangent (which, I guess,
is just a tangent at least until the mathematicians prove me wrong), in
some cases the separation of state between the host and the rump kernel
requires a bit of trickery. A good example is mounting a file system
from within a disk image. Normally you would specify the partition by
the appropriate file name, e.g. wd0*f*. But if all you have on the host
is disk.img, you can't do that. You could vnconfig on the host, but
that sucks for two reasons: 1) requires privileges 2) does not support
sparse files. So there is magic path support, e.g. disk.img%DISKLABEL:f%
refers to the "f" disklabel partition on the image and giving that to
rump_somefs as the path will mount the f partition. Btw, due to "2"
rump is still the only sensible way of writing file systems on sparse
files (unless someone considers giving the image to qemu, booting a whole
system, mounting the fs there, and then using things like scp for file
transfer).
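To make the magic-path idea concrete, here is a small parser for the %DISKLABEL:x% suffix. The disk.img%DISKLABEL:f% syntax is from the text above, but this parser is purely an illustrative sketch, not NetBSD's actual implementation:

```c
/*
 * Illustrative sketch: split a magic path such as
 * "disk.img%DISKLABEL:f%" into the image file name and the
 * disklabel partition letter.  Not the real NetBSD parser.
 */
#include <stdio.h>
#include <string.h>

/* Returns 0 and fills image/part on success, -1 if no magic suffix. */
static int
split_magic_path(const char *path, char *image, size_t ilen, char *part)
{
	const char *m = strstr(path, "%DISKLABEL:");
	size_t plen;

	/* Need a partition letter and a closing '%' after the keyword. */
	if (m == NULL || m[11] == '\0' || m[12] != '%')
		return -1;
	plen = (size_t)(m - path);
	if (plen + 1 > ilen)
		return -1;
	memcpy(image, path, plen);
	image[plen] = '\0';
	*part = m[11];
	return 0;
}
```

With that, "disk.img%DISKLABEL:f%" splits into the image "disk.img" and partition 'f', which a rump_somefs-style utility could then use to locate the f partition inside the image without any vnconfig on the host.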
I guess I ran out of tangents right about here. Anyway, the point I was
coming to is that maybe zfs with its self-contained all-in-one approach
is marginally better suited for rump than other models. But I don't
expect any practical difference in the end either way.
> >Well, maybe. That does make some sense if you already are capable
> >of hosting domU's. I never found Xen very convenient for my use case
> >(occasional testing) because it needs a special dom0 kernel. Why isn't
> >that stuff in GENERIC again? IIRC there were some device support issues
> >years ago, but do they still remain?
> At least for x86, key parts of MD code are handled differently,
> especially pmap(9), locore's boot stuff, and some bus_space code. So you
> have #ifdef/inline thingies, which are, as you have noticed, not that
> well modular-friendly.
> Making this MD part dynamic would need some clean up in x86+xen, and
> probably introduce function pointers to make the thing more dynamic and
> "stable", like Linux and the big paravirt_ops.
> dom0 vs domU is another story, it is basically a reduced version of
> dom0, cleaned from all drivers and code that are not required for a
> domU. May become unnecessary when kernel becomes more and more modular:
> -rwxr-xr-x 1 root wheel 11M Apr 28 09:21 netbsd_XEN3PAE_DOM0
> -rwxr-xr-x 1 root wheel 4,1M Apr 28 09:21 netbsd_XEN3PAE_DOMU
I don't care about domU differences that much (in my use case), it's the
dom0 which is the showstopper. Frankly, I'd love to be able to run my
anita testing with xen instead of qemu, since unaccelerated qemu tends
to be quite slow (and my laptop doesn't support VT-x anyway).
*) unless we're talking about the shaken cocktail with equal parts gin,
maraschino, chartreuse and lime juice.