tech-kern archive


Re: CVS commit: src/tests/net/icmp

[trimmed cc list a bit]

On Sun Jul 11 2010 at 20:52:32 +0200, Jean-Yves Migeon wrote:
> On 11.07.2010 17:46, Antti Kantee wrote:
> > "perform"?  Are you using that term for execution speed, or was it
> > accidentally bundled with the rest of the paragraph?
> execution speed (could be incorrect wording, I am no native speaker)

Why are you worried about execution speed?  My hypothesis is that it's not
going to be slower and without benchmarks to show otherwise I'd not worry
about it.  The main difference is that instead of switching to dom0, you
switch to a part of dom0.

*If* there is need for interaction between the different partitions
of dom0, it should be done by replication instead of a ping-pong mesh
of requests.  Yes, everything should be implemented as distributed
by default, as I've talked about e.g. in my AsiaBSDCon and EuroBSDCon
2008 presentations.

> >> I think he was referring to using a rump kernel as a "syscall proxy
> >> server" rather than having in-kernel virtualization like jails/zones.
> >>
> >> That would make sense, you already have proxy-like feature with rump.
> > 
> > I'm not so sure.  That would require a lot of "kernel help" to make
> > everything work correctly.
> What kernel? rump or "host"?


> Per see, most of the syscalls would go to the proxy, only "privileged"
> operations like memory allocations/device multiplexing would need
> special handling by the host kernel.

It's not that simple if you want to run largely unmodified kernel code
against arbitrary processes.  Consider e.g. code that wants to call some
pmap() operation.  By that point you're so deep into execution that the
call to pmap itself does not convey any semantic information, and you
can't remedy the situation by doing a host kernel request from the
syscall server.

Luckily, the server doesn't have to exist at the syscall layer; it can
exist at the subsystem layer.  We already have examples like puffs and
pud.  They work around the above problem by using a better-defined
semantic layer.  Even so, there are some little things that don't quite
work correctly.  For example, rump_nfs does not properly do the
"modified" tracking that the nfs client does by pmap_protecting pages
and marking them as modified in the fault handler.  Yeah, fixing that
has been on my todo-list for quite a while, but nobody has complained
and there are more pressing matters ....  (I had support for something
like that in puffs in 2006, but since there were no users back then it
kinda bitrotted away.)

Anyway, the solution as usual is to work the problem from both ends
(improve the server methods and the kernel drivers) and perform a
meet-in-the-middle attack at the sweet spot where nothing is lost and
everything is gained.  The cool thing about working on NetBSD is that
we can actually do these things properly instead of bolting some hacks
on top of a black-magic-box we're not allowed to touch.

> > The first example is mmap: you run into it
> > pretty fast when you start work on a syscall server ;)
> > 
> > That's not to say there is not synergy.  For example, a jail networking
> > stack virtualized this way would avoid having to go over all the code, and
> > "reboot" would be as simple as kill $serverpid.  Plus, more obviously,
> > it would not require every jail to share the same code, i.e. you can
> > have text optimized in various ways for various applications.
> You also gain the advantage about resource control, as the proxy kernel
> is, by itself, a process. Buggy kernel code would only crash the server
> also, without putting too much of the host kernel at risk.
> However, this design is very close to the one I envisioned with Xen and
> "multiple small dom0's": with Xen, you may consider the "proxy server"
> as the domU kernel, and the application running within the domain is the
> jailed one. The difference being that the containers are handled just
> like any other process, whereas for Xen, they are domains.

Although I'm not familiar with the Xen hypercall interface, I assume it
to be infinitely better defined than unix process<->kernel interaction,
with no funny bits like fiddling about here and there just because the
kernel can get away with it.

> The jails/containers approach is more lightweight, you just have one
> instance of the kernel; IMHO, they could be compared to chroot, with
> many, many improvements. Each solution has its advantages/inconvenients.

Is it now?  In both cases you can have 1 copy of the kernel text (ok,
O(1) copies) and n copies of the *relevant* data (ok, and the process
overhead).  For non-academic measurements where you're interested in
application scenarios instead of pathological microbenchmarks, I'd
expect there to be ~0 difference in "lightweightness".

Anyway, they're completely different and I don't see the point of
comparing them.  I was just trying to point out one possible strategy
for *implementing* the necessary support bits for jails/zones.
