Port-xen archive


Re: call for testing: xen 4.1 packages



On 01.04.11 00:25, Thor Lancelot Simon wrote:
> On Fri, Apr 01, 2011 at 12:17:15AM +0200, Christoph Egger wrote:
>> On 01.04.11 00:12, Thor Lancelot Simon wrote:
>>> On Thu, Mar 31, 2011 at 11:16:39PM +0200, Manuel Bouyer wrote:
>>>>
>>>> As i understood it, backend drivers have moved to userland (in qemu-dm)
>>>> even for PV guests.
>>>
>>> Oof!  Doesn't this cause a quadruple context-switch for every I/O?
>>
>> I don't know. Can you show me some numbers?
> 
> I cannot show you measured performance numbers, no.  I would hope the Xen
> team could!
> 
> However, that does not mean the question can't be analyzed a priori.
> 
> If in fact all the backend drivers have moved to userland, to do one I/O
> from an application in a PV guest, before the situation was:
> 
>       * context-switch to PV guest kernel !
>       * context-switch to hypervisor
>       * context-switch to dom0 kernel
>       * context-switch to hypervisor
>       * context-switch to PV kernel
>       * context-switch to guest application !
> 
> I have marked the context switches which seem to me likely expensive as
> hardware virtualization support can't help save/restore state with '!'.
> I count 6 context switches with the old way.  If I understand the new way,
> it's:
> 
>       * context-switch to PV guest kernel !
>       * context-switch to hypervisor
>       * context-switch to dom0 kernel
>       * context-switch to dom0 qemu-dm !
>       * context-switch to dom0 kernel !
>       * context-switch to hypervisor
>       * context-switch to PV kernel
>       * context-switch to guest application !
> 
> So, 8 context switches, with both of the extra 2 being of the kind I suspect
> are more expensive.

That may actually be the case on a single-cpu, single-core machine.

On today's SMP machines Xen schedules the dom0 and the domUs onto
separate physical cpus, so they can run concurrently instead of
strictly context-switching back and forth.

The context-switch between a pv (dom0/domU) kernel and pv (dom0/domU)
userland is indeed expensive. When userland makes a system call,
libc uses the syscall instruction.

That traps into the hypervisor, and the hypervisor switches into the
dom0 kernel. When the dom0 kernel is done it executes sysret, which
again lands in the hypervisor, and the hypervisor switches back to
userland.

Since both the pv (dom0/domU) kernel and its userland run in ring 3,
this can be optimized in libc by autodetecting whether the kernel is
running native or on xen and, in the latter case, using a call-gate
instead of the syscall instruction.

A call-gate is slightly more expensive than a function call.

Christoph

