Port-xen archive


Re: proposal: stop using the xen_system_time timecounter in dom0



>>>>> On Fri, 21 Jun 2024 19:58:04 +0000, "Mathew, Cherry G." <c%bow.st@localhost> said:

>>>>> On Fri, 21 Jun 2024 12:38:13 -0700, "Greg A. Woods" <woods%planix.ca@localhost> said:
>> At Fri, 21 Jun 2024 08:59:01 +0200, Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
>> Subject: Re: proposal:  stop using the xen_system_time timecounter in dom0
>>> 
>>> it is perfectly reliable for me. In my case I'm not sure (really)
>>> that other sources would be better.

>> I've shown clearly that it is not always reliable without (what would
>> otherwise be unnecessary) workarounds (dom0_vcpus_pin=true in my case,
>> and perhaps even with that workaround it is still not 100% reliable).

> A bit of argument from authority here (I did the initial Xen MP port) -
> it might be a useful policy decision to just pin all vCPUs on dom0 to
> their corresponding underlying physical cpus - the probe mechanism (from
> my memory of more than a decade ago when this was done) is that the
> number of vCPUs == the number of pCPUs - so I doubt there is any benefit
> in shuffling around vCPUs across pCPUs - perhaps quite the opposite.

> It shouldn't have many performance implications, and potentially has
> performance benefits, since:

> 1) All interrupts are still processed from BSP (unless things have changed
>    and our apic routing code has got cleverer) - so we'd not need to
>    xcall/IPI as much as we do now.

> 2) The described symptom seems to be from a hypervisor assumption that
>    dom0 access to "local" CPU resources would be somehow sticky.

> 3) The dom0/platform interface (e.g. apic MSR based logic) was poorly
>    abstracted (not our fault - blame Xen) - and perhaps this has changed
>    in more recent versions of the hypervisor with new privileged domain
>    abstractions such as "driver domains" - but I don't think we use any
>    of that stuff yet.

> Should be a quick one-liner patch in the bootstrap MP code.

> The cat has jumped out of my bag though - so I'll leave it to someone
> else to bell the cat this time around - or not.


In case anyone cares, here's a test patch. I haven't compile-tested it.
It turned out to be slightly more than a one-liner because, well, Xen
APIs suck.



diff -r 1376eff6ff1c sys/arch/xen/xen/xen_clock.c
--- a/sys/arch/xen/xen/xen_clock.c      Sun Jun 23 00:53:48 2024 +0000
+++ b/sys/arch/xen/xen/xen_clock.c      Sun Jun 23 08:03:57 2024 +0000
@@ -929,6 +929,17 @@
        xen_resumeclocks(ci);

 #ifdef DOM0OPS
+#if __XEN_INTERFACE_VERSION__ >= 0x00030201
+
+       struct sched_pin_override pin_override = {
+               .pcpu = curcpu()->ci_cpuid,
+       };
+
+       return _hypercall2(int, sched_op, SCHEDOP_pin_override, &pin_override);
+#else
+       return _hypercall2(int, sched_op, SCHEDOP_pin_override, SHUTDOWN_suspend);
+#endif
+
        /*
         * If this is a privileged dom0, start pushing the wall
         * clock time back to the Xen hypervisor.
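
For anyone who wants to poke at the argument layout without booting a
hypervisor, here is a minimal userland sketch of the 1:1 pinning idea
(each vCPU asks to be pinned to the pCPU with the same index). Everything
prefixed sketch_ or SKETCH_ is a hypothetical stand-in, not the NetBSD or
Xen code: the struct is assumed to mirror struct sched_pin_override from
Xen's public sched.h, the command value 7 is assumed to match
SCHEDOP_pin_override, and the hypercall is replaced by a mock that only
records its arguments. Check all of these against the real headers before
relying on them.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Assumed value of SCHEDOP_pin_override; verify against public/sched.h. */
#define SKETCH_SCHEDOP_PIN_OVERRIDE 7

/* Stand-in for struct sched_pin_override (assumed to hold a single pcpu). */
struct sketch_pin_override {
	int32_t pcpu;	/* pCPU to pin to; negative is assumed to undo it */
};

/* Hypothetical record of the last mock "hypercall" made. */
struct sketch_call {
	int cmd;
	int32_t pcpu;
	int count;
};

static struct sketch_call sketch_last;

/* Mock of a two-argument sched_op hypercall: records args, returns 0. */
static int
sketch_sched_op(int cmd, void *arg)
{
	const struct sketch_pin_override *po = arg;

	sketch_last.cmd = cmd;
	sketch_last.pcpu = po->pcpu;
	sketch_last.count++;
	return 0;
}

/* Ask for the calling vCPU to be pinned to the pCPU of the same index. */
static int
sketch_pin_self(uint32_t cpuid)
{
	struct sketch_pin_override po = {
		.pcpu = (int32_t)cpuid,
	};

	/* A negative pcpu would mean "undo", which is not what we want here. */
	assert(po.pcpu >= 0);
	return sketch_sched_op(SKETCH_SCHEDOP_PIN_OVERRIDE, &po);
}
```

Calling sketch_pin_self(3) leaves cmd set to SKETCH_SCHEDOP_PIN_OVERRIDE
and pcpu set to 3 in sketch_last. In a real kernel the mock would be the
actual sched_op hypercall, and the return value would need checking
rather than being returned straight out of the clock init path.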

-- 
Math/(~cherry)

