Port-xen archive


Re: Xen timecounter issues



On Mon, Jun 24, 2024 at 09:22:54AM +0000, Mathew, Cherry G. wrote:
> >>>>> On Mon, 24 Jun 2024 10:48:31 +0200, Manuel Bouyer <bouyer%antioche.eu.org@localhost> said:
> 
> > On Mon, Jun 24, 2024 at 04:42:19AM -0400, Brad Spencer wrote:
> >> Manuel Bouyer <bouyer%antioche.eu.org@localhost> writes:
> >> 
> >> > On Sun, Jun 23, 2024 at 01:58:36PM +0000, Taylor R Campbell wrote:
> >> >> It came to my attention today that there has been a lot of discussion
> >> >> recently about timecounters on Xen:
> >> >> 
> >> >> https://mail-index.netbsd.org/port-xen/2024/02/01/msg010525.html
> >> >> https://mail-index.netbsd.org/port-xen/2024/06/20/msg010573.html
> >> >> 
> >> >> These threads are long and I wasn't following because I'm not
> >> >> subscribed to port-xen, but since I wrote xen_clock.c and I remember
> >> >> roughly how it works, my input might be helpful.  Is there a
> >> >> summary of the issues?
> >> >> 
> >> >> 1. Is there an array of the following variables?
> >> >> 
> >> >>    - dom0 kernel (netbsd-8, netbsd-9, netbsd-10, linux, freebsd, ...)
> >> >>    - domU kernel, if misbehaviour observed in domU (ditto)
> >> >>    - Xen kernel version
> >> >>    - virtualization type (pv/pvh/hvm/...)
> >> >>    - Xen TSC configuration
> >> >>    - physical CPU (and, whether the physical CPU has invariant TSC)
> >> >>    - misbehaviour summary
> >> >
> >> > AFAIK no. From what I understood the misbehavior is only seen in dom0.
> >> > All I can say is that I've run NetBSD Xen dom0 on various generations of
> >> > Intel CPUs (from P4 to quite recent Xeon) and I never had any issue with
> >> > timekeeping in dom0 (all my dom0s run ntpd).
> >> 
> >> Another factor might be the number of vcpus allocated to Domain-0.  I
> >> use only 1 and have no trouble with time keeping on two Intel i7/i8
> >> systems and one very old AMD Athlon II.  One of the other reporters is
> >> using more than one vcpu with Domain-0 and is having trouble with time
> >> keeping and has found that cpu pinning solves the problem.  I am also
> >> running Xen 4.15 and he is running 4.18 (I believe).
> 
> > I'm switching from 4.15 to 4.18, and with netbsd-10 I'm running
> > dom0 with all available CPUs (and I have done so on my test machine
> > running -current for some time now)
> 
> > I don't think the number of vCPUs is the factor here, as even with one vCPU
> > it's not pinned to a physical CPU.
> 
> At dom0 boot time, the Xen hypervisor's pinning logic is special-cased
> for dom0/PV (which is the only mode we use, last I checked): vcpu0 is
> pinned to the BSP.
> 
> This means that until we spin up the rest of the AP vCPUs, vcpu0 is
> pinned to the underlying BSP. Note that our probe logic only spins up
> the underlying number of pcpus - however, without pinning specified, the
> additional vCPUs are free to be scheduled onto other pCPUs - which is
> what probably triggered the tsc drift that Greg observed on !invariant
> TSC h/w - since a sequence of reads is not guaranteed to be made on the
> same underlying pCPU. This is a symptom of Xen's poor API abstraction
> for h/w resource sharing on dom0 - I think this was fixed later for
> newer modes, but I'm not up to date on that.
> 
> Here's the relevant code snippet:
> 
> xen/common/sched/core.c:sched_init_vcpu()
> 
> ...
>     else if ( pv_shim && v->vcpu_id == 0 )
>     {
>         /*
>          * PV-shim: vcpus are pinned 1:1. Initially only 1 cpu is online,
>          * others will be dealt with when onlining them. This avoids pinning
>          * a vcpu to a not yet online cpu here.
>          */
>         sched_set_affinity(unit, cpumask_of(0), cpumask_of(0));
>     }
>     else if ( d->domain_id == 0 && opt_dom0_vcpus_pin )
>     {
>         /*
>          * If dom0_vcpus_pin is specified, dom0 vCPUs are pinned 1:1 to
>          * their respective pCPUs too.
>          */
>         sched_set_affinity(unit, cpumask_of(processor), &cpumask_all);
>     }

But this doesn't say anything about CPU 0?
Indeed, with the default boot options vCPUs are not pinned (we already knew that).
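
To make the drift scenario from the quoted text concrete, here is a toy
model of what happens when successive TSC reads land on different pCPUs
whose TSCs are not synchronized. This is only a sketch, not Xen or NetBSD
code: struct pcpu_model, model_rdtsc() and count_backward_steps() are
hypothetical names, and the fixed per-pCPU offset is an assumption made
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: each pCPU's TSC differs from "true" time by a fixed offset. */
struct pcpu_model {
    int64_t tsc_offset; /* ticks ahead (+) or behind (-) of true time */
};

/* Simulated TSC read on pCPU `cpu` at true time `now`. */
uint64_t
model_rdtsc(const struct pcpu_model *cpus, size_t cpu, uint64_t now)
{
    return (uint64_t)((int64_t)now + cpus[cpu].tsc_offset);
}

/*
 * Take `nreads` samples, one true tick apart, starting at `start`.
 * sched[i] is the pCPU the vCPU happens to run on for the i-th read.
 * Returns how many samples were smaller than the sample before them,
 * i.e. how often the guest would see time step backward.
 */
unsigned
count_backward_steps(const struct pcpu_model *cpus, const size_t *sched,
    size_t nreads, uint64_t start)
{
    unsigned backward = 0;
    uint64_t prev = 0;

    assert(cpus != NULL && sched != NULL);

    for (size_t i = 0; i < nreads; i++) {
        uint64_t t = model_rdtsc(cpus, sched[i], start + i);

        if (i > 0 && t < prev)
            backward++;
        prev = t;
    }
    return backward;
}
```

In this model, a vCPU that stays on one pCPU never sees a backward step,
while a vCPU that alternates between a pCPU and one whose TSC lags behind
sees one on every hop to the lagging pCPU.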

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

