NetBSD-Bugs archive
Subject: Re: port-xen/45975: panic: HYPERVISOR_mmu_update failed, ret: -22 on -current MP domU (amd64)
Hi Manuel,
> I've made progress on this; I think I understood what's going on and
> I have a fix.
>
>
> The page is indeed still used as a page table; it's still in another CPU's
> ci_kpm_pdir. The reason is that xen_kpm_sync() is not working as expected,
> leading to races between CPUs.
> 1) the check (xpq_cpu != &x86_curcpu) is always false because we
> have different x86_curcpu symbols with different addresses in the kernel.
> Fortunately, all addresses disassemble to the same code.
Ok, this is my mess-up - I was looking for a way to not use %gs/%fs
before they are set up via cpu_init_msrs(). The reason it doesn't work,
as dh pointed out elsewhere, is that x86_curcpu() is defined in a
header as static inline.
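For the archives, the effect is easy to reproduce outside the kernel;
a stand-alone sketch (made-up names, not the real header) is:

/* curcpu_demo.h -- hypothetical stand-in for the real header */
#ifndef CURCPU_DEMO_H
#define CURCPU_DEMO_H

typedef int (*curcpu_fn_t)(void);

/*
 * Being "static inline", every .c file that includes this header and
 * takes the function's address gets its own private copy of it, with
 * its own address.
 */
static inline int
x86_curcpu_demo(void)
{
	return 0;
}

#endif /* CURCPU_DEMO_H */

/* a.c */
#include "curcpu_demo.h"

curcpu_fn_t
addr_in_a(void)
{
	return &x86_curcpu_demo;
}

/* b.c */
#include <stdio.h>
#include "curcpu_demo.h"

curcpu_fn_t addr_in_a(void);

int
main(void)
{
	/*
	 * The two addresses typically differ, so a test of the form
	 * (xpq_cpu != &x86_curcpu) compares against whichever copy the
	 * including file happened to get, not one kernel-wide symbol.
	 */
	printf("addresses %s\n",
	    addr_in_a() == &x86_curcpu_demo ? "match" : "differ");
	return 0;
}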
>
> 3) pmap->pm_cpus is not safe for the purpose of xen_kpm_sync(), which
> needs to know on which CPU a pmap is loaded *now*:
> pmap->pm_cpus is cleared before cpu_load_pmap() is called to switch
> to a new pmap, leaving a window where a pmap is still in a CPU's
> ci_kpm_pdir but not in pm_cpus. As a virtual CPU may be preempted
> by the hypervisor at any time, this window can be large enough to let
> another CPU free the PTP and reuse it as a normal page.
>
Makes sense.
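If I'm reading this right, it's the classic window where the "is loaded
on this CPU" advertisement is dropped before the reference is actually
gone. A toy userland analogy (pthreads, made-up names, obviously not
the pmap code):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

/* advertised ~ the pmap's bit in pm_cpus            */
/* referenced ~ the PTP still sitting in ci_kpm_pdir */
static atomic_int advertised = 1;
static atomic_int referenced = 1;

static void *
switcher(void *arg)
{
	/* pmap_load() path: the pm_cpus bit is dropped first ... */
	atomic_store(&advertised, 0);

	/* ... the vCPU may be preempted by the hypervisor here ... */
	usleep(100 * 1000);

	/* ... and only later does cpu_load_pmap() remove the old
	 * pmap's PTP from ci_kpm_pdir. */
	atomic_store(&referenced, 0);
	return NULL;
}

static void *
reclaimer(void *arg)
{
	/* Another CPU freeing the PTP consults only "advertised";
	 * the loop just times the demo so we land in the window. */
	while (atomic_load(&advertised))
		continue;
	if (atomic_load(&referenced))
		printf("window hit: would free a PTP that is still mapped\n");
	return NULL;
}

int
main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, reclaimer, NULL);
	pthread_create(&t2, NULL, switcher, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}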
>
> To fix 2) I chose to avoid cross-calls and IPIs completely, and instead
> use a mutex to update all CPUs' ci_kpm_pdir from the local CPU.
> It's safe because we just need to update the table page; a tlbflush IPI will
> happen later. As a side effect, we don't need different code for bootstrap.
>
>
> To fix 3), I introduced a pm_xen_ptp_cpus which is updated from
> cpu_load_pmap(), with the ci_kpm_mtx mutex held. Checking it with
> ci_kpm_mtx held avoids overwriting the wrong pmap's ci_kpm_pdir.
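Just to check that I'm reading this right: the cpu_load_pmap() side would
then look roughly like the sketch below? (My sketch, not the actual hunk;
I'm guessing at the field types and at cpu_load_pmap() seeing both the old
and the new pmap.)

/*
 * Sketch only: both pm_xen_ptp_cpus updates happen with ci_kpm_mtx
 * held, so they cannot race with the locked check in xen_kpm_sync().
 */
void
cpu_load_pmap(struct pmap *pmap, struct pmap *oldpmap)
{
	struct cpu_info *ci = curcpu();
	uint32_t cpumask = ci->ci_cpumask;

	mutex_enter(&ci->ci_kpm_mtx);

	/* ... the existing ci_kpm_pdir update for the new pmap ... */

	atomic_and_32(&oldpmap->pm_xen_ptp_cpus, ~cpumask);
	atomic_or_32(&pmap->pm_xen_ptp_cpus, cpumask);
	mutex_exit(&ci->ci_kpm_mtx);
}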
>
>
> While there I removed the unused pmap_is_active() function,
> and added some more details to the DIAGNOSTIC panics.
>
Can we have these broken down into separate patch sets?
>
> The attached patch implements this; it has been tested on a 4-CPU domU
> (on a physical dual-core box, so the race described in 2) is more likely to
> happen) and I couldn't trigger the panic any more (a build.sh -j8 release
> would never complete for me before).
>
...
> @@ -387,68 +360,30 @@ pmap_kpm_sync_xcall(void *arg1, void *ar
> void
> xen_kpm_sync(struct pmap *pmap, int index)
> {
> - uint64_t where;
> + CPU_INFO_ITERATOR cii;
> + struct cpu_info *ci;
>
>
> KASSERT(pmap != NULL);
>
>
> pmap_pte_flush();
>
>
> - if (__predict_false(xpq_cpu != &x86_curcpu)) { /* Too early to xcall */
> - CPU_INFO_ITERATOR cii;
> - struct cpu_info *ci;
> - int s = splvm();
> - for (CPU_INFO_FOREACH(cii, ci)) {
> - if (ci == NULL) {
> - continue;
> - }
> - if (pmap == pmap_kernel() ||
> - ci->ci_cpumask & pmap->pm_cpus) {
> - pmap_kpm_setpte(ci, pmap, index);
> - }
> + for (CPU_INFO_FOREACH(cii, ci)) {
> + if (ci == NULL) {
> + continue;
> }
> - pmap_pte_flush();
> - splx(s);
> - return;
> - }
> -
> - if (pmap == pmap_kernel()) {
> - where = xc_broadcast(XC_HIGHPRI,
> - pmap_kpm_sync_xcall, pmap, &index);
> - xc_wait(where);
> - } else {
> - KASSERT(mutex_owned(pmap->pm_lock));
> - KASSERT(kpreempt_disabled());
> -
> - CPU_INFO_ITERATOR cii;
> - struct cpu_info *ci;
> - for (CPU_INFO_FOREACH(cii, ci)) {
> - if (ci == NULL) {
> - continue;
> - }
> - while (ci->ci_cpumask & pmap->pm_cpus) {
> -#ifdef MULTIPROCESSOR
> -#define CPU_IS_CURCPU(ci) __predict_false((ci) == curcpu())
> -#else /* MULTIPROCESSOR */
> -#define CPU_IS_CURCPU(ci) __predict_true((ci) == curcpu())
> -#endif /* MULTIPROCESSOR */
> -#if 0 /* XXX: Race with remote pmap_load() */
> - if (ci->ci_want_pmapload &&
> - !CPU_IS_CURCPU(ci)) {
> - /*
> - * XXX: make this more cpu
> - * cycle friendly/co-operate
> - * with pmap_load()
> - */
> - continue;
> - }
> -#endif /* 0 */
> - where = xc_unicast(XC_HIGHPRI, pmap_kpm_sync_xcall,
> - pmap, &index, ci);
> - xc_wait(where);
> - break;
> - }
> + if (pmap != pmap_kernel() &&
> + (ci->ci_cpumask & pmap->pm_xen_ptp_cpus) == 0)
> + continue;
> +
> + /* take the lock and check again */
> + mutex_enter(&ci->ci_kpm_mtx);
> + if (pmap == pmap_kernel() ||
> + (ci->ci_cpumask & pmap->pm_xen_ptp_cpus) != 0) {
> + pmap_kpm_setpte(ci, pmap, index);
Isn't a TLB shootdown needed after this, to make sure the old pte is not
referred to in the TLB?
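i.e. something along these lines right after the pmap_kpm_setpte() in the
new loop (hand-waving: kpm_shootdown_slot() below is just a placeholder for
whatever the right invalidation interface turns out to be):

	/* take the lock and check again */
	mutex_enter(&ci->ci_kpm_mtx);
	if (pmap == pmap_kernel() ||
	    (ci->ci_cpumask & pmap->pm_xen_ptp_cpus) != 0) {
		pmap_kpm_setpte(ci, pmap, index);
		/*
		 * Placeholder: invalidate, on that CPU, any cached
		 * translation going through the old entry at "index",
		 * so the stale PTP cannot still be reachable via its TLB.
		 */
		kpm_shootdown_slot(ci, index);
	}
	mutex_exit(&ci->ci_kpm_mtx);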
Separately, please note that the TLB shootdown code ignores shootdown
requests when the pmap is being destroyed:
http://nxr.netbsd.org/xref/src/sys/arch/x86/x86/pmap_tlb.c#222
I have a hypothesis that stale TLB entries involving PTPs in the
ci_kpm_pdir[] may be responsible for the "other" bug that riz@ is seeing.
Cheers,
Cherry.