tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: assertion "(l->l_pflag & LP_INTR) == 0" failed



> Date: Tue, 4 Oct 2022 14:08:18 +0000
> From: Emmanuel Dreyfus <manu%netbsd.org@localhost>
> 
> I follow up here after getting no success at
> http://mail-index.netbsd.org/port-xen/2022/10/03/msg010258.html
> 
> Applying the change below to netbsd-9 branch would work around
> the problem. Any opinion?
> [...]
> +#ifndef XENPV
>                 KASSERT((l->l_pflag & LP_INTR) == 0);
> +#endif
>                 if (cpu_intr_p()) {
>                         return;
>                 }

This patch can't be right -- there's nothing specific to Xen about
this logic.

This happen because xbd_handler tries to free something which it got
with VM_NOSLEEP.  In interrupt context, either

(a) that's forbidden, in which case we should pull up xbd_xenbus.c
    rev. 1.111 (and possibly subsequent related changes) where
    jdolecek@ preallocated the buffer; or

(b) that's allowed, in which case this assertion is wrong: it will be
    hit whenever we have a page table page to free as a consequence of
    unmapping the buffer from kva -- or possibly a consequence of
    _something else_ being unmapped in some random code that the
    interrupt handler happened to interrupt.

Just pulling up xbd_xenbus.c rev. 1.111 (and subsequent changes) might
be enough to work around this, but there may also be a larger class of
latent problems lurking here.

Although it's better for interrupt handlers not to deal with
allocating and deallocating memory (and touching the pmap), there's so
many drivers out there that do this (with KM_NOSLEEP or PR_NOSLEEP or
BUS_DMA_NOWAIT) that I don't think it's reasonable to insist on (a),
so let's suppose it's (b).

The assertion was introduced when yamt@ convertd it from a conditional
in pmap.c rev. 116 in 2011, with commit message `assertions':

        if (l->l_md.md_gc_ptp != NULL) {
-               if (cpu_intr_p() || (l->l_pflag & LP_INTR) != 0) {
+               KASSERT((l->l_pflag & LP_INTR) == 0);
+               if (cpu_intr_p()) {
                        return;

yamt@ also added the same assertion to pmap_freepage:

-       VM_PAGE_TO_PP(ptp)->pp_link = curlwp->l_md.md_gc_ptp;
-       curlwp->l_md.md_gc_ptp = ptp;
+       l = curlwp;
+       KASSERT((l->l_pflag & LP_INTR) == 0);
+       VM_PAGE_TO_PP(ptp)->pp_link = l->l_md.md_gc_ptp;
+       l->l_md.md_gc_ptp = ptp;

Note that (l->l_pflag & LP_INTR) != 0 is better phraseed as
cpu_softintr_p().

If the idea is that hard _and_ soft interrupts are forbidden from
doing pmap_remove, then the assertion is fine -- although it should
also be KASSERT(!cpu_intr_p()).

But if hard interrupt handlers are allowed to do pmap_remove (via,
e.g., kmem_intr_free or pool_put or pool_cache_put), then the
assertions are just wrong for two reasons:

1. Whatever is allowed in hard interrupt context must also be allowed
   in soft interrupt context.

2. The hard interrupt might have interrupted a soft interrupt handler,
   in which case cpu_softintr_p() will be true, because l is a softint
   lwp, so (l->l_pflag & LP_INTR) != 0.  So we can't rule this case
   out -- we couldn't rule it out even if for some reason
   kmem_intr_free were forbidden in soft interrupts but allowed in
   hard interrupts.

So I think the assertions should be removed from pmap_freepage and
pmap_update, unless someone can convince me that netbsd-9 is ready to
never have a path to pmap_remove and pmap_update in hard interrupt
context, using only a handful of minor driver patches at most.


(All of this has become moot in HEAD because the pmap logic was
substantially reworked and the assertions are no longer there.) 


Home | Main Index | Thread Index | Old Index