NetBSD-Bugs archive

Re: port-xen/58561 (panic: kernel diagnostic assertion "x86_read_psl() == 0" failed: file "/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581)



Hello,
sorry for taking so long to reply.


On Tue, May 13, 2025 at 03:30:15PM +0000, riastradh%NetBSD.org@localhost wrote:
> Synopsis: panic: kernel diagnostic assertion "x86_read_psl() == 0" failed: file "/home/netbsd/10/src/sys/arch/x86/x86/pmap.c", line 3581
> 
> Responsible-Changed-From-To: port-xen-maintainer->bouyer
> Responsible-Changed-By: riastradh%NetBSD.org@localhost
> Responsible-Changed-When: Tue, 13 May 2025 15:30:14 +0000
> Responsible-Changed-Why:
> bouyer, can you take a look?
> 
> +cc cherry, who added the assertion back in 2011 with the cherry-xenmp
> merge.
> 
> It's possible this is just some code path that does x86_disable_intr
> without a necessary x86_read/write_psl around it to save and restore
> the interrupt-disabled flag.  But I think we've only seen it on Xen
> so far (see also dup https://gnats.netbsd.org/57543), which might help
> to narrow it down.
> 
> This #ifndef XENPV x86_disable/enable_intr looks suspicious but I have
> only superficially skimmed it and I have no idea what's going on:
> 
> https://nxr.netbsd.org/xref/src/sys/arch/amd64/amd64/trap.c?r=1.129#554
> 
> (This bug has been biting mollari a lot lately, happened again today.)

I've seen this too, but only once in a (long) while. The last one
I found in my logs was last September.

I've no idea why the x86_disable/enable_intr in trap() is #ifndef XENPV.
This was added by ad@ in trap.c 1.46 (in Apr 2008) as part of the
kernel preemption work. AFAIK trap() is always called with events
enabled on Xen, so I can't see why Xen wouldn't need x86_disable_intr()
when bare metal needs it.
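
To make the hypothesis quoted above concrete, here is a minimal user-space
sketch, not kernel code. Every model_* name, leaky_section() and
balanced_section() are hypothetical stand-ins I made up for illustration;
the model assumes that a PSL value of 0 means "interrupts enabled (no events
masked)", as in the KASSERT comment. It only shows how a disable without a
matching restore on an early-return path would trip a later "psl == 0" check,
and how a read/write_psl save-and-restore pair avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state: 0 = interrupts enabled, nonzero = disabled. */
static unsigned long model_psl;

/* Stand-ins for x86_read_psl()/x86_write_psl(); not the kernel functions. */
static unsigned long model_read_psl(void) { return model_psl; }
static void model_write_psl(unsigned long psl) { model_psl = psl; }

/* Stand-ins for x86_disable_intr()/x86_enable_intr(). */
static void model_disable_intr(void) { model_psl = 1; }
static void model_enable_intr(void) { model_psl = 0; }

/* Models a later check of the same shape as the failing KASSERT. */
static void model_check(void) { assert(model_read_psl() == 0); }

/*
 * Disables interrupts and re-enables them only on the fall-through path.
 * If nothing restores the flag after the early return, the caller
 * continues with interrupts disabled.
 */
static void leaky_section(bool early_return)
{
	model_disable_intr();
	if (early_return)
		return;		/* flag stays "disabled" */
	model_enable_intr();
}

/* Saves the flag first and restores it on every exit path. */
static void balanced_section(bool early_return)
{
	unsigned long psl = model_read_psl();

	model_disable_intr();
	if (!early_return) {
		/* work that must run with interrupts disabled */
	}
	model_write_psl(psl);
}
```

In this model, calling model_check() after leaky_section(true) aborts, while
it passes after balanced_section(true). Whether the real trap() early return
is such a leak is not something this sketch can show.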

I'm now testing the attached patch; both amd64 and i386 domUs have
passed an anita run. I've installed it on the dom0 running the daily Xen
tests (https://www-soc.lip6.fr/~bouyer/NetBSD-tests/xen/);
it's in the middle of 2 anita runs. But I don't think I've ever seen
this KASSERT fire on this host.

If mollari is hitting this more often than I am, maybe it's
worth testing the patch there in a few days?

Index: sys/arch/amd64/amd64/trap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/amd64/amd64/trap.c,v
retrieving revision 1.128
diff -u -p -u -r1.128 trap.c
--- sys/arch/amd64/amd64/trap.c	5 Sep 2020 07:26:37 -0000	1.128
+++ sys/arch/amd64/amd64/trap.c	12 Jun 2025 14:58:17 -0000
@@ -514,6 +514,10 @@ pagefltcommon:
 			goto we_re_toast;
 		}
 #endif
+#ifdef XENPV
+		/* Check to see if interrupts are enabled (ie; no events are masked) */
+		KASSERT(x86_read_psl() == 0);
+#endif
 		/* Fault the original page in. */
 		onfault = pcb->pcb_onfault;
 		pcb->pcb_onfault = NULL;
@@ -552,17 +556,13 @@ pagefltcommon:
 				 * the copy functions, and so visible
 				 * to cpu_kpreempt_exit().
 				 */
-#ifndef XENPV
 				x86_disable_intr();
-#endif
 				l->l_nopreempt--;
 				if (l->l_nopreempt > 0 || !l->l_dopreempt ||
 				    pfail) {
 					return;
 				}
-#ifndef XENPV
 				x86_enable_intr();
-#endif
 				/*
 				 * If preemption fails for some reason,
 				 * don't retry it.  The conditions won't
Index: sys/arch/i386/i386/trap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/i386/i386/trap.c,v
retrieving revision 1.308
diff -u -p -u -r1.308 trap.c
--- sys/arch/i386/i386/trap.c	20 Aug 2022 23:48:50 -0000	1.308
+++ sys/arch/i386/i386/trap.c	12 Jun 2025 14:58:17 -0000
@@ -632,6 +632,10 @@ faultcommon:
 			goto we_re_toast;
 		}
 #endif
+#ifdef XENPV
+		/* Check to see if interrupts are enabled (ie; no events are masked) */
+		KASSERT(x86_read_psl() == 0);
+#endif
 		/* Fault the original page in. */
 		onfault = pcb->pcb_onfault;
 		pcb->pcb_onfault = NULL;
@@ -670,17 +674,13 @@ faultcommon:
 				 * the copy functions, and so visible
 				 * to cpu_kpreempt_exit().
 				 */
-#ifndef XENPV
 				x86_disable_intr();
-#endif
 				l->l_nopreempt--;
 				if (l->l_nopreempt > 0 || !l->l_dopreempt ||
 				    pfail) {
 					return;
 				}
-#ifndef XENPV
 				x86_enable_intr();
-#endif
 				/*
 				 * If preemption fails for some reason,
 				 * don't retry it.  The conditions won't
-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference