NetBSD-Bugs archive


Re: kern/59412: uvmpdpol_pagerealize() queue index out of bound



On Fri, May 09, 2025 at 03:40:00PM +0000, Manuel.Bouyer%lip6.fr@localhost wrote:
> >Number:         59412
> >Category:       kern
> >Synopsis:       uvmpdpol_pagerealize() queue index out of bound
> >Confidential:   no
> >Severity:       serious
> >Priority:       medium
> >Responsible:    kern-bug-people
> >State:          open
> >Class:          sw-bug
> >Submitter-Id:   net
> >Arrival-Date:   Fri May 09 15:40:00 +0000 2025
> >Originator:     Manuel Bouyer
> >Release:        NetBSD 10.1_STABLE
> >Organization:
> 	LIP6
> >Environment:
> System: NetBSD ftp.lip6.fr 10.1_STABLE NetBSD 10.1_STABLE (FTP10) #8: Wed May  7 13:24:58 CEST 2025  bouyer%armandeche.soc.lip6.fr@localhost:/local/armandeche1/tmp/build/amd64/obj/local/armandeche2/netbsd-10/src/sys/arch/amd64/compile/FTP10 amd64
> Architecture: x86_64
> Machine: amd64
> >Description:
> 	On this heavily-loaded web server I got this panic:
> login: [ 74250.5367339] uvm_fault(0xffff8d781f092b50, 0xffff8d78b6230000, 2) ->e
> [ 74250.5515087] fatal page faultfatal page fault in supervisor mode
> [ 74250.5592053] trap type 6 code 0x2 rip 0xffffffff8066f407 cs 0x8 rflags 0x100
> [ 74250.5776047] curlwp 0xffff8d782835d8c0 pid 0.4 lowest kstack 0xffffc01b20d80
> kernel: page fault trap, code=0
> Stopped in pid 0.4 (system) at  netbsd:uvmpdpol_pagerealize+0x3d:       movq
> %r12,0(%rdx,%rax,8)
> uvmpdpol_pagerealize() at netbsd:uvmpdpol_pagerealize+0x3d
> uvm_aio_aiodone_pages() at netbsd:uvm_aio_aiodone_pages+0x1a6
> uvm_aio_aiodone() at netbsd:uvm_aio_aiodone+0xb9
> dkiodone() at netbsd:dkiodone+0xb9
> biointr() at netbsd:biointr+0x61
> softint_dispatch() at netbsd:softint_dispatch+0x11c
> DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0xffffc01b20d8b0f0
> Xsoftintr() at netbsd:Xsoftintr+0x4c
> --- interrupt ---
> 0:
> ds          80
> es          1080
> fs          2523
> gs          c95f
> rdi         ffffc000002ad980
> rsi         0
> rbp         ffffc01b20d8ae70
> rbx         ffffffff80e416c0    boot_cpu.2
> rdx         ffff8d70b6231000
> rcx         ffff8d782835d8c0
> rax         fffffffe
> r8          300000000000000
> r9          8
> r10         80
> r11         ffffffffffff
> r12         ffffc000002ad980
> r13         0
> r14         ffff8d7371b09500
> r15         0
> rip         ffffffff8066f407    uvmpdpol_pagerealize+0x3d
> cs          8
> rflags      10282
> rsp         ffffc01b20d8ae50
> ss          10
> netbsd:uvmpdpol_pagerealize+0x3d:       movq    %r12,0(%rdx,%rax,8)
> 
> 	gdb shows that this movq would be:
>         ucpu->pdq[--(ucpu->pdqhead)] = pg;
> 
> 	and at this point, pdqhead is 0xfffffffe
> 
> 	This server uses a mix of old-style partitions (where biodone()
> 	is called from hard interrupt context) and wedges (where biodone()
> 	is called from soft interrupt context).
> 	My guess is that the soft interrupt thread may have been
> 	hard interrupted between the ucpu->pdqhead == 0 check and the actual
> 	use of ucpu->pdqhead. 
> 
> >How-To-Repeat:
> 	If my guess is correct, this needs a heavily-loaded server with
> 	a setup where biodone() is called from both software and hardware
> 	interrupt context on the same CPU (such as a RAID controller
> 	with both old-style partitions and wedges)
> >Fix:
> 	running uvmpdpol_pagerealize() at splvm() seems to fix the issue for me,
> 	as below:

Actually, biodone() should defer the work to a soft interrupt when called from
hard interrupt context. But it still looks like we need to protect against
being interrupted by a soft interrupt while working on pdqhead;
splsoftbio() is needed for that.
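
To make the suspected window concrete, here is a minimal userland sketch of
the race. This is hypothetical code: only the pdqhead name and the
check-then-decrement shape are taken from the trace and the gdb output above;
the initial value, the interrupting_pagerealize() helper and the assumption
that pdqhead is an unsigned index (as the 0xfffffffe in the register dump
suggests) are made up for illustration.

#include <stdio.h>

static unsigned int pdqhead = 1;	/* one free slot left in the per-CPU queue */

static void
interrupting_pagerealize(void)
{
	/* stands in for another uvmpdpol_pagerealize() call running
	 * from (soft) interrupt context on the same CPU */
	if (pdqhead == 0)
		return;			/* would flush the queue here */
	--pdqhead;			/* takes the last free slot */
}

int
main(void)
{
	/* interrupted context: the pdqhead == 0 check passes... */
	if (pdqhead != 0) {
		/* ...then the interrupt fires before the decrement... */
		interrupting_pagerealize();
		/* ...and the resumed decrement wraps the unsigned index,
		 * turning the queue store into a wild write */
		--pdqhead;
	}
	printf("pdqhead = %#x\n", pdqhead);	/* prints 0xffffffff */
	return 0;
}

The patch below keeps the existing kpreempt_disable() and additionally raises
to splsoftbio() across the check and the store, so this interleaving can no
longer happen underneath uvmpdpol_pagerealize().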

updated patch:

Index: sys/spl.h
===================================================================
RCS file: /cvsroot/src/sys/sys/spl.h,v
retrieving revision 1.10
diff -u -p -u -r1.10 spl.h
--- sys/spl.h	2 Nov 2021 11:26:05 -0000	1.10
+++ sys/spl.h	12 May 2025 17:33:29 -0000
@@ -43,6 +43,9 @@
 	spl##x(void) \
 	{ return splraiseipl(makeiplcookie(IPL_##X)); }
 
+#if defined(IPL_SOFTBIO)
+_SPL_DECL(softbio, SOFTBIO)
+#endif /* defined(IPL_SOFTBIO) */
 #if defined(IPL_SOFTCLOCK)
 _SPL_DECL(softclock, SOFTCLOCK)
 #endif /* defined(IPL_SOFTCLOCK) */
Index: uvm/uvm_pdpolicy_clock.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_pdpolicy_clock.c,v
retrieving revision 1.40
diff -u -p -u -r1.40 uvm_pdpolicy_clock.c
--- uvm/uvm_pdpolicy_clock.c	12 Apr 2022 20:27:56 -0000	1.40
+++ uvm/uvm_pdpolicy_clock.c	12 May 2025 17:33:29 -0000
@@ -719,6 +719,7 @@ uvmpdpol_pagerealize_locked(struct vm_pa
  * updates flushed to the global queues.  this routine may block, and
  * so can switch cpu.  the idea is to empty to queue on whatever cpu
  * we finally end up on.
+ * Must be called at splsoftbio()
  */
 static struct uvm_cpu *
 uvmpdpol_flush(void)
@@ -770,16 +771,19 @@ void
 uvmpdpol_pagerealize(struct vm_page *pg)
 {
 	struct uvm_cpu *ucpu;
+	int s;
 
 	/*
 	 * drain the per per-CPU queue if full, then enter the page.
 	 */
 	kpreempt_disable();
+	s = splsoftbio();
 	ucpu = curcpu()->ci_data.cpu_uvm;
-	if (__predict_false(ucpu->pdqhead == 0)) {
+	while (__predict_false(ucpu->pdqhead == 0)) {
 		ucpu = uvmpdpol_flush();
 	}
 	ucpu->pdq[--(ucpu->pdqhead)] = pg;
+	splx(s);
 	kpreempt_enable();
 }
 
@@ -792,6 +796,7 @@ uvmpdpol_idle(struct uvm_cpu *ucpu)
 {
 	struct uvmpdpol_globalstate *s = &pdpol_state;
 	struct vm_page *pg;
+	int s_spl;
 
 	KASSERT(kpreempt_disabled());
 
@@ -817,6 +822,7 @@ uvmpdpol_idle(struct uvm_cpu *ucpu)
 	 * check for a pending resched: in that case exit immediately.
 	 */
 	if (mutex_tryenter(&s->lock)) {
+		s_spl = splsoftbio();
 		while (ucpu->pdqhead != ucpu->pdqtail) {
 			pg = ucpu->pdq[ucpu->pdqhead];
 			if (!mutex_tryenter(&pg->interlock)) {
@@ -833,6 +839,7 @@ uvmpdpol_idle(struct uvm_cpu *ucpu)
 		if (ucpu->pdqhead == ucpu->pdqtail) {
 			ucpu->pdqtime = getticks();
 		}
+		splx(s_spl);
 		mutex_exit(&s->lock);
 	}
 }
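
For reference, the _SPL_DECL(softbio, SOFTBIO) line added to sys/spl.h should,
going by the macro body visible in the hunk above, expand to roughly the
following (the storage class and any attributes come from the part of the
macro not shown in the diff context, so treat this as a sketch):

/* raises the current IPL to IPL_SOFTBIO and returns the previous
 * level, to be restored later with splx() */
static __inline int
splsoftbio(void)
{
	return splraiseipl(makeiplcookie(IPL_SOFTBIO));
}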


-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference