NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/55658: rumpvfs:t_etfs triggers bug in ufs_balloc_range() for 16KB page



>Number:         55658
>Category:       kern
>Synopsis:       rumpvfs:t_etfs triggers bug in ufs_balloc_range() for 16KB page
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Sep 14 08:55:00 +0000 2020
>Originator:     Rin Okuyama
>Release:        9.99.72
>Organization:
Department of Physics, Meiji University
>Environment:
NetBSD obs266 9.99.72 NetBSD 9.99.72 (OBS266) #385: Mon Sep 14 17:37:35 JST 2020  rin@latipes:/sys/arch/evbppc/compile/OBS266 evbppc
>Description:
On powerpc/ibm4xx (16KB page), tests/rump/rumpvfs/t_etfs causes kernel
freeze where it falls into infinite recursion in uvm_fault_internal()
called by DSI (data storage interruption) *write* handler:

----
# cd /usr/tests/rump/rumpvfs && atf-run t_etfs
...
tc-se:1+0 records in
tc-se:1+0 records out
tc-se:1 bytes transferred in 0.-45 secs (0 bytes/sec)
(freeze here)
~Stopped in pid 297.274 (t_etfs) at      netbsd:comintr+0x890:   stw     r23, 0x6
c(r25)
db> bt
0x06c7b590: at intr_deliver+0x7c
0x06c7b5c0: at pic_handle_intr+0xf8
0x06c7b610: at cpu_lwp_bootstrap+0x4d4
0x06c7b6e0: at genfs_getpages+0x92c
0x06c7b7e0: at VOP_GETPAGES+0x58
0x06c7b820: at uvn_get+0x8c
0x06c7b860: at ubc_fault+0x1a8
0x06c7b8e0: at uvm_fault_internal+0x4dc
0x06c7ba30: at trap+0x338
0x06c7bae0: kernel DSI write trap @ 0x84204ffc by memcpy+0x84: srr1=0x8030
            r1=0x6c7bbb0 cr=0x40222434 xer=0 ctr=0x400 dear=0x84204ffc esr=0x800
000 pid=0x1
0x06c7bbb0: at main+0x790
0x06c7bbd0: at copyin+0x1a4
0x06c7bc50: at copyin_vmspace+0x4c
0x06c7bc90: at uiomove+0x188
0x06c7bcd0: at ubc_uiomove+0x1b8
0x06c7bd40: at ffs_write+0x67c
0x06c7bdd0: at VOP_WRITE+0x3c
0x06c7be00: at vn_write+0x140
0x06c7be30: at dofilewrite+0x8c
0x06c7be80: at sys_pwrite+0xdc
0x06c7beb0: at syscall+0x2e4
0x06c7bf20: user SC trap #174 by 0xfda0cd4c: srr1=0xc030
            r1=0xfacdbf40 cr=0x28002482 xer=0x20000000 ctr=0xfda0cd44 esr=0x800000 pid=0x42
----

This turned out to be a bug in ufs_balloc_range() for machines with
16KB (and above) page, introduced by this commit:

http://mail-index.netbsd.org/source-changes/2005/07/17/0011.html
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/ufs/ufs/ufs_inode.c#rev1.50

> --- src/sys/ufs/ufs/ufs_inode.c	2005/07/10 01:08:52	1.49
> +++ src/sys/ufs/ufs/ufs_inode.c	2005/07/17 09:13:35	1.50
> ...
> @@ -273,9 +273,11 @@ ufs_balloc_range(struct vnode *vp, off_t
>  
>  	simple_lock(&uobj->vmobjlock);
>  	for (i = 0; i < npages; i++) {
> -		pgs[i]->flags &= ~PG_RDONLY;
>  		if (error) {
>  			pgs[i]->flags |= PG_RELEASED;
> +		} else if (off <= pagestart + (i << PAGE_SHIFT) &&
> +		    pagestart + ((i + 1) << PAGE_SHIFT) <= off + len) {
> +			pgs[i]->flags &= ~PG_RDONLY;
>  		}
>  	}
>  	if (error) {

Comment (added later) explains this code as:

>         /*
>          * if the allocation succeeded, mark all the pages dirty
>          * and clear PG_RDONLY on any pages that are now fully backed
>          * by disk blocks. ...
>          */

By inserting debug printf:

----
--- ufs_inode.c	5 Sep 2020 16:30:13 -0000	1.112
+++ ufs_inode.c	14 Sep 2020 08:26:12 -0000
@@ -275,8 +275,22 @@ ufs_balloc_range(struct vnode *vp, off_t
 		if (!error) {
 			if (off <= pagestart + (i << PAGE_SHIFT) &&
 			    pagestart + ((i + 1) << PAGE_SHIFT) <= eob) {
+#if 1 /* XXX */
+
+printf("%s: OK: pa %#jx off %#jx eob %#jx\n",
+    __func__, (intmax_t)pgs[i]->phys_addr, (intmax_t)off, (intmax_t)eob);
+
+#endif
 				pgs[i]->flags &= ~PG_RDONLY;
 			}
+			else {
+#if 1 /* XXX */
+
+printf("%s: NG: pa %#jx off %#jx eob %#jx\n",
+    __func__, (intmax_t)pgs[i]->phys_addr, (intmax_t)off, (intmax_t)eob);
+
+#endif
+			}
 			uvm_pagemarkdirty(pgs[i], UVM_PAGE_STATUS_DIRTY);
 		}
 		uvm_pagelock(pgs[i]);
----

ufs_balloc_range() called for 8KB region:

----
# cd /usr/tests/rump/rumpvfs && atf-run t_etfs
...
tc-se:1+0 records in
tc-se:1+0 records out
tc-se:1 bytes transferred in 0.-45 secs (0 bytes/sec)
ufs_balloc_range: NG: pa 0x6c6c000 off 0x60000000000 eob 0x60000002000
                         ^^^^^^^^^                                ^^^^
----

For machines with page size <= 8KB, this changes a target page writable
(confirmed on other ports with 4 and 8KB pages with debug printf above).

However, for machines with 16KB page, the condition above will never be
satisfied. As a result, uvm_fault_internal() will be falling into
infinite recursion, trying to turn that page writable. This ends up with
kernel freeze with TLB filled up with invalided entry for that page:

(continue from previous)
----
Stopped in pid 297.274 (t_etfs) at      netbsd:comintr+0x890:   stw     r23, 0x6
c(r25)
db> machine tlb
tlb  0  PID   1 EPN 0x00000000 16MB   RPN 0x00000000  ZONE  0N  EX WR
tlb  1  PID   1 EPN 0x08000000 16MB   RPN 0xef000000  ZONE  0N     WR  I G
tlb* 2  PID   1 EPN 0x84204000 16kB   RPN 0x06c6c000  ZONE  0N
                                          ^^^^^^^^^^               ^^ (RO)
tlb* 3  PID   1 EPN 0x84204000 16kB   RPN 0x06c6c000  ZONE  0N
tlb* 4  PID   1 EPN 0x84204000 16kB   RPN 0x06c6c000  ZONE  0N
...
----
(* means invalid entry)
>How-To-Repeat:
On powerpc/ibm4xx (our only port with 16KB page),

# cd /usr/tests/rump/rumpvfs && atf-run t_etfs
>Fix:
I'm not 100% sure whether this change is correct, but with this patch,
that test does no longer cause kernel freeze on powerpc/ibm4xx, and no
regression has been observed over a month for ibm4xx and other ports:

http://www.netbsd.org/~rin/ufs_inode_20200914.patch
----
Index: ufs_inode.c
===================================================================
RCS file: /home/netbsd/src/sys/ufs/ufs/ufs_inode.c,v
retrieving revision 1.112
diff -p -u -r1.112 ufs_inode.c
--- ufs_inode.c	5 Sep 2020 16:30:13 -0000	1.112
+++ ufs_inode.c	14 Sep 2020 08:03:05 -0000
@@ -260,7 +260,7 @@ ufs_balloc_range(struct vnode *vp, off_t
 
 	/*
 	 * if the allocation succeeded, mark all the pages dirty
-	 * and clear PG_RDONLY on any pages that are now fully backed
+	 * and clear PG_RDONLY on any pages that are now backed
 	 * by disk blocks.  if the allocation failed, we do not invalidate
 	 * the pages since they might have already existed and been dirty,
 	 * in which case we need to keep them around.  if we created the pages,
@@ -274,7 +274,7 @@ ufs_balloc_range(struct vnode *vp, off_t
 		KASSERT((pgs[i]->flags & PG_RELEASED) == 0);
 		if (!error) {
 			if (off <= pagestart + (i << PAGE_SHIFT) &&
-			    pagestart + ((i + 1) << PAGE_SHIFT) <= eob) {
+			    pagestart + (i << PAGE_SHIFT) <= eob) {
 				pgs[i]->flags &= ~PG_RDONLY;
 			}
 			uvm_pagemarkdirty(pgs[i], UVM_PAGE_STATUS_DIRTY);
----



Home | Main Index | Thread Index | Old Index