Subject: port-alpha/18045: missing PMAP_TLB_SHOOTDOWN in pmap_page_protect()
To: None <gnats-bugs@gnats.netbsd.org>
From: None <mhitch@netbsd.org>
List: netbsd-bugs
Date: 08/22/2002 21:00:14
>Number:         18045
>Category:       port-alpha
>Synopsis:       When pmap_page_protect() invalidates a TLB entry, it doesn't do the corresponding tlb_shootdown
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    port-alpha-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 22 20:01:00 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Michael L. Hitch
>Release:        NetBSD 1.6_RC1 and -current
>Organization:
Montana State University, Bozeman, MT     USA
>Environment:
System: NetBSD netbsd2.msu.montana.edu 1.6_RC1 NetBSD 1.6_RC1 (CS20.MP) #1: Fri Aug 16 15:56:07 MDT 2002 mhitch@netbsd2.msu.montana.edu:/usr/NetBSD-1.6/obj/alpha/.alpha/sys/arch/alpha/compile/CS20.MP alpha
Architecture: alpha
Machine: alpha
>Description:
	When pmap_page_protect() lowers the permission on a page, it updates the
	pte on each mapping of that page, and invalidates the corresponding TLB
	entry.  In the MUTLIPROCESSOR case, the PMAP_TLB_SHOOTDOWN() call is missing,
	so the TLB entries on other cpus will not be done.
>How-To-Repeat:
	Run a multi-processor alpha system and get some apparent UBC page mapping
	errors (page contents don't match disk contents).  Start looking at pmap.c
	to see if there's any obvious errors anywhere for MP configurations.
	Examine pmap.c and note that every case of PMAP_INVALIDATE_TLB()
	is followed by a PMAP_TLB_SHOOTDOWN(), except for the one in
	pmap_page_protect().
>Fix:
The following patch adds the missing code:

Index: pmap.c
===================================================================
RCS file: /cvsroot/syssrc/sys/arch/alpha/alpha/pmap.c,v
retrieving revision 1.191
diff -u -r1.191 pmap.c
--- pmap.c	2002/03/08 20:48:28	1.191
+++ pmap.c	2002/08/23 02:38:24
@@ -1496,6 +1496,8 @@
 				PMAP_INVALIDATE_TLB(pv->pv_pmap, pv->pv_va,
 				    pmap_pte_asm(pv->pv_pte),
 				    PMAP_ISACTIVE(pv->pv_pmap, cpu_id), cpu_id);
+				PMAP_TLB_SHOOTDOWN(pv->pv_pmap, pv->pv_va,
+				    pmap_pte_asm(pv->pv_pte));
 			}
 			PMAP_UNLOCK(pv->pv_pmap);
 		}

An alternative mentioned by Jason Thorpe would be to merge the PMAP_TLB_SHOOTDOWN()
code into the PMAP_INVALIDATE_TLB() macro.

I've been running the above patch for a while, and have not noticed any problems
with UBC page contents differing from the disk data.  The problem isn't easily
reproducable though, so I can't be sure the above patch fixes the problem I was
seeing.  I was able to build a kernel from a clean directory using the above patch,
but two successive attempts resulted in a hard system hang when linking the kernel.
A third attempt to build the kernel using a MP kernel prior to the above patch also
resulted in the same hang.  The hang is probably a different problem.
>Release-Note:
>Audit-Trail:
>Unformatted: