Subject: port-powerpc/7240: Kernel page faults can cause premature signal delivery and
To: None <gnats-bugs@gnats.netbsd.org>
From: None <mbrinico@nc.com>
List: netbsd-bugs
Date: 03/25/1999 19:31:36
>Number:         7240
>Category:       port-powerpc
>Synopsis:       Kernel page faults can cause premature signal delivery and
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-powerpc-maintainer (NetBSD/powerpc Portmasters)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 25 19:35:01 1999
>Last-Modified:
>Originator:     Mark Brinicombe
>Organization:
Network Computer Inc
>Release:        NetBSD-current 1999/03/25
>Environment:
	
System: NetBSD p2.devlab.nc.com 1.3I-NCOS NetBSD 1.3I-NCOS (P2) #1: Wed Mar 17 16:37:17 PST 1999 mark@p2.devlab.nc.com:/usr/export/mark/NCOS/os-src/sys/arch/i386/compile/P2 i386


>Description:
	A bug in the powerpc trap handler can cause premature delivery of
	signals and calls to mi_switch() while in routines such as copyin()
	and copyout(), resulting in panics with trashed stack frames and
	other faults being taken with a pcb_onfault handler still set.
	The problem occurs when kernel-mode page faults happen.
	A successful kernel page fault (EXC_DSI trap) does a break rather
	than a return, thus falling through to the end of the switch statement
	and executing the same signal delivery and context switch code as
	for the (EXC_DSI|EXC_USER) trap etc. If the fault was triggered from
	within copyin(), copyout() etc., then the pcb_onfault handler will
	still be set during any signal delivery or context switch that could
	happen at this point, resulting in severe kernel lossage (typically a
	panic with a partially trashed stack frame).
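
	The control-flow problem described above can be sketched in
	isolation. What follows is a simplified user-space model, not the
	NetBSD trap handler: the names sim_state, trap_buggy, trap_fixed
	and user_return_tail are hypothetical stand-ins, and the tail
	function only counts how often the "signal delivery / context
	switch" step would run while an onfault handler is still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the trap types; not the NetBSD definitions. */
enum trap_type { TRAP_KERNEL_DSI, TRAP_USER_DSI };

struct sim_state {
	bool onfault_set;      /* models pcb_onfault being non-NULL */
	int  tail_runs;        /* times the post-switch tail executed */
	int  unsafe_tail_runs; /* tail executed while onfault_set was true */
};

/* Models the signal delivery / context switch code after the switch. */
static void user_return_tail(struct sim_state *s)
{
	s->tail_runs++;
	if (s->onfault_set)
		s->unsafe_tail_runs++;
}

/* Buggy shape: a successful kernel fault uses break and reaches the tail. */
void trap_buggy(struct sim_state *s, enum trap_type type, bool fault_ok)
{
	assert(s != NULL);
	switch (type) {
	case TRAP_KERNEL_DSI:
		if (fault_ok)
			break;  /* leaves the switch, then runs the tail */
		return;         /* failure handling elided in this model */
	case TRAP_USER_DSI:
		break;          /* user traps are meant to reach the tail */
	}
	user_return_tail(s);
}

/* Fixed shape: a successful kernel fault returns straight to the caller. */
void trap_fixed(struct sim_state *s, enum trap_type type, bool fault_ok)
{
	assert(s != NULL);
	switch (type) {
	case TRAP_KERNEL_DSI:
		if (fault_ok)
			return; /* skips the tail entirely */
		return;         /* failure handling elided in this model */
	case TRAP_USER_DSI:
		break;
	}
	user_return_tail(s);
}
```

	In this model, calling trap_buggy() with TRAP_KERNEL_DSI and
	onfault_set true bumps unsafe_tail_runs, while trap_fixed() leaves
	both counters untouched; user-mode traps reach the tail in both.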
	
>How-To-Repeat:
	Run a program that makes heavy use of signals and triggers copyin()
	and copyout() calls that fault. (First found while developing an
	X server.)
>Fix:
	The break statement after a successful uvm_fault() call should be
	replaced with a return statement so that the signal delivery et al.
	is not executed for page faults in the kernel.
	
*** trap.c.orig	Thu Mar 25 19:15:36 1999
--- trap.c	Thu Mar 25 19:26:37 1999
***************
*** 103,109 ****
  				ftype = VM_PROT_READ;
  			if (uvm_fault(map, trunc_page(va), 0, ftype)
  			    == KERN_SUCCESS)
! 				break;
  			if (fb = p->p_addr->u_pcb.pcb_onfault) {
  				frame->srr0 = (*fb)[0];
  				frame->fixreg[1] = (*fb)[1];
--- 103,109 ----
  				ftype = VM_PROT_READ;
  			if (uvm_fault(map, trunc_page(va), 0, ftype)
  			    == KERN_SUCCESS)
! 				return;
  			if (fb = p->p_addr->u_pcb.pcb_onfault) {
  				frame->srr0 = (*fb)[0];
  				frame->fixreg[1] = (*fb)[1];
>Audit-Trail:
>Unformatted:
context switching