Subject: port-m68k/2547: wrong bus error detection code for 68020/030
To: None <gnats-bugs@NetBSD.ORG>
From: Ignatios Souvatzis <ignatios@cosinus.cs.uni-bonn.de>
List: netbsd-bugs
Date: 06/13/1996 21:50:50
>Number:         2547
>Category:       port-m68k
>Synopsis:       wrong bus error detection code for 68020/030
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    gnats-admin (GNATS administrator)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jun 13 16:05:02 1996
>Last-Modified:
>Originator:     Ignatios Souvatzis
>Organization:
computer science department, university of Bonn, Germany
>Release:        1.2_ALPHA
>Environment:
	
System: NetBSD cosinus.cs.uni-bonn.de 1.2_ALPHA NetBSD 1.2_ALPHA (COSINUS) #10: Thu Jun 13 20:07:19 MET DST 1996 ignatios@cosinus.cs.uni-bonn.de:/usr/src/sys/arch/i386/compile/COSINUS i386


>Description:
	
There is a serious bug in the 68020+68851 / 68030 branch of the buserr 
handler on all our m68k ports:

For these CPUs, you have to use the ptest instruction to search the MMU
tables in order to decide whether the fault is a real bus error or just a
page fault or write protection violation.

Our old code always assumed a user space access when calling ptest,

	ptestr #1,a0@,#7

and erroneously assumed that the BUSERR bit in the ptest output
(the mmusr register, written psr in locore.s),

	pmove psr,sp@
	btst #7,sp@

is the only indication of a bus error that needs to be checked.

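In fact, we have to follow a multistage decision tree that also looks at
the invalid and write protect bits of the ptest result and at the
read/write state recorded in the ssw.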
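
The decision tree implemented by the patch in the Fix section can be
modelled in C roughly as follows. This is only a sketch for illustration:
classify_fault, the enum and the macro names are hypothetical and do not
exist in the kernel, and the bit positions are the 68030 mmusr and ssw
layouts assumed here (B = bit 15, W = bit 11, I = bit 10; RM = bit 7,
RW = bit 6).

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum fault_kind { FAULT_MMU, FAULT_BUSERR };

#define MMUSR_B 0x8000u	/* bus error during table search */
#define MMUSR_W 0x0800u	/* page is write protected */
#define MMUSR_I 0x0400u	/* invalid (incl. limit violation and berr) */
#define SSW_RM  0x0080u	/* read-modify-write cycle */
#define SSW_RW  0x0040u	/* set for a read cycle */

static enum fault_kind
classify_fault(uint16_t mmusr, uint16_t ssw)
{
	if (mmusr & MMUSR_I) {
		/* Invalid: a real bus error only if the table walk itself failed. */
		return (mmusr & MMUSR_B) ? FAULT_BUSERR : FAULT_MMU;
	}
	if (!(mmusr & MMUSR_W)) {
		/* Valid and writable mapping: the MMU has nothing to complain about. */
		return FAULT_BUSERR;
	}
	if ((ssw & (SSW_RM | SSW_RW)) == SSW_RW) {
		/* Plain read of a write protected page cannot be a WP violation. */
		return FAULT_BUSERR;
	}
	return FAULT_MMU;	/* write or read-modify-write to a protected page */
}
```

The old code corresponds to testing MMUSR_B alone, so it takes the
FAULT_MMU path for every fault in which the table search itself succeeds.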

Our old code will never correctly detect a bus error on memory other than
the MMU tables, and it will incorrectly report access faults by the kernel
to the higher layer.
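
The choice of function code that the patch in the Fix section hands to
ptestr, instead of the fixed user space value #1, can be sketched in C like
this. Again this is a model under assumptions, not kernel code: ptest_fc and
the macros are hypothetical names, the ssw layout is the 68030 one (DF =
bit 8, function code in bits 2-0), and a1 in the patch is assumed to point
at the status register saved in the exception frame.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define SSW_DF 0x0100u	/* fault happened on a data cycle */
#define SSW_FC 0x0007u	/* function code of the faulting data cycle */
#define SR_S   0x2000u	/* supervisor bit of the saved status register */

static unsigned
ptest_fc(uint16_t ssw, uint16_t sr)
{
	if (ssw & SSW_DF) {
		/* Data fault: the ssw already records the function code. */
		return ssw & SSW_FC;
	}
	/*
	 * Instruction fetch: the patch does not separate data/program
	 * space, so only user (1) versus supervisor (5) is chosen.
	 */
	return (sr & SR_S) ? 5u : 1u;
}
```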

>How-To-Repeat:
	Try to help Niklas Hallqvist diagnose what turned out to be
a real bus error.

>Fix:
This patch should apply to most of our m68k ports; the others will need
hand-editing.


*** locore.s	Thu Jun 13 21:19:33 1996
--- /home/theory/ignatios/locore.s	Thu Jun 13 20:32:11 1996
***************
*** 205,216 ****
  	cmpw	#12,d0			| address error vector?
  	jeq	Lisaerr			| yes, go to it
  	movl	d1,a0			| fault address
! 	ptestr	#1,a0@,#7		| do a table search
  	pmove	psr,sp@			| save result
! 	btst	#7,sp@			| bus error bit set?
! 	jeq	Lismerr			| no, must be MMU fault
! 	clrw	sp@			| yes, re-clear pad word
! 	jra	Lisberr			| and process as normal bus error
  Lismerr:
  	movl	#T_MMUFLT,sp@-		| show that we are an MMU fault
  	jra	Ltrapnstkadj		| and deal with it
--- 205,234 ----
  	cmpw	#12,d0			| address error vector?
  	jeq	Lisaerr			| yes, go to it
  	movl	d1,a0			| fault address
! 	movl	sp@,d0			| function code from ssw
! 	btst	#8,d0			| data fault?
! 	jne	Lbe10a
! 	movql	#1,d0			| user program access FC
! 					| (we don't separate data/program)
! 	btst	#5,a1@			| supervisor mode?
! 	jeq	Lbe10a			| if no, done
! 	movql	#5,d0			| else supervisor program access
! Lbe10a:
! 	ptestr	d0,a0@,#7		| do a table search
  	pmove	psr,sp@			| save result
! 	movb	sp@,d1
! 	btst	#2,d1			| invalid (incl. limit viol. and berr)?
! 	jeq	Lmightnotbemerr		| no -> wp check
! 	btst	#7,d1			| is it MMU table berr?
! 	jeq	Lismerr			| no, must be fast
! 	jra	Lisberr1		| real bus err needs not be fast.
! Lmightnotbemerr:
! 	btst	#3,d1			| write protect bit set?
! 	jeq	Lisberr1		| no: must be bus error
! 	movl	sp@,d0			| ssw into low word of d0
! 	andw	#0xc0,d0		| Write protect is set on page:
! 	cmpw	#0x40,d0		| was it read cycle?
! 	jeq	Lisberr1		| yes, was not WPE, must be bus err
  Lismerr:
  	movl	#T_MMUFLT,sp@-		| show that we are an MMU fault
  	jra	Ltrapnstkadj		| and deal with it
***************
*** 217,222 ****
--- 235,242 ----
  Lisaerr:
  	movl	#T_ADDRERR,sp@-		| mark address error
  	jra	Ltrapnstkadj		| and deal with it
+ Lisberr1:
+ 	clrw	sp@			| re-clear pad word
  Lisberr:
  	movl	#T_BUSERR,sp@-		| mark bus error
  Ltrapnstkadj:
>Audit-Trail:
>Unformatted: