Subject: Re: mips3 problem on todays kernel?
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Michael L. Hitch <mhitch@lightning.oscs.montana.edu>
List: port-mips
Date: 01/27/1999 19:50:44
On Wed, 27 Jan 1999, Jonathan Stone wrote:

> 
> I'm seeing what looks like the same thing Michael Hitch mentioned last
> week. (trace below). What's the most recent source from which anyone's
> built a mips3 pmax kernel that actually booted?

  Yep - it's the same thing.  Here's a message I have been trying to get
finished:

The R4x00 support on DECstations appears to have been broken since
January 16.  I was able to boot a kernel built from the January 9
sources, but kernels built from sources dated January 16 or later fail.

The problem appears to be the TLB miss handling.  It fails in memset()
when uvm_anon_init() is clearing 'anon', and appears to occur when
memset() crosses a page boundary.  The failing virtual address doesn't
match what's in the register containing the destination address of the
store word.

It took me a while to figure out exactly how the TLB miss handler was
supposed to work, and several times I thought I saw something that
didn't seem right, but after closer examination of the code, it appeared
like it should be working.

Finally I started putting DDB breakpoints in the TLB miss handler to try
to figure out if it was even getting called, and exactly what was
getting executed.  [Thanks much to Jonathan for adding the capability of
entering DDB early in the kernel boot.]

What I found was that the TLB miss handler was taking the code path for
a USEG miss even though the fault occurred on a KSEG2 address.

After further examination of how the handler determines if the fault is
a kernel or user address, I think I finally determined what the problem
is.

_C_LABEL(mips3_TLBMiss):
        .set    noat
        dmfc0   k0, MIPS_COP_0_TLB_XCONTEXT
        addu    k1, $0, k0              # Pick up up sign of k0 for user/kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        bltz    k1, 4f

The dmfc0 loads the 64-bit contents of the 'xcontext' register into k0.
The upper 31 bits are the address of the segtab for the current user
process, the next 2 bits are the "region" field (bits 63-62 of the
virtual address), and the remaining low bits come from the VPN
(virtual page number) of the virtual address.  The addu appears to be
trying to get at bit 31 of k0, which is the low bit of the "region"
field [00 = user, 01 = supervisor, and 11 = kernel].  I think this addu
instruction is where things go wrong.
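
As a rough illustration of that layout (plain C, not kernel code), the
sketch below pulls the fields out of a 64-bit xcontext value.  The
function and macro names are hypothetical, and the bit positions are
simply the ones described above: 31 bits of segtab address in bits
63-33 and the 2-bit region field in bits 32-31.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the description in the text. */
#define XCTX_REGION_SHIFT  31      /* region field occupies bits 32..31 */
#define XCTX_REGION_MASK   0x3u
#define XCTX_PTEBASE_SHIFT 33      /* segtab bits occupy bits 63..33 */

/* Region field: 0 = user, 1 = supervisor, 3 = kernel. */
unsigned xcontext_region(uint64_t xcontext)
{
    return (unsigned)(xcontext >> XCTX_REGION_SHIFT) & XCTX_REGION_MASK;
}

/* Upper 31 bits: the segtab address bits for the current process. */
uint64_t xcontext_ptebase(uint64_t xcontext)
{
    return xcontext >> XCTX_PTEBASE_SHIFT;
}

/* Bit 31 alone, the bit the addu/bltz pair is trying to test. */
int xcontext_bit31(uint64_t xcontext)
{
    return (int)((xcontext >> 31) & 1);
}
```

For a KSEG2 miss with a zero segtab field, the value looks like
0x0000000180000000 plus VPN bits, so xcontext_region() returns 3 and
bit 31 is set, but bits 63-33 are still 0.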

For the ADDU instruction, one manual I have [MIPS R4000 Microprocessor
User's Manual, 2nd edition, by Joe Heinrich] states:
  "In 64-bit mode, the operands must be valid sign-extended, 32-bit values."

Another manual [IDT MIPS Microprocessor Family Software Reference Manual,
version 2] states:
  "On 64 bit processors, if either GPR 'rt' or GPR 'rs' do not contain
  sign-extended 32-bit values (bits 63..31 equal), the result of the
  operation is undefined."

The 64-bit value of k0 at this point does not contain a valid
sign-extended 32-bit value [bits 63-33 are 0, but bit 32 and bit 31 are
both 1 since it's a KSEG2 address].  Since that's "undefined" or "not
valid" for the ADDU instruction, the k1 contents won't be valid and the
test for a kernel/user TLB miss may fail [and does on my R4400].
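
The condition the manuals describe can be written down as a small
check.  This is only a sketch in plain C with a made-up function name;
it models the "bits 63..31 equal" rule and says nothing about what a
particular CPU actually does with an invalid operand.

```c
#include <stdint.h>

/*
 * Return 1 when bits 63..31 of v are all equal, i.e. v is a valid
 * sign-extended 32-bit value and is a legal ADDU operand in 64-bit
 * mode.  Return 0 otherwise.  (Hypothetical helper, for illustration.)
 */
int is_sign_extended_32(uint64_t v)
{
    uint64_t top = v >> 31;            /* bits 63..31, 33 bits wide */

    return top == 0 || top == 0x1ffffffffULL;
}
```

With this, is_sign_extended_32(0x0000000180000000ULL) is 0, which is
the xcontext case above (bits 63-33 clear, bits 32 and 31 set), while
a sign-extended KSEG2 address such as 0xffffffffc0000000 gives 1.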

The same thing is done in mips3_TLBInvalidException.

Here's how I fixed it:


--- /c/src/sys/arch/mips/mips/locore_mips3.S	Sat Jan 16 05:20:31 1999
+++ ./locore_mips3.S	Wed Jan 27 19:38:15 1999
@@ -141,15 +141,15 @@
 	.globl	_C_LABEL(mips3_TLBMiss)
 _C_LABEL(mips3_TLBMiss):
 	.set	noat
-	dmfc0	k0, MIPS_COP_0_TLB_XCONTEXT
-	addu	k1, $0, k0		# Pick up up sign of k0 for user/kernel
-	bltz	k1, 4f
-	srl	k1, k0, (SEGSHIFT - 2 - (PGSHIFT - 3))
-	andi	k1, k1, 0x7fc		# index of segment table
-	dsra	k0, k0, 32		# Tricky -- The lower bit is
+	dmfc0	k0, MIPS_COP_0_BAD_VADDR	# get the virtual address
+	dmfc0	k1, MIPS_COP_0_TLB_XCONTEXT
+	bltz	k0, 4f
+	srl	k0, k0, SEGSHIFT - 2	# compute segment table index
+	andi	k0, k0, 0x7fc		# index of segment table
+	dsra	k1, k1, 32		# Tricky -- The lower bit is
 					# actually part of KSU but we must
 					# be a user address
-	add	k1, k0, k1		
+	add	k1, k0, k1
 	lw	k1, 0(k1)		
 	dmfc0	k0, MIPS_COP_0_BAD_VADDR	# get the virtual address
 	beq	k1, zero, 5f			# invalid segment map
@@ -179,7 +179,7 @@
 	nop
 	eret
 4:
-	j	mips3_TLBMissException
+	j	_C_LABEL(mips3_TLBMissException)
 	nop
 5:
 	j	mips3_SlowFault
@@ -1030,11 +1030,9 @@
  */
 LEAF(mips3_TLBInvalidException)
 	.set	noat
-	dmfc0	k0, MIPS_COP_0_TLB_XCONTEXT
-	addu	k1, k0, 0			# Get User/Kernel mode bit
-	bgez	k1, _C_LABEL(mips3_KernGenException)	# full trap processing
 	dmfc0	k0, MIPS_COP_0_BAD_VADDR	# get the fault address
 	li	k1, VM_MIN_KERNEL_ADDRESS	# compute index
+	bgez	k0, _C_LABEL(mips3_KernGenException)	# full trap processing
 	subu	k0, k0, k1
 	lw	k1, _C_LABEL(Sysmapsize)	# index within range?
 	srl	k0, k0, PGSHIFT