Subject: Re: Servicing Multiple (nested) TLB Misses
To: Toru Nishimura <locore32@gaea.ocn.ne.jp>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: port-mips
Date: 12/07/2002 21:15:00

On Sun, Dec 08, 2002 at 01:24:07PM +0900, Toru Nishimura wrote:

 > LEAF_NOPROFILE(mips1_UTLBMiss)
 >         mfc0    k1, MIPS_COP_0_TLB_CONTEXT
 >         mfc0    k0, MIPS_COP_0_EXC_PC
 >         lw      k1, 0(k1)               # possible KTLBmiss here
 >         nop                             # N.B. k0 saved the original EPC
 >         mtc0    k1, MIPS_COP_0_TLB_LOW
 >         nop
 >         tlbwr
 >         jr      k0
 >         rfe
 >         .globl  _C_LABEL(mips1_UTLBMissEnd)
 > _C_LABEL(mips1_UTLBMissEnd):
 >         END(mips1_UTLBMiss)
 > 
 > The double fault may happen at the 3rd instruction above.
 > 
 > LEAF_NOPROFILE(mips1_exception)
 >         mfc0    k1, MIPS_COP_0_CAUSE
 >         nop
 >         and     k1, MIPS1_CR_EXC_CODE
 >         sub     k1, T_TLB_LD_MISS << 2  # anticipating UTLBmiss
 >         bnez    k1, 1f
 >         la      k1, _C_LABEL(mips1_GXCPT)
 >         jr      k1                      # preserve k0 for load miss trap
 >         nop
 > 1:      mfc0    k1, MIPS_COP_0_CAUSE

...it's kind of a bummer that you have to fetch CAUSE twice for all
non-TLB-miss exceptions now...

 >         la      k0, mips1_xcptsw
 >         and     k1, MIPS1_CR_EXC_CODE
 >         add     k1, k1, k0
 >         lw      k1, 0(k1)               # dispatch
 >         nop
 >         jr      k1
 >         nop
 >         .globl  _C_LABEL(mips1_exceptionEnd)
 > _C_LABEL(mips1_exceptionEnd):
 >         END(mips1_exception)
 > 
 > The double faulting special case is handled through common
 > trap() code.
 > 
 > void
 > trap(status, cause, opc, frame)
 >         unsigned status;
 >         unsigned cause;
 >         vaddr_t opc;
 >         struct frame *frame;
 > {
 >         ...
 >         case T_TLB_LD_MISS:
 >                 /* layout linear PTE in VPT and AVPT for TLB refill */
 >                 if (pdei(vaddr) == 1018 || pdei(vaddr) == 1019) {
 >                         pt_entry_t **pdp, *ptp;
 > 
 >                         pdp = curpcb->pcb_pmap->pm_pdir;
 >                         /* loopback to myself or refer to the other one */
 >                         pdp = (pt_entry_t **)pdp[pdei(vaddr)];
 >                         /* take KSEG0 address of PT page */
 >                         ptp = pdp[ptei(vaddr)];
 >                         if (ptp == NULL) {
 >                                 /* hit 4MB desert hole, masquerade PG_NV */
 >                                 MIPS_TLBWR(vaddr, desertpte);
 >                         }
 >                         else {
 >                                 /* map the PT page in VPT/AVPT space */
 >                                 pte = PG_V | PG_D;
 >                                 if (ptei(vaddr) >= 768)
 >                                         pte |= PG_G;
 >                                 pte |= MIPS_KSEG0_TO_PTE(ptp);
 >                                 MIPS_TLBWR(vaddr, pte);
 >                         }
 >                         /* detour lw fault during UTLBmiss; MIPS1 only */
 >                         if (cpu_arch == CPU_ARCH_MIPS1
 >                              && opc == (0x80000000+sizeof(int)*2))
 >                                 frame->f_pc = 0x80000000+sizeof(int)*7;

So, I see that you write the TLB entry for the PT page... but then you
skip the "tlbwr" for the original UTLBMiss fault ... so, I guess it
restarts the instruction, faults again, and now the PT page will be
in the TLB ... okay, that makes sense, but the double fault is kind of
unfortunate.  It would be nice to instrument how often the double-fault
case happens.
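
A minimal sketch of what such instrumentation could look like, in plain C
so it can be tried outside the kernel.  Everything here is an assumption
made for illustration: the counter names, struct frame_sketch, and the
helper count_vpt_ld_miss() are hypothetical, and the cpu_arch check from
the quoted code is left out.  The two PC constants are the ones the quoted
trap() code uses (0x80000000 plus 2 and 7 instruction slots).

```c
#include <stdint.h>

/* Assumed layout, taken from the quoted handler: the UTLB miss vector sits
 * at 0x80000000 and every instruction is 4 bytes wide. */
#define UTLB_VECTOR	UINT32_C(0x80000000)
#define UTLB_LW_PC	(UTLB_VECTOR + 4u * 2)	/* lw k1, 0(k1) */
#define UTLB_JR_PC	(UTLB_VECTOR + 4u * 7)	/* jr k0 */

/* Hypothetical stand-in for the kernel's struct frame. */
struct frame_sketch {
	uint32_t f_pc;
};

/* Hypothetical counters; a real kernel would use whatever counter
 * mechanism it already provides rather than bare globals. */
static unsigned long vpt_ld_misses;		/* all load misses in VPT/AVPT */
static unsigned long utlb_double_faults;	/* those taken inside UTLBMiss */

/*
 * Meant to be called from the T_TLB_LD_MISS VPT/AVPT path once the PT
 * page has been entered into the TLB.  Returns 1 if the miss was the
 * nested (double-fault) case, 0 otherwise.
 */
static int
count_vpt_ld_miss(uint32_t opc, struct frame_sketch *frame)
{
	vpt_ld_misses++;
	if (opc != UTLB_LW_PC)
		return 0;
	utlb_double_faults++;
	frame->f_pc = UTLB_JR_PC;	/* skip tlbwr; the access is retried */
	return 1;
}
```

The ratio of utlb_double_faults to the total number of UTLB misses (which
would need its own counter in the refill handler, at the cost of extra
instructions there) is the figure of interest.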

 >                         return;
 >                 }
 >                 /* FALLTHRU */
 >         case T_TLB_ST_MISS:
 >                 /* TLB refill for kernel space; MIPS1 only */
 >                 ...
 > 
 > It'd be necessary to draw a clear picture to explain how the linear PTE
 > is arranged in those address ranges (1018 * 4MB or 1019 * 4MB).
 > In short, it's done just the way NetBSD/i386 or NetBSD/pc532
 > "fools" its MMU.

This is precisely how the Alpha works, as well (except that the Alpha uses
a 3-level table rather than a 2-level one).  The Alpha actually has a
software-managed TLB, like MIPS, but the TLB miss handlers all live in
PALcode.
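
For anyone who wants the arithmetic of the linear mapping spelled out,
here is a small sketch.  It assumes 4KB pages, 4-byte PTEs and 1024
entries per level (which is what the 4MB slots and the 1018/1019 indices
imply); pdei() and ptei() are written out from those assumptions rather
than copied from any header, and VPT_BASE and vpt_pte_addr() are
hypothetical names.

```c
#include <stdint.h>

/* Assumed geometry: 4KB pages, 4-byte PTEs, 1024 entries per level. */
#define PG_SHIFT	12
#define PD_SHIFT	22
#define PT_MASK		UINT32_C(0x3ff)

#define VPT_SLOT	UINT32_C(1018)			/* from the quoted code */
#define VPT_BASE	(VPT_SLOT << PD_SHIFT)		/* 1018 * 4MB */

/* Index of the 4MB slot (page directory entry) covering va. */
static uint32_t
pdei(uint32_t va)
{
	return va >> PD_SHIFT;
}

/* Index of the page within that 4MB slot. */
static uint32_t
ptei(uint32_t va)
{
	return (va >> PG_SHIFT) & PT_MASK;
}

/* Virtual address, inside the VPT window, of the PTE that maps va. */
static uint32_t
vpt_pte_addr(uint32_t va)
{
	return VPT_BASE + (va >> PG_SHIFT) * 4;
}
```

With these definitions the address of any PTE falls in slot 1018, and
ptei() of that PTE address equals pdei() of the address being translated.
That appears to be why the quoted trap() code indexes the directory with
ptei(vaddr), and why it sets PG_G when that index is 768 or above, since
768 * 4MB is 0xC0000000.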

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>