Subject: CP0 count register
To: None <port-mips@netbsd.org>
From: Matthew Luckie <mjl@luckie.org.nz>
List: port-mips
Date: 05/15/2007 12:18:36
I would like to measure the number of clock cycles a routine takes in
user-space on a MIPS CPU, so that I can gauge its performance as
accurately as possible.

It seems that the count register on CP0 is what I want to use, but that
it cannot be accessed from user-space unless the appropriate bit in
the CP0 status register is set to one.  That is:

mips_cp0_status_write(mips_cp0_status_read() | MIPS_SR_COP_0_BIT);
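
To make the user-space side concrete, here is a minimal sketch of what I
have in mind.  It assumes the kernel has already set the CP0-usable bit
for the process; without that, the mfc0 should trap.  The names
read_count(), count_delta() and measure_ticks() are made up for this
example, and the non-MIPS branch is only a stand-in so the sketch builds
elsewhere.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <time.h>

/*
 * Read a free-running 32-bit counter.  On MIPS this is CP0 register $9
 * (Count); the read is expected to fault in user mode unless the kernel
 * has made CP0 usable.  On other CPUs a clock-based stand-in is used.
 */
static inline uint32_t
read_count(void)
{
#if defined(__mips__)
	uint32_t count;

	__asm__ __volatile__("mfc0 %0, $9" : "=r" (count));
	return count;
#else
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint32_t)((uint64_t)ts.tv_sec * 1000000000u +
	    (uint64_t)ts.tv_nsec);
#endif
}

/*
 * Ticks elapsed between two samples.  Unsigned 32-bit subtraction gives
 * the right answer across a single wrap of the counter.
 */
static inline uint32_t
count_delta(uint32_t start, uint32_t end)
{
	return end - start;
}

/* Time one call of fn() in counter ticks (hypothetical helper). */
static inline uint32_t
measure_ticks(void (*fn)(void))
{
	uint32_t start = read_count();

	fn();
	return count_delta(start, read_count());
}
```

On some MIPS CPUs Count advances at half the pipeline clock rather than
once per cycle, so the tick count may need scaling; I have not checked
which applies to my CPU.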

Is it possible at all to access CP0 from user-space once the kernel has
been modified to allow it?  Documentation such as
http://6004.csail.mit.edu/6.371/handouts/mips6371.pdf suggests it is,
but that is all I have to work with, and I am not intimately familiar
with NetBSD on the MIPS CPU.

If it is, I was thinking of modifying sys/arch/mips/mips/vm_machdep.c 
cpu_lwp_fork() and cpu_setfunc() as follows:

--- vm_machdep.c.orig   2007-05-15 12:07:43.000000000 +1200
+++ vm_machdep.c        2007-05-15 12:14:01.000000000 +1200
@@ -171,7 +171,12 @@ cpu_lwp_fork(struct lwp *l1, struct lwp 
        pcb->pcb_context[1] = (intptr_t)arg;            /* S1 */
        pcb->pcb_context[8] = (intptr_t)f;              /* SP */
        pcb->pcb_context[10] = (intptr_t)proc_trampoline;       /* RA */
-       pcb->pcb_context[11] |= PSL_LOWIPL;             /* SR */
+
+       if(kauth_cred_getuid(l2->l_cred) == 0)
+               pcb->pcb_context[11] |= (PSL_LOWIPL | MIPS_SR_COP_0_BIT);
+       else
+               pcb->pcb_context[11] |= PSL_LOWIPL;             /* SR */
+
 #ifdef IPL_ICU_MASK
        pcb->pcb_ppl = 0;       /* machine dependent interrupt mask */
 #endif
@@ -195,7 +200,12 @@ cpu_setfunc(struct lwp *l, void (*func)(
        pcb->pcb_context[1] = (intptr_t)arg;                    /* S1 */
        pcb->pcb_context[8] = (intptr_t)f;                      /* SP */
        pcb->pcb_context[10] = (intptr_t)proc_trampoline;       /* RA */
-       pcb->pcb_context[11] |= PSL_LOWIPL;                     /* SR */
+
+       if(kauth_cred_getuid(l->l_cred) == 0)
+               pcb->pcb_context[11] |= (PSL_LOWIPL | MIPS_SR_COP_0_BIT);
+       else
+               pcb->pcb_context[11] |= PSL_LOWIPL;             /* SR */
+
 #ifdef IPL_ICU_MASK
        pcb->pcb_ppl = 0;       /* machine depenedend interrupt mask */
 #endif
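
Since both hunks make the same decision, it could also live in one
helper.  The following is only a sketch of that logic in isolation: the
SKETCH_* constants are stand-ins carrying the values I believe the MIPS
headers use for PSL_LOWIPL and MIPS_SR_COP_0_BIT, and
sketch_initial_sr() is a hypothetical name, not an existing kernel
function.

```c
#include <stdint.h>

/* Assumed values; the real definitions live in the MIPS headers. */
#define SKETCH_SR_INT_IE	0x00000001u	/* global interrupt enable */
#define SKETCH_INT_MASK		0x0000ff00u	/* all interrupt mask bits */
#define SKETCH_PSL_LOWIPL	(SKETCH_INT_MASK | SKETCH_SR_INT_IE)
#define SKETCH_SR_COP_0_BIT	0x10000000u	/* CP0 usable bit */

/*
 * Compute the status word for a new lwp: always OR in the low-IPL bits,
 * and additionally the CP0-usable bit when the owner is uid 0.
 */
static inline uint32_t
sketch_initial_sr(uint32_t sr, uint32_t uid)
{
	sr |= SKETCH_PSL_LOWIPL;
	if (uid == 0)
		sr |= SKETCH_SR_COP_0_BIT;
	return sr;
}
```

Each call site would then reduce to a single assignment, with the uid
taken from kauth_cred_getuid() as in the diff.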

Is this likely to work, or am I trying to do something that NetBSD on
MIPS cannot support?  I'm running NetBSD/cobalt 4.0_BETA2
(200703090002Z) on the following hardware:

cpu0 at mainbus0: QED RM5200 CPU (0x28a0) Rev. 10.0 with built-in FPU Rev. 10.0
cpu0: 32KB/32B 2-way set-associative L1 Instruction cache, 48 TLB entries
cpu0: 32KB/32B 2-way set-associative write-back L1 Data cache

Thanks

Matthew