Subject: Re: 80Mbps routing with Micrel KS8695
To: Chris Gilbert <chris@dokein.co.uk>
From: Jesse Off <joff@embeddedARM.com>
List: port-arm
Date: 01/27/2005 11:17:17
> Perhaps you could revisit using an splx that's in software only, as I 
> believe your splx writes to the hardware interrupt mask?  That said 
> shaving a few instructions there may not help, however splx and splraise 
> make up 16% of the run time...

For the most part, splx() is already software only.  splFOO() doesn't touch 
hardware registers.  However, if an interrupt arrives, the intr handler 
checks the software copy of the spl level, and if it is an interrupt that 
"shouldn't have happened", it commits the memory copy of the spl level 
to the hardware mask and returns without running the real intr handler. 
Later, splx() checks whether the hardware spl == software spl and, if not, 
restores the hardware mask to its original state.  In most cases, as long as 
no intrs happen between splFOO() and splx(), the hardware mask is never 
touched.  If an intr did happen between splFOO() and splx(), its handler is 
entered immediately after the insn in splx() that restores the hardware 
mask.
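
To make the scheme concrete, here is a minimal user-space C sketch of the 
lazy masking described above.  It is not the actual port code: the names 
hw_mask, sw_mask, pending, hw_write_mask(), irq_dispatch() and spl_raise() 
are made up for illustration, the "hardware" is just a variable, and the 
deferred interrupt is replayed by hand where a real controller would simply 
re-raise it once unmasked.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated state (all names hypothetical).  On a real port hw_mask would
 * be the interrupt controller's mask register, and the kernel would compare
 * against a cached copy of what it last wrote rather than reading it back. */
static uint32_t hw_mask;    /* bit set = IRQ line blocked in "hardware"   */
static uint32_t sw_mask;    /* software copy of the current spl level     */
static uint32_t pending;    /* IRQs that arrived while they were blocked  */
static unsigned hw_writes;  /* number of simulated mask register writes   */
static unsigned handled;    /* number of real handler invocations         */

static void hw_write_mask(uint32_t m) { hw_mask = m; hw_writes++; }

static void run_handler(unsigned irq) { (void)irq; handled++; }

/* Raise the spl level: software only, no register access. */
static uint32_t spl_raise(uint32_t block)
{
    uint32_t old = sw_mask;
    sw_mask |= block;
    return old;
}

/* Entry point when an IRQ line is asserted. */
static void irq_dispatch(unsigned irq)
{
    assert(irq < 32);
    uint32_t bit = UINT32_C(1) << irq;

    if (hw_mask & bit) {        /* already blocked in hardware: stays pending */
        pending |= bit;
        return;
    }
    if (sw_mask & bit) {        /* "shouldn't have happened" */
        pending |= bit;
        hw_write_mask(sw_mask); /* commit the software level to hardware */
        return;                 /* skip the real handler for now */
    }
    run_handler(irq);
}

/* Restore a previous spl level; touch hardware only if it is out of sync. */
static void splx(uint32_t old)
{
    sw_mask = old;
    if (hw_mask != sw_mask) {
        hw_write_mask(sw_mask);
        /* Real hardware would take the pending IRQ right after the write
         * above; the simulation replays it explicitly. */
        uint32_t ready = pending & ~sw_mask;
        pending &= ~ready;
        for (unsigned irq = 0; irq < 32; irq++)
            if (ready & (UINT32_C(1) << irq))
                run_handler(irq);
    }
}
```

In this sketch, a spl_raise()/splx() pair with no interrupt in between 
never calls hw_write_mask().  If irq_dispatch() fires for a blocked line in 
the middle, it costs one write to mask the line and a second one in splx() 
to unmask it, after which the deferred handler runs.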

Now, I took this methodology from another ARM port, but it's probably not 
strictly necessary.  The VIC (intr controller) on the EP9302 is on-chip, on 
an internal 32-bit bus (AHB) running at 100MHz.  The SDRAM bus is 16-bit at 
100MHz, so the hardware registers could actually be faster to access than 
memory.  I think I tested this, though, and the scheme described above was 
still faster (but only slightly).

I'm not entirely certain that it's splx() itself that's slow.  It may be 
getting charged with extra time because it picks up some portion of the intr 
handler preamble. (?)  Or perhaps it's getting more time because of IRQ 
entry/return processor overhead and the inevitable L1 cache refills when an 
intr returns?  I think I need to learn more about how profiling timing 
is implemented and accounted for internally.  I could always write spl*() 
in hand asm, but I'm not sure hand-optimized asm would be any better 
than what GCC already emits for these very simple functions.

//Jesse Off