Port-powerpc archive
Re: 4xx copyin/copyout [Was: CVS commit: src/sys/arch/powerpc/ibm4xx]
Simon Burge wrote:
Juergen Hannken-Illjes wrote:
This breaks with alignment trap on the 1st copyout for evbppc/explora:
kaddr=0x2fda88 udaddr=0xfffebff5 len=11
Trap is at this line:
" stw %[tmp],0(%[udaddr]);" /* Store user word */
If this works, we can possibly also look at unrolling the word loop a
bit since the load/store string instructions can do up to 32 bytes per
instruction. Does anyone know how to request a number of consecutive
registers with gcc asm constraints? If not, it might be easier to break
these two assembly fragments out to their own .S file...
If this patch doesn't fix the Explora, we can just add an alignment check
(if on a 403) and skip the word-at-a-time loop, although the trailing
loop will need to be updated to not just use "len % 4" bytes.
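A minimal sketch of that fallback in plain C, to make the "len % 4" point concrete. This is not the kernel code: it copies between ordinary buffers and leaves out the PID switching and user-space stores entirely. The function name and the need_align flag (standing in for the "if on a 403" check) are hypothetical. When the word loop is skipped, the trailing byte loop has to cover the whole length, as it would for the udaddr=0xfffebff5 len=11 case above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Copies len bytes from src to dst and returns how many bytes
 * were moved by the word-at-a-time loop.
 */
static size_t
copy_words_if_aligned(void *dst, const void *src, size_t len, int need_align)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t words = len / 4;		/* word-at-a-time loop count */
	size_t tail = len % 4;		/* bytes left for the trailing loop */

	/*
	 * Assumed check: if the CPU traps on unaligned word access and
	 * either address is not 4-byte aligned, skip the word loop.
	 * The trailing loop must then copy all len bytes, not len % 4.
	 */
	if (need_align && (((uintptr_t)d | (uintptr_t)s) & 3) != 0) {
		words = 0;
		tail = len;
	}

	for (size_t i = 0; i < words; i++) {
		uint32_t w;

		memcpy(&w, s, 4);	/* stands in for the word load */
		memcpy(d, &w, 4);	/* stands in for the word store */
		s += 4;
		d += 4;
	}

	while (tail-- > 0)		/* trailing byte loop */
		*d++ = *s++;

	return words * 4;
}
```

With need_align set and a destination one byte past a word boundary, an 11-byte copy goes entirely through the byte loop; with aligned addresses it does two words and three trailing bytes.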
Sorry for the breakage Juergen ...
I briefly tried a minor unroll but it didn't work on the first attempt
and I had other problems to deal with so I punted ... It was decidedly
less than elegant but I just wanted to measure the impact on performance
to see if it was worth the effort ...
" srwi %[count],%[len],0x2;"
" beq- 2f;"
"1: mtpid %[pid];sync;"
" andi. %[tmp],%[count],3;"
" beq 111f;"
" andi. %[tmp],%[count],2;"
" beq 110f;"
" andi. %[tmp],%[count],1;"
" beq 101f;"
" b 100f;"
"111:lwz %[tmp4],12(%[kaddr]);"
"110:lwz %[tmp3],8(%[kaddr]);"
"101:lwz %[tmp2],4(%[kaddr]);"
"100:lwz %[tmp1],0(%[kaddr]);"
" sync; isync;"
" mtpid %[ctx]; sync;"
" andi. %[tmp],%[count],3;"
" beq 211f;"
" andi. %[tmp],%[count],2;"
" beq 210f;"
" andi. %[tmp],%[count],1;"
" beq 201f;"
" b 200f;"
"211:stw %[tmp4],12(%[udaddr]);"
"210:stw %[tmp3],8(%[udaddr]);"
"201:stw %[tmp2],4(%[udaddr]);"
"200:stw %[tmp1],0(%[udaddr]);"
" subfic %[count], %[count], %[tmp];"
" bne 1b;"