Subject: Re: Kernel copyin/out optimizations for ARM...
To: <>
From: David Laight <david@l8s.co.uk>
List: port-arm
Date: 03/15/2002 16:07:25
On Fri, Mar 15, 2002 at 10:03:17AM +0000, Richard Earnshaw wrote:
> > Looks pretty good, though I haven't tried it.
>
> However, I wouldn't recommend the use of swp except when a locked transfer
> is really needed -- it can have nasty cache implications.
Since the weather here is dreary and damp....
I've done some local optimisations:
- removed the swp
- filled many of the delay slots
- removed the 16-byte alignment code from kcopy
All 3 routines seem to work as copy routines, but my ARM system
doesn't run NetBSD, so I can't test the fault handling.
I have a slight doubt over the copyout code:
	ldmia	r0!, {r4, r5, r6, r14}
	strt	r4, [r1], #4		/* need user perms here... */
	stmia	r1!, {r5, r6, r14}	/* ... kernel ones ok here */
Now r1 is 16-byte aligned, so the strt and stmia are
guaranteed to be in the same page - so it is basically sound.
However, if the page gets set 'copy on write' between the strt and
the stmia then memory will get corrupted.
For this to happen I think you need (at least) kernel threads (so
a different thread can call exec) and either in-kernel preemption
or a multi-cpu system (to get anything else running at all).
Jason - is this a real possibility anytime in the next 10 years?
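As a quick check on the same-page argument for the strt/stmia pair,
here is a small C sketch. It is only an illustration, not part of
bcopyinout.S: the names SKETCH_PAGE_SIZE, SKETCH_BLOCK, same_page()
and crossings() are made up for this example, and a 4KB page size is
assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4KB pages */
#define SKETCH_BLOCK     16u   /* strt (4 bytes) + stmia of 3 regs (12 bytes) */

/* Returns 1 if the bytes [addr, addr + len) all lie in one page. */
int same_page(uint32_t addr, uint32_t len)
{
    return (addr / SKETCH_PAGE_SIZE) ==
           ((addr + len - 1) / SKETCH_PAGE_SIZE);
}

/*
 * Walks every start offset in one page that is a multiple of 'align'
 * and counts those where a SKETCH_BLOCK-byte store would spill into
 * the next page.
 */
unsigned crossings(uint32_t align)
{
    unsigned n = 0;
    uint32_t off;

    for (off = 0; off < SKETCH_PAGE_SIZE; off += align)
        if (!same_page(off, SKETCH_BLOCK))
            n++;
    return n;
}
```

crossings(16) is 0, while crossings(4) is 3 (offsets 4084, 4088 and
4092), so the 16-byte alignment of r1 is what keeps the strt and the
stmia in the same page. It says nothing about the copy-on-write window
between the two instructions.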
The alternative is to replace the stmia with 3 strt instructions.
This is slightly slower, but since the alignment code can then be
removed it will be shorter for small transfers. I've coded both
versions; defining DONT_USE_LDM_USER will cause the strt (and ldrt in
copyin) instructions to be used.
Might be worth running a benchmark test (of something that does
moderate length copyin/out) to see how much effect it has.
Anyway, the new file is on www.l8s.co.uk, under netbsd/bcopyinout.S
David
--
David Laight: david@l8s.co.uk