Subject: memcmp() optimisation on i386
To: None <tech-toolchain@netbsd.org>
From: David Laight <david@l8s.co.uk>
List: tech-toolchain
Date: 10/02/2002 21:49:34
I just noticed that gcc compiles memcmp into a 'repe cmpsb'
whenever the length isn't a constant.
The byte compare will be significantly slower than a 32bit
compare (especially if the pointers are aligned).
I don't profess to be a gcc wizard, but is it possible
to get something like:
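	movl %ecx,%edx		# save the byte count
	shrl $1,%ecx
	shrl $1,%ecx		# %ecx = count / 4 (whole 32bit words)
	repe cmpsl		# compare words while they are equal
	jne 1f			# a word differed
	movl %edx,%ecx		# restore the byte count
	andl $3,%ecx		# %ecx = count % 4 (trailing bytes)
	repe cmpsb		# compare the remaining bytes
1:				# ZF set here means the buffers are equal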
used instead? At least for checks for equality.
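
For reference, the same idea written as portable C, as a rough sketch
only. The name mem_equal is made up for illustration and is not an
existing libc or gcc function. It only answers "equal or not": on a
little-endian machine like the i386 a 32bit compare does not order two
words the way a byte-by-byte compare would, so the sign of a memcmp()
result cannot be taken from the word compare directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns nonzero when the first len bytes match. */
int mem_equal(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t words = len >> 2;    /* whole 32bit words, like the two shrl */
    size_t tail = len & 3;      /* 0..3 trailing bytes, like the andl */

    /* Word loop, the counterpart of 'repe cmpsl'. */
    while (words--) {
        uint32_t wa, wb;
        /* memcpy keeps the loads legal for unaligned pointers. */
        memcpy(&wa, pa, sizeof wa);
        memcpy(&wb, pb, sizeof wb);
        if (wa != wb)
            return 0;
        pa += 4;
        pb += 4;
    }

    /* Byte loop, the counterpart of the final 'repe cmpsb'. */
    while (tail--) {
        if (*pa++ != *pb++)
            return 0;
    }
    return 1;
}
```

A caller would write mem_equal(p, q, n) wherever it currently tests
memcmp(p, q, n) == 0.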
Alternatively, an inlined function that does 'repe cmpsl'
could be used in the source when the length to be compared
is known to be a multiple of 4.
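
Such an inline might look roughly like the following. This is an
untested sketch that assumes gcc-style extended asm; the name
equal_words is invented here, and the count is passed in 32bit words
rather than bytes. The leading cmpl is only there to set ZF so that a
zero count reports "equal", and builds for other CPUs fall back to
memcmp().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical inline: nonzero when nwords 32bit words are equal. */
static inline int equal_words(const void *a, const void *b, size_t nwords)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char equal;

    __asm__ volatile (
        "cmpl %%eax, %%eax\n\t"   /* force ZF=1 in case nwords is 0 */
        "repe cmpsl\n\t"          /* compare words while they match */
        "sete %0"                 /* equal = ZF */
        : "=q" (equal), "+S" (a), "+D" (b), "+c" (nwords)
        :
        : "cc", "memory");
    return equal;
#else
    /* Fallback for other targets: let the library do the work. */
    return memcmp(a, b, nwords * 4) == 0;
#endif
}
```

It relies on the direction flag being clear on entry, which is what
the usual calling conventions guarantee.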
David
--
David Laight: david@l8s.co.uk