Subject: Re: Many broken packages use Scheme -- weird!
To: Chris G. Demetriou <cgd@netbsd.org>
From: Richard Earnshaw <rearnsha@arm.com>
List: port-arm32
Date: 10/22/1998 12:48:42
> Todd Vierling <tv@pobox.com> writes:
> > If this is possible, it should be turned on unconditionally.  SPARC has
> > alignment faults, and if the "rolling" is how ARM actually does unaligned
> > access, it should have alignment faults too.
> 
> Probably not desirable.
> 
> For instance, last i saw, in many cases, for C like:
> 
> 	u_int16_t *wp = ...
> 	x = *wp;
> 
> the compiler will generate:
> 
> 	load 32 bits into reg
> 	reg & 0xffff
> 
> rather than a half-word instruction.  (similar things for byte
> instructions.)
> 
> May seem stupid, but not all ARM chips support half-word instructions,
> and in that case, the 'roll' behaviour is a lot better than requiring
> an extra shift...
> 
> 
> Basically, it's a part of the architecture.  Code which does the wrong
> thing is broken.  You shouldn't penalize code which does the right
> thing (and, as noted above, there are simple ways in which this really
> is the 'right thing'; there are other more contrived ones, too, i'm
> sure 8-) for code which is broken.
> 

You beat me to it.

Basically, the ARMs prior to architecture version 4 did not have 
instructions to access half-word data directly.  However, a side effect of 
the implementation of the load byte instruction was that a load-word 
instruction would perform the same rotations, but without doing the 
masking.  This could be exploited to reduce the effort of loading a 
half-word.  Without it you have to do two byte loads and an ORR with a 
shift, which then leaves very little opportunity to optimize the sequence, 
either with the address calculation (since the second load needs an 
additional offset of 1) or with subsequent uses of the result (since the 
ORR instruction already uses a shifted second argument).
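
To make the two sequences concrete, here is a rough C model of them.  It is 
only a sketch: the function names are made up for illustration, memory is 
modelled as a little-endian byte array, and model_ldr() assumes the 
pre-v4 behaviour described above (fetch the aligned word, then rotate 
right by 8 bits for each byte of misalignment).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a pre-ARMv4 little-endian word load: fetch the
 * aligned word containing addr, then rotate it right by 8 * (addr & 3)
 * bits.  No masking is done. */
static uint32_t model_ldr(const uint8_t *mem, size_t addr)
{
    size_t base = addr & ~(size_t)3;
    uint32_t w = (uint32_t)mem[base]
               | (uint32_t)mem[base + 1] << 8
               | (uint32_t)mem[base + 2] << 16
               | (uint32_t)mem[base + 3] << 24;
    unsigned rot = 8u * (unsigned)(addr & 3);

    /* Guard rot == 0 so we never shift a 32-bit value by 32. */
    return rot ? (w >> rot) | (w << (32u - rot)) : w;
}

/* Half-word load that exploits the rotation: one word load, then keep
 * the low 16 bits. */
static uint16_t load_half_rotating(const uint8_t *mem, size_t addr)
{
    return (uint16_t)(model_ldr(mem, addr) & 0xffffu);
}

/* Half-word load without the rotation: two byte loads (the second at
 * addr + 1) combined with an OR of a shifted operand. */
static uint16_t load_half_bytes(const uint8_t *mem, size_t addr)
{
    return (uint16_t)(mem[addr] | (uint32_t)mem[addr + 1] << 8);
}
```

In this model the two routines agree for any half-word that lies inside a 
single aligned word, which covers every properly aligned half-word.  They 
differ when addr & 3 is 3: the rotating version wraps around to the first 
byte of the same word instead of reading the first byte of the next one.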

All ARM ARM conforming MMUs can be configured to fault unaligned word 
accesses (and gcc can be told to generate code for this case), but I would 
strongly recommend that it is not enabled.  First, most existing NetBSD 
applications would cease to run, and second there is a serious performance 
hit when you compile code to be compatible with this -- it varies from 
application to application but it could easily be 10%.
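
For code that really does need to read a word from an address that may 
not be word aligned, one portable approach is to copy the bytes rather 
than dereference a misaligned pointer.  The sketch below is only an 
illustration; the helper names are hypothetical and nothing here is taken 
from NetBSD or gcc.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if p is aligned on a 4-byte boundary. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}

/* Hypothetical helper: read a 32-bit value in host byte order from an
 * address of any alignment.  memcpy() never performs a misaligned word
 * access on the caller's behalf, so the result does not depend on
 * whether the hardware rotates, faults, or fixes up such loads. */
static uint32_t read_u32_any_alignment(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```

The cost is that the compiler may fall back to byte loads for the copy, 
which is the same kind of overhead discussed above.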

Richard.

PS, compiling your code for Architecture v4 (possible if you use EGCS, but 
not with the gcc 2.7.2 derived compiler) will mean that the compiler will 
use ldrh and the like, so there should never be any unaligned word loads; 
but such code will not run on ARM6/7 processors, nor will it run on the 
RISC PC.