Subject: Re: Strange C compiler code generation
To: Richard Earnshaw <rearnsha@buzzard.freeserve.co.uk>
From: Peter Teichmann <teich-p@rcs.urz.tu-dresden.de>
List: port-arm32
Date: 02/26/2001 19:19:54
> It's quite simple:  All fields in a structure are placed on their natural
> alignment boundaries unless they are marked as packed.  Normally, the only
> difference between NetBSD/arm32, RISC OS and ARM Linux is when a structure
> consists purely of fields of size 16 bits or less.  In RISC OS and ARM
> Linux these structures will always be word-aligned and be padded up to a
> 32-bit boundary.  On NetBSD such structures will be 8- or 16-bit aligned
> depending on whether they contain just byte-sized fields or byte and
> short-sized fields.

This seems to be reasonable as it has no impact on the speed, if the
structure itself is assumed to be always 32-bit aligned.
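
To see which rule a given compiler applies, a minimal sketch like the
following can be used. The structure and function names are made up for
illustration and do not come from this thread, and the numbers printed
depend entirely on the target ABI, so no output is shown here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example structures; they are not from the thread. */
struct bytes_only {             /* only byte-sized fields */
    unsigned char a, b, c;
};

struct bytes_and_shorts {       /* byte- and short-sized fields */
    unsigned char  a;
    unsigned short b;
    unsigned char  c;
};

/* Print the size and alignment the compiler in use has chosen. */
void print_small_struct_layout(void)
{
    /* Holds under any ABI: the size is a multiple of the alignment. */
    assert(sizeof(struct bytes_only) % _Alignof(struct bytes_only) == 0);
    assert(sizeof(struct bytes_and_shorts)
           % _Alignof(struct bytes_and_shorts) == 0);

    printf("bytes_only:       size %lu, alignment %lu\n",
           (unsigned long)sizeof(struct bytes_only),
           (unsigned long)_Alignof(struct bytes_only));
    printf("bytes_and_shorts: size %lu, alignment %lu\n",
           (unsigned long)sizeof(struct bytes_and_shorts),
           (unsigned long)_Alignof(struct bytes_and_shorts));
}
```

Calling print_small_struct_layout() from main() on each system shows
whether such structures get padded up to a 32-bit boundary or keep an
8- or 16-bit alignment. _Alignof needs a C11 compiler; older GCC versions
offer __alignof__ instead.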

> Or ldrh on ARMv4!

But not on that stupid RiscPC I have, even if it has a SA processor!

> > Well, I am just interested in the speed of the code. You have seen the bad
> > impact of unaligned and unpadded structures on the speed!
>
> Obviously.  The solution to this problem is to find out why those pragmas
> were put in, and if they aren't needed to get rid of them.

It turns out that they are not necessary there; it was just easier to
pack all structures, even those where it is not needed.

> Using the always 32-bit-align-structures rule tends to break so many
> programs that ARM's latest compiler (Acorn's Norcroft was an early version
> of that) uses the same rule that NetBSD/arm32 does.  There is a fair
> chance (though it is yet to be decided) that NetBSD will stick with the
> current rules even though we had originally planned to move to strict
> 32-bit alignment when we switched to ELF.

But the current C compiler under NetBSD by default (if there is no
#pragma pack(..)) assumes that structures are 32-bit aligned. BTW, is
#pragma pack ANSI C or is it some GCC extension?
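
For reference, ANSI C defines only the #pragma directive itself and
leaves the meaning of individual pragmas to the implementation, so
#pragma pack is a compiler extension (GCC accepts it, as do several other
compilers). A minimal sketch of its effect follows; the record layout and
all names are hypothetical, since the structures discussed in this thread
are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record with natural alignment. */
struct natural_rec {
    unsigned char  tag;
    unsigned int   value;
    unsigned short count;
};

#pragma pack(1)     /* compiler extension: no padding between fields */
struct packed_rec {
    unsigned char  tag;
    unsigned int   value;   /* lands on an odd offset */
    unsigned short count;
};
#pragma pack()      /* back to the default packing */

/* Print where 'value' ends up and how large each variant is. */
void print_packing_layout(void)
{
    /* With pack(1) honoured, 'value' directly follows the one-byte tag. */
    assert(offsetof(struct packed_rec, value) == 1);

    printf("natural: value at offset %lu, size %lu\n",
           (unsigned long)offsetof(struct natural_rec, value),
           (unsigned long)sizeof(struct natural_rec));
    printf("packed:  value at offset %lu, size %lu\n",
           (unsigned long)offsetof(struct packed_rec, value),
           (unsigned long)sizeof(struct packed_rec));
}
```

Because the packed 'value' field is no longer word-aligned, an ARM
compiler cannot fetch it with a single word load and typically has to
assemble it from smaller loads, which is the slowdown discussed above.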

Another question:

I found that GCC uses the ability to conditionally execute ARM
instructions only in very rare cases, and sometimes it also does a bad job
of optimizing.

E.g. one can find:

MOV r1,#1
ORR r3,r3,r7,lsl r1

OK, r1 is also used later, but for the ORR instruction one could still
save one cycle by using an immediate shift (lsl #1) instead of the
register-specified shift.

Also it is not able to compile

if (a>0x000000ff) {
  a >>= 8;
  n += 8;
}

and get:

CMP     a,#0x000000ff
MOVHI   a,a,lsr#8
ADDHI   n,n,#8

No, gcc needs to use a jump instruction!
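
To reproduce this, the fragment can be wrapped in a small function and
compiled with "gcc -O2 -S" to inspect the generated assembly. This is
only a sketch: the function names and the choice to pass n by pointer are
assumptions made for illustration, not the code the fragment was taken
from.

```c
#include <assert.h>

/* Hypothetical wrapper around the fragment from the message. */
unsigned int shift_step(unsigned int a, unsigned int *n)
{
    if (a > 0x000000ff) {
        a >>= 8;        /* drop the low byte */
        *n += 8;        /* remember how far we shifted */
    }
    return a;
}

/* Small usage check of the wrapper above. */
void shift_step_demo(void)
{
    unsigned int n = 0;

    assert(shift_step(0x1234, &n) == 0x12 && n == 8);  /* branch taken */
    assert(shift_step(0xff, &n) == 0xff && n == 8);    /* not taken */
}
```

Whether the compiler emits conditional MOVHI/ADDHI instructions or a
branch for the if block depends on the GCC version and the optimization
flags used.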

Do you know a person who is involved in this?

Thank you very much for your help!
Peter
-- 
Email: teich-p@rcs.urz.tu-dresden.de   WWW: rcswww.urz.tu-dresden.de/~teich-p