Subject: Re: Strange C compiler code generation
To: Peter Teichmann <teich-p@Rcs1.urz.tu-dresden.de>
From: Richard Earnshaw <rearnsha@buzzard.freeserve.co.uk>
List: port-arm32
Date: 02/25/2001 20:01:05
> Richard Earnshaw wrote:
> > This would change if we moved to padding structures up to a word; and it 
> > is what we were originally intending to do when moving to ELF. But that 
> > would make us less compatible with ARM's toolkit (and it also makes 
> > porting some applications a pain in the butt, since most other machines -- 
> > particularly x86 --  don't do this).
> 
> Is there some information somewhere available how structures should be padded
> and aligned on ARM processors? 

It's quite simple:  All fields in a structure are placed on their natural 
alignment boundaries unless they are marked as packed.  Normally, the only 
difference between NetBSD/arm32, RISC OS and ARM Linux is when a structure 
consists purely of fields of size 16-bits or less.  In RISC OS and ARM 
Linux these structures will always be word-aligned and be padded up to a 
32-bit boundary.  On NetBSD such structures will be 8- or 16-bit aligned 
depending on whether they contain just byte-sized fields or byte and 
short-sized fields.
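
As a rough illustration of the two rules, here is a small C model of
them.  The names (layout_rule, struct_align, struct_size) are made up
for this sketch, and it is not what any compiler does internally; it
only takes the end offset of the last field and the largest field
alignment, and applies the rule to get the structure's alignment and
padded size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the two rules described above. */
enum layout_rule {
    RULE_NATURAL,      /* NetBSD/arm32: align to the largest field */
    RULE_WORD_MINIMUM  /* RISC OS, ARM Linux: never less than 4-byte aligned */
};

/* Alignment of the whole structure, given the largest field alignment. */
size_t struct_align(enum layout_rule rule, size_t max_field_align)
{
    assert(max_field_align != 0);
    if (rule == RULE_WORD_MINIMUM && max_field_align < 4)
        return 4;
    return max_field_align;
}

/* Size of the structure: the end of the last field, rounded up to the
   structure's own alignment. */
size_t struct_size(enum layout_rule rule, size_t end_of_last_field,
                   size_t max_field_align)
{
    size_t align = struct_align(rule, max_field_align);
    return (end_of_last_field + align - 1) / align * align;
}
```

With this model a structure holding a single char comes out as 1 byte
under the first rule and 4 under the second; once a structure contains
a 32-bit field the two rules give the same answer.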

The only time this tends to make a difference is when programmers don't 
understand the rules of C and build assumptions into their programs that 
for

struct x
{
  char a;
};

and similar examples, sizeof(struct x) will return
sum_over_fields(sizeof(field)) (i.e. 1 in the above case).
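
A minimal sketch of the portable habit, with hypothetical helper names:
take the element stride and any byte offsets from sizeof rather than
from a hand-computed sum of the fields.  Code written this way does not
care which of the two padding rules the compiler applies.

```c
#include <assert.h>
#include <stddef.h>

struct x
{
  char a;
};

/* Distance in bytes between consecutive array elements, as laid out by
   whatever compiler and ABI this is built with. */
size_t element_stride(void)
{
  static struct x pair[2];
  size_t stride = (size_t)((const char *)&pair[1] - (const char *)&pair[0]);

  /* Guaranteed by C: array elements are exactly sizeof(struct x) apart. */
  assert(stride == sizeof(struct x));
  return stride;
}

/* Byte offset of element i, derived from sizeof and never from an
   assumed sum of field sizes. */
size_t element_offset(size_t i)
{
  return i * sizeof(struct x);
}
```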

> I just know the behaviour of the Norcroft C
> compiler under RiscOS. There structures were always 32-Bit aligned, and inside
> a structure 32 Bit quantities were always 32-Bit aligned. I do not remember
> what was with 16 Bit quantities, but I guess they were 16-Bit aligned
> as Acorn suggested to use:
> 
> LDR reg,[adr] this uses the fact that the content of [adr] is rotated in a way
>               so that the LSB of reg contains that byte that is located at [adr]
> MOV reg,reg,lsl#16
> MOV reg,reg,lsr#16 for unsigned / MOV reg,reg,asr#16 for signed
> 
> for loading shorts.

Or ldrh on ARMv4!
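
For readers who do not have the rotated-load behaviour in their heads,
the quoted three-instruction sequence can be modelled in C.  This is
only a sketch under stated assumptions: memory is simulated as an array
of little-endian 32-bit words, the address is halfword-aligned, and the
function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits. */
static uint32_t rotr32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Model of LDR reg,[adr] on the older cores discussed here: fetch the
   aligned word and rotate it so the addressed byte lands in the low byte. */
uint32_t ldr_rotated(const uint32_t *words, uint32_t byte_addr)
{
    uint32_t word = words[byte_addr >> 2];
    return rotr32(word, 8u * (byte_addr & 3u));
}

/* Unsigned halfword load: LDR, then lsl #16, then lsr #16. */
uint32_t load_u16(const uint32_t *words, uint32_t byte_addr)
{
    assert((byte_addr & 1u) == 0);   /* the trick needs 16-bit alignment */
    uint32_t reg = ldr_rotated(words, byte_addr);
    reg <<= 16;                      /* MOV reg,reg,lsl#16 */
    reg >>= 16;                      /* MOV reg,reg,lsr#16 */
    return reg;
}

/* Signed halfword load: same value as lsl #16 followed by asr #16,
   written without relying on signed right shifts in C. */
int32_t load_s16(const uint32_t *words, uint32_t byte_addr)
{
    uint32_t low = load_u16(words, byte_addr);
    return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}
```

This is also why the sequence only works if shorts are 16-bit aligned:
at an odd address the halfword would straddle the rotated word.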

> 
> Well, I am just interested in the speed of the code. You have seen the bad
> impact of unaligned and unpadded structures on the speed! 

Obviously.  The solution to this problem is to find out why those pragmas 
were put in and, if they aren't needed, to get rid of them.
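
To show what such a pragma does to a layout, here is a hedged sketch.
It assumes the pragmas in question are of the #pragma pack kind and a
compiler that accepts the push/pop form; the structure and function
names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: value sits on its natural boundary. */
struct natural {
    uint8_t  tag;
    uint32_t value;
};

/* Same fields with padding suppressed: value now sits at offset 1. */
#pragma pack(push, 1)
struct squeezed {
    uint8_t  tag;
    uint32_t value;
};
#pragma pack(pop)

/* Returns 1 if the compiler honoured the pragma, 0 if it ignored it. */
int pack_pragma_honoured(void)
{
    /* Sanity check on the unpacked layout. */
    assert(offsetof(struct natural, value) % _Alignof(uint32_t) == 0);

    return offsetof(struct squeezed, value) == 1
        && sizeof(struct squeezed) == 5;
}
```

When the pragma is honoured, the 32-bit field is no longer word-aligned,
so on a core without unaligned word loads the compiler has to build the
value out of several narrower accesses.  That is where the slow code
comes from.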

> 
> I would prefer structures only allowed to be 32-Bit aligned. I would like
> the structures to be padded, so that 32-Bit quantities are 32-Bit aligned, and
> 16-Bit quantities are 16-Bit aligned. It seems that Linux' GCC is doing this
> by default (and even if it is told not to do so...) How can I explain this to
> NetBSD's GCC, that I want this behaviour?

Using the always 32-bit-align-structures rule tends to break so many 
programs that ARM's latest compiler (Acorn's Norcroft was an early version 
of that) uses the same rule that NetBSD/arm32 does.  There is a fair 
chance (though it is yet to be decided) that NetBSD will stick with the 
current rules even though we had originally planned to move to strict 
32-bit alignment when we switched to ELF.

> 
> What does ARM's toolkit do? 
It uses the same rule that NetBSD does.

> One should think that they prefer their code to be
> fast, and for the major range of applications where ARM processors are used this
> is the most important fact, compatibility to x86 structures is not important.

When you are trying to port a program like Mozilla, which is enormous and 
written by people who've never heard of the ARM and don't care about its 
non-x86 style alignment rules you'd probably change your mind -- I was 
tearing my hair out the other week trying to get this to run on ARM Linux 
without crashing on every second page visited  -- it doesn't help much 
that the debug data for Mozilla exceeds the VM of many machines!  And yes, 
it was an assumption about the size of struct {short x[3]};
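
The general class of bug can be illustrated as follows.  This is a
made-up example, not the Mozilla code: it decodes a stream of 6-byte
records into an array of struct { short x[3]; } by copying the payload
of each record, so it works whether the structure is 6 bytes (natural
alignment) or 8 bytes (padded to a word).  The names triple,
RECORD_BYTES and decode_triples are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct triple {
    int16_t x[3];
};

/* Bytes of payload per record in the stream; independent of any padding
   the compiler adds to struct triple. */
enum { RECORD_BYTES = 3 * sizeof(int16_t) };

/* Copy n packed records from buf into out, one record at a time.
   Casting buf to (struct triple *) and indexing it would silently assume
   sizeof(struct triple) == 6, which the word-padding rule breaks. */
void decode_triples(const unsigned char *buf, size_t n, struct triple *out)
{
    assert(buf != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        memcpy(out[i].x, buf + i * RECORD_BYTES, RECORD_BYTES);
    }
}
```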

> So they should do it the same way?

It's still being discussed.  But as has been said before, the problem in 
your case is those pragmas.

R.

PS.  Norcroft C is almost certainly just ignoring pragma pack anyway.