Subject: Re: Types sizes in C
To: khorben <khorben@defora.org>
From: Eduardo Horvath <eeh@netbsd.org>
List: port-sparc64
Date: 04/10/2006 16:31:23
On Fri, 7 Apr 2006, khorben wrote:

> I have been recently surprised to see that the following C program has a
> different output on Solaris 10 and NetBSD 3, for the same hardware (and
> both while using gcc):

> ok, the only difference is actually the size of long integers,
> respectively 32 and 64 bits. From my reading of the ANSI C standard I
> understood this is possible.

The Solaris userland defaults to 32 bits, although you can compile and
execute 64-bit binaries if you give the compiler the proper flags
(unless you happen to have a 32-bit-only build of gcc).  In fact, if
you look in /usr/bin you'll find that most if not all of the programs
in there are 32-bit binaries.  This was done because Solaris is built
with a compiler capable of generating both 32-bit and 64-bit code (or,
for some of the older releases like Solaris 7, two completely different
compilers), and because it preserves binary compatibility with older
32-bit machines.
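
For reference, here is a minimal sketch of the sort of test program
presumably under discussion (khorben's original listing was trimmed
from the quote above):

    #include <stdio.h>

    int
    main(void)
    {
            printf("int:  %lu\n", (unsigned long)sizeof(int));
            printf("long: %lu\n", (unsigned long)sizeof(long));
            printf("ptr:  %lu\n", (unsigned long)sizeof(void *));
            return (0);
    }

Assuming a gcc built for both targets, you pick the model with -m32
or -m64:

    $ gcc -m32 sizes.c && ./a.out    # long: 4
    $ gcc -m64 sizes.c && ./a.out    # long: 8

On NetBSD/sparc64 the compiler only targets 64-bit, so long is always
8 bytes.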

NetBSD's build environment isn't as flexible, so the userland is
natively 64-bit and ships 64-bit compilers; 32-bit binaries are
supported through the COMPAT_NETBSD32 framework.  At the time GCC
could not generate both 32-bit and 64-bit binaries from the same
compiler.  (What's worse, at one point you couldn't even generate
working 32-bit binaries from a compiler running on a 64-bit machine.
Can you generate working 32-bit executables on sparc64 yet?)

Between that, the library path issue, and requiring kernel grovellers
(programs like ps and top that dig through kernel memory) to be
compiled to match the kernel, we decided not to do the same thing
Solaris does.

> I have a few questions about this though, if appropriate:
> - who sets this?

This is an ABI issue.  The processor-specific ELF ABI supplements
mandate the sizes of scalars, the stack layout, alignment constraints,
etc.
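
If a program needs to know which model it is being compiled under, it
can check the macros the compiler predefines for the ABI.  A sketch
(gcc defines __LP64__/_LP64 on LP64 targets; other compilers may spell
it differently):

    #include <stdio.h>

    int
    main(void)
    {
    #if defined(__LP64__) || defined(_LP64)
            printf("LP64: long and pointers are 64 bits\n");
    #else
            printf("ILP32: int, long and pointers are 32 bits\n");
    #endif
            return (0);
    }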

> - why not keep long as 32 bits and let int default to 64 bits instead?
>   This would help the short/long/long long hierarchy coherence, and let
>   int default to the native processor size (and 486 would be 16 bits!)

The C standard mandates sizeof(char) <= sizeof(short) <= sizeof(int)
<= sizeof(long).  HAL Computer Systems (since absorbed into Fujitsu)
once had a SPARC V9 SVR4 operating system where short was 16 bits and
int, long, and pointers were all 64 bits wide.  This had the advantage
of letting broken code that assumed you could stuff a pointer into an
int keep working, but it had other problems, like actually accessing
32-bit-wide data structures read in from files and such.
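
This is exactly the problem the C99 <stdint.h> fixed-width types are
meant to solve: if a structure has to match a 32-bit on-disk layout,
say so explicitly instead of hoping int or long happens to be the
right size.  A sketch, with an invented record layout:

    #include <inttypes.h>
    #include <stdio.h>

    /* Hypothetical on-disk record; the fixed-width types keep it
     * 8 bytes under ILP32 and LP64 alike. */
    struct record {
            int32_t         offset;
            uint32_t        length;
    };

    int
    main(void)
    {
            struct record r;
            FILE *fp = fopen("data.bin", "rb");

            if (fp == NULL)
                    return (1);
            if (fread(&r, sizeof(r), 1, fp) == 1)
                    printf("offset %" PRId32 ", length %" PRIu32 "\n",
                        r.offset, r.length);
            fclose(fp);
            return (0);
    }

(Byte order is a separate problem, of course, if the file comes from a
machine of the other endianness.)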

> - has this ever been observed to cause portability or stability issues
>   on this platform?
> 
> My concern is, I have seen programmers assume sizeof(int) or
> sizeof(long) are 32 bits, even while writing portable 32/64-bit code
> (and I was myself wrong at first, since it is apparently very platform
> dependent even on 64-bit hardware).

Then it's not portable, is it?  The funny thing is that this sort of
code often works fine on amd64 or Alpha machines since they are
little-endian.  On a little-endian machine, as long as the value fits
within 32 bits, a 32-bit load of a 64-bit datum still gives you the
same value.  On big-endian machines like SPARC and PowerPC, a 32-bit
load of a 64-bit datum gives you zero (or -1), since you only load the
high bits, and very little data requires the high bits for
significance.  We found plenty of problems with allegedly 64-bit-clean
code while getting sparc64 working.  Even large parts of NetBSD didn't
work, although the website at the time said it was 64-bit clean 'cause
it runs on Alpha. 8^)
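
Here is the failure mode in miniature (a deliberately non-portable
sketch, since the non-portability is the point):

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
            uint64_t datum = 42;    /* the value fits in 32 bits */
            /* 32-bit load of a 64-bit object via type punning: */
            uint32_t word = *(uint32_t *)&datum;

            /* Little-endian: prints 42, the low half comes first.
             * Big-endian: prints 0, you just loaded the high half. */
            printf("%lu\n", (unsigned long)word);
            return (0);
    }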

Eduardo