Subject: Re: gcc optimizer bug in netbsd-1-6 on alpha (gcc 2.95.3 20010315 (release) (NetBSD nb3))
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Ian Lance Taylor <ian@airs.com>
List: tech-toolchain
Date: 08/16/2003 15:26:14

der Mouse <mouse@Rodents.Montreal.QC.CA> writes:

> I don't think so; I don't think the standard promises that there even
> _are_ memory locations in that sense.

It doesn't, but on a normal system the effect is more or less the
same.

> > ``A pointer to a union object, suitably converted, points to each of
> > its members, ... and vice versa.''  X3.159-1989 3.5.2.1.
> 
> Yes, but that "suitably converted" can hide unlimited amounts of
> implementation-specific magic.  In
> 	union { int32_t i; struct in_addr a; }
> there is nothing that guarantees that i and a share any storage at all,
> as far as I can see.  For example, i could be in a register and a could
> be in memory.  (And if a's address is taken but i's isn't, this would
> not even be unreasonable.)

The C standard is abstract, but it's not quite that abstract.
``Suitably converted'' here means using a type cast.  The definition
of casting a pointer requires that it be possible to cast back and
forth between suitably aligned pointers, which means that they must
have the same value or must have a simple conversion.  Note that this
is based only on the type of the pointer, not on the actual object
which the pointer holds the address of.

In other words, given
    union { int32_t i; struct in_addr a; } u;
the standard guarantees that
    &u.i == (int32_t *) &u.a
and further guarantees that
    foo (&u.i, &u.a)
will receive two pointers which compare equal after casting.
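
As a rough sketch of the guarantee just described (the names
fake_in_addr, pun, same_address and members_share_address are made up
for illustration, and fake_in_addr is only a stand-in for struct
in_addr so that no network headers are needed):

```c
#include <stdint.h>

/* Hypothetical stand-in for struct in_addr. */
struct fake_in_addr { uint32_t s_addr; };

union pun { int32_t i; struct fake_in_addr a; };

/* Plays the role of foo() above: returns 1 if the two pointers
   compare equal once converted to a common type. */
static int same_address(int32_t *ip, struct fake_in_addr *ap)
{
    return ip == (int32_t *) ap;
}

/* Returns 1 if a pointer to the union, suitably converted, points to
   each of its members, and the members' addresses agree. */
static int members_share_address(void)
{
    union pun u;

    return same_address(&u.i, &u.a)
        && (void *) &u == (void *) &u.i
        && (void *) &u == (void *) &u.a;
}
```

On a conforming implementation members_share_address() should return
1 no matter how the compiler chooses to represent the pointers.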

It's OK for the numeric value to change, as would be required on, say,
the DEC-20 for conversion from char * to int *, but it can only change
in a mechanical, reversible fashion based only on the type of the
pointer.
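
A minimal sketch of that round trip (round_trip_ok is a made-up name;
the point is only that the conversions depend on the pointer types and
undo each other):

```c
/* Returns 1 if converting int * to char * and back yields a pointer
   that compares equal to the original. */
static int round_trip_ok(void)
{
    int x = 0;
    int *p = &x;
    char *c = (char *) p;  /* the bit pattern may change on a
                              word-addressed machine */
    int *q = (int *) c;    /* suitably aligned, since c came from
                              an int * */

    return q == p;
}
```

The intermediate char * is free to look different numerically, but the
final comparison has to come out equal.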

Your suggestion that one field could be in a register while the other
was in memory would violate this requirement.  Such an optimization
would only be acceptable if there were no way to detect it, which
basically would require that only one field of the union be used.

> > In other words, a union may be used for type punning in a standard
> > conformant manner.
> 
> Not as I have understood "standard conformant"; as I understand the
> term, the code is not standard conformant the moment it stores into one
> member of a union and reads from another, regardless of why it is doing
> that, and the implementation is free to do anything in its power, from
> "working" to dumping core to, even, invoking the nasal demons (if
> suitably equipped).

Well, more precisely, the behaviour of reading from one field of a
union after writing to another is ``implementation-defined,'' which
according to the standard means behaviour that ``depends on the
characteristics of the implementation and that each implementation
shall document.''  This is not the same as ``undefined behaviour,'' in
which the system may do as it pleases with no documentation
requirement.  (An example of undefined behaviour is dereferencing a
pointer which has an invalid value, such as to memory which has
already been passed to free().)

In other words, the same as with pre-standardization C: the
implementation should do something plausible.

> > Type punning itself is automatically implementation dependent.
> 
> Yes.  And the union is a clumsy but generally workable substitute for
> the pointer cast...and I don't think it helps the aliasing situation
> one bit, since struct in_addr cannot alias int32_t (if it could, you
> could just use the cast).  Thus, the compiler is permitted to assume
> that storing to the int32_t does not change the struct in_addr.

I don't think this is true.

> Whether today's gcc acts on that assumption I can't say, but there's no
> guarantee that tomorrow's won't.

I think that would violate the standard.  It would create a situation
in which a program would not be able to detect a change which it
should be able to detect.

For what it's worth, gcc's documentation clearly states that type
punning using a union is not affected by aliasing.  Search for
-fstrict-aliasing here:
    http://gcc.gnu.org/onlinedocs/gcc-3.3.1/gcc/Optimize-Options.html#Optimize%20Options
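
For illustration, here is a sketch of the kind of union access that
documentation describes.  The names addr_like, addr_pun and
pun_through_union are invented for this example, and addr_like is only
a stand-in for struct in_addr.  The value read back is implementation
dependent in general; the comment about it assumes both members are 32
bits with no padding, as with gcc on the usual targets.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct in_addr. */
struct addr_like { uint32_t s_addr; };

union addr_pun { int32_t i; struct addr_like a; };

/* Stores through one member and reads through the other.  All access
   goes through the union type itself, which is the case the gcc
   documentation says is unaffected by -fstrict-aliasing. */
static int32_t pun_through_union(uint32_t raw)
{
    union addr_pun u;

    u.a.s_addr = raw;  /* write the struct member */
    return u.i;        /* read the same storage as an int32_t;
                          assumes identical size and layout */
}
```

The same documentation warns that taking the address of a member,
casting the pointer, and dereferencing the result is a different
matter and may not do what you expect.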

Ian