Subject: Re: toolchain/22118: make won't compile with -Wcast-qual -Wstrict-prototypes and more
To: None <woods@weird.com>
From: Ben Harris <bjh21@netbsd.org>
List: tech-toolchain
Date: 07/16/2003 23:32:07
In article <m19cqyL-000B44C@proven.weird.com> you write:
>[ On Wednesday, July 16, 2003 at 02:47:08 (-0400), der Mouse wrote: ]
>> Subject: Re: toolchain/22118: make won't compile with -Wcast-qual -Wstrict-prototypes and more
>>
>> >> Using `char foo[] = "string"' does de-const the string.
>> 
>> Not really; it was never const to begin with, because the double-quoted
>> text is not a string constant in the usual sense; it is an initializer,
>> an abbreviation for { 's', 't', 'r', 'i', 'n', 'g', '\0' } (modulo the
>> removal of the trailing '\0' if the string's length exactly matches the
>> declared size of the array).
>
>I beg to differ, as do Harbison and Steele, page 30, section 2.7.4.

ISO/IEC 9899:1990 (section 6.5.7) says:

# An array of character type may be initialized by a character string
# literal, optionally enclosed in braces.  Successive characters of the
# character string literal (including the terminating null character if
# there is room or if the array is of unknown size) initialize the elements
# of the array.

If this isn't clear enough, example 7 in that section says:

# The declaration
#       char s[] = "abc", t[3] = "abc";
# defines "plain" char array objects s and t whose elements are initialized
# with character string literals.  This declaration is identical to
#       char s[] = { 'a', 'b', 'c', '\0' },
#            t[] = { 'a', 'b', 'c' };

which pretty clearly agrees with der Mouse.
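
As a rough illustration of that example, here is a minimal sketch that
compares the two forms.  The function name is made up for this message, and
I am assuming a hosted implementation (for <assert.h> and <string.h>).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, named for illustration only.
 * Returns 1 if the string-literal form and the brace form of each
 * initializer produce arrays of the same size and contents. */
int initializer_forms_match(void)
{
    char s[] = "abc", t[3] = "abc";
    char s2[] = { 'a', 'b', 'c', '\0' };
    char t2[] = { 'a', 'b', 'c' };

    assert(sizeof s == 4);   /* unknown size: the '\0' is included */
    assert(sizeof t == 3);   /* no room for the '\0', so it is dropped */

    return sizeof s == sizeof s2
        && sizeof t == sizeof t2
        && memcmp(s, s2, sizeof s) == 0
        && memcmp(t, t2, sizeof t) == 0;
}
```

Note that t is not null-terminated, so it must not be passed to functions
such as strlen().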

>Secondly I read nothing in the above which implies in any way that the
>storage for the string constant used as an array initializer cannot ever
>be read-only.

From the point of view of the C program, initializers don't have storage at
all, since there's no way to take their addresses, and no way to assign to
them.  Only the things that they initialise have storage, and those are
writable if they're not declared "const".
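
To make that concrete, a small sketch (the function name is hypothetical,
and a hosted implementation is assumed): the array is the only object
involved, and writing to it is well defined.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: modify an array that was initialized from a
 * string literal.  Returns 1 if the modification took effect. */
int modify_initialized_array(void)
{
    char foo[] = "string";   /* foo is an ordinary array of 7 char */

    assert(sizeof foo == 7);
    foo[0] = 'S';            /* modifies foo itself, not any string literal */

    return strcmp(foo, "String") == 0;
}
```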

>Indeed the other quote you made of K&R confirms that
>string constants may be read-only.

String constants used in expressions are read-only to the extent that
attempting to modify them produces undefined behaviour (ISO/IEC 9899:1990
section 6.1.4).

>I suppose we still need to ask someone with a copy of the final ISO C
>standard,

I've got a copy of BS EN 29899:1993 Issue 2, which is equivalent to ISO/IEC
9899:1990 plus ISO/IEC Technical Corrigendum 1 : 1995.

>but I'm reasonably confident that GCC is allowing non-portable
>code to bypass 'const' warning when it de-const-ifies string constants
>used as array initializers.

As far as I can tell, ISO/IEC 9899:1990 (with TC1) only requires diagnostics
for "violation[s] of any syntax rule or constraint" (section 5.1.1.3).  I
can't find any relevant constraints under sections 6.1.4 (string literals),
6.5.3 (type qualifiers), or 6.5.7 (initialization).

>Note also that there is no distinction in Standard C (though there seems
>to be in GCC's implementation) between a definition in the form "char
>foo[]" and a definition in the form "char *foo".  Both forms define a
>pointer to a char.

Neither is a definition, and without knowing what context they're in, it's
impossible for me to tell whether you're correct.  However, it's certainly
the case that they're not equivalent in all cases.  As ISO/IEC 9899:1990
says (example 2, section 6.5.4.2):

# Note the distinction between the declarations
#      extern int *x;
#      extern int y[];
# The first declares x to be a pointer to int; the second declares y to be
# an array of int of unspecified size (an incomplete type), the storage for
# which is declared elsewhere.

>An array name is merely treated as a pointer to the
>first element of the array

This is true only when it's being used in an expression other than as "the
operand of the sizeof operator or the unary & operator" (section 6.2.2.1).
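
A sketch of those two exceptions, with names invented for the purpose and a
hosted implementation assumed: under sizeof and unary & the array keeps its
array type, while elsewhere it is converted to a pointer to its first
element.

```c
#include <assert.h>

/* Hypothetical demo of where an array name is and is not converted
 * to a pointer.  Returns 1 if the expected relationships hold. */
int array_is_not_pointer(void)
{
    char foo[] = "string";      /* array of 7 char */
    char *bar = foo;            /* ordinary use: converted to &foo[0] */
    char (*whole)[7] = &foo;    /* operand of &: pointer to the whole array */

    assert(sizeof foo == 7);    /* operand of sizeof: size of the array */
    assert(sizeof *whole == 7);

    return bar == &foo[0]
        && (void *)whole == (void *)bar   /* same address, different type */
        && sizeof bar == sizeof(char *);  /* bar really is just a pointer */
}
```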

>When that initializer is a
>string constant then the storage for the array is const-qualified since
>the string constant may be read-only and if the compiler does implement
>read-only string-constants then it _MUST_ warn that the implied "const"
>qualifier is being ignored in the initialization statement.  Also two
>separate char array variables which are initialized with an identical
>string constant _may_ point to the same storage address.

You're wrong on all counts, I'm afraid, at least as far as ISO/IEC 9899:1990
is concerned.
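
On the last point in particular, here is a minimal sketch (hypothetical
function name, hosted implementation assumed).  Two arrays are two distinct
objects, whatever they were initialized with, so changing one cannot affect
the other.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: two arrays initialized from identical string
 * literals.  Returns 1 if they behave as separate objects. */
int arrays_are_distinct(void)
{
    char a[] = "same";
    char b[] = "same";

    assert(&a[0] != &b[0]);   /* distinct objects have distinct addresses */
    a[0] = 'S';               /* must leave b untouched */

    return strcmp(a, "Same") == 0 && strcmp(b, "same") == 0;
}
```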

>> Note in particular that a string's type does not involve const; that is
>> a gccism, occurring only in the presence of -Wwrite-strings.
>
>Again, I beg to differ.  All sources I've found are quite explicit in
>saying that an implementation _MAY_ use read-only storage for string
>constants

True, though in ISO/IEC 9899:1990 this is phrased in terms of requirements
on programs:

# If the program attempts to modify a string literal of either form, the
# behaviour is undefined.

> and as such they must _always_ be const-qualified.

This isn't true.  A string literal forms an array of static storage duration
with elements of type char or wchar_t.  const isn't mentioned in the section
on string literals at all.
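
As a sketch of what that means in practice (the function name is invented,
and I am assuming a compiler in its standard C mode, without extensions such
as gcc's -Wwrite-strings changing the type): a string literal can be
assigned to a plain "char *" without a cast, even though writing through
that pointer would be undefined.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: a string literal used in an expression has type
 * "array of char", with no const involved.  Returns 1 on success. */
int literal_is_plain_char_array(void)
{
    char *p = "abc";   /* no cast needed: char[4] converted to char * */

    assert(sizeof "abc" == 4);   /* three characters plus the '\0' */

    /* Reading through p is fine; writing through it would be undefined. */
    return strlen(p) == 3 && p[3] == '\0';
}
```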

-- 
Ben Harris                                                   <bjh21@netbsd.org>
Portmaster, NetBSD/acorn26           <URL:http://www.netbsd.org/Ports/acorn26/>