Subject: Re: Bug in regex library?
To: None <current-users@NetBSD.ORG>
From: Peter Seebach <seebs@solon.com>
List: current-users
Date: 10/30/1996 08:13:02
>> I'll make a supported one; the integral types are defined to be the
>> character types, short int, int, long int, and unsigned (short, int,
>> long int).  The list is exhaustive.  (6.1.2.5.)

>Exhaustive in the sense that all derived types in the Standard which are
>of integral type must be one of those types.  Nothing prohibits a compiler
>from offering additional types, as long as the Standard's types are, well,
>standard.  (That "long long int" is a syntax error, pure and simple, does
>not change this, it merely means that it can't be used without a diagnostic
>when the compiler is in strict conformance mode.  The use of "__int64" in
>a user program is a user "error" which does not require a diagnostic, hence
>can be silently accepted if the compiler has made use of its liberty to
>supply a symbol in the reserved namespace.  (I'm sure Peter knows this,
>this is for everyone else's dubious benefit :-) ).

Yeah.  The difference between "long long" and "__int64" is that "long long"
*requires* a diagnostic, but "__int64" merely invokes undefined behavior.

The intent was that all integral types used by a compiler be taken from those
in the list; this includes, say, types used by POSIX, because POSIX defers to
C on some of these issues.

>off_t is not an ANSI feature, it is POSIX; it is therefore not required
>to be a listed ANSI integral type.

It is if POSIX says it's an integral type, because POSIX's spec is in
terms of C types, or so it says... Unfortunately, I haven't got a real
POSIX standard yet.  :(

In any event, "long long" breaks most of the guarantees and consistency of
the type system.  It's a real shame the vendors didn't just name it "quad_t"
and use that exclusively; then we'd have a reasonable chance, in C9X, of
having a clean type system.  As it is, we have rules under which, in some
cases, a large positive number divided by a small negative number is
(unsigned long) 0.  (The alternative was rules under which code that worked
under C89 would suddenly break, which was not seen as acceptable.)
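
To make the division oddity concrete, here is a minimal sketch of one way it
can arise.  It assumes the case in question is the usual arithmetic
conversions applied to an unsigned long operand and a negative int, which may
not be the exact rule being complained about above.  The helper name
divide_mixed is made up for illustration.

```c
/* Hypothetical helper, for illustration only: divides an unsigned long
 * by an int.  The usual arithmetic conversions convert `small` to
 * unsigned long before the division, so -1 becomes ULONG_MAX.
 * The caller must not pass small == 0. */
unsigned long divide_mixed(unsigned long big, int small)
{
    return big / small;
}
```

Called as divide_mixed(4000000000UL, -1), this returns 0 rather than anything
resembling -4000000000, because 4000000000 is smaller than ULONG_MAX on any
conforming implementation.  The explicit UL suffix is there so the result
does not depend on how a given compiler types an unsuffixed constant.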

And, of course, we lose all sorts of other things.  @!#*@*.  It would have
been a great thing if more vendors had just bitten the bullet and made long
grow.  (Of course, some code would run more slowly on 32-bit systems, but I'd
*happily* take a couple of years of mildly suboptimal performance on my home
computer if it would mean I won't have to put up with utter cruft in my
fave programming language for 25 years or more.)

-s