tech-userlevel archive


Re: once again, some discussion about <ctype.h> interfaces....



    Date:        Mon, 28 Jan 2013 22:42:57 -0800
    From:        "Greg A. Woods" <woods%planix.ca@localhost>
    Message-ID:  <m1U04uH-004lSZC%more.weird.com@localhost>

Joerg already pointed out that your proposed macros evaluate their args
more than once, which isn't acceptable, but in addition to that ...

  | + * Note that these implementations could also allow removal of the
  | + * (_ctype_+1)[] trick,

Not really.   It is possible that sizeof(c) == 1 && c == EOF
(just "char c; c = EOF; isupper(c)" ...)  so you still need the "trick",
as that is (while non-portable for sure) a valid call on an architecture
with signed chars.
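
A minimal sketch of that case, assuming EOF is -1 (the standard only
requires a negative int) and using an explicit signed char to stand in
for an architecture where plain char is signed. The function name is
hypothetical.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Returns nonzero if EOF stored into a one-byte signed char still
 * compares equal to EOF afterwards. Assumes EOF == -1. */
int eof_survives_signed_char(void)
{
    signed char c = EOF;       /* models "char c; c = EOF;" with signed chars */

    assert(sizeof(c) == 1);    /* the size alone cannot rule out EOF */
    return c == EOF;           /* c is promoted back to int -1 here */
}
```

So a macro that picks its behaviour from sizeof(c) == 1 would still be
handed a value equal to EOF, and the table lookup still has to cope
with index -1.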

Also, is sizeof() defined when handed an expression, rather than a variable,
or type?

That is: isupper(c&0xFF) is perfectly valid, but I am not sure sizeof(c&0xFF)
means anything.
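
For reference, C does define sizeof on an expression: the operand is
not evaluated and the result is the size of the expression's type. The
sketch below (hypothetical function name) shows what that means here:
after the integer promotions, c&0xFF has type int, so a sizeof-based
macro would see an int, not a char.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */

/* Returns the size that sizeof reports for the expression c & 0xFF. */
size_t size_of_masked_char(void)
{
    char c = 'A';

    assert(sizeof(c) == 1);    /* the char object itself is one byte */
    return sizeof(c & 0xFF);   /* the expression has type int */
}
```

In other words sizeof(c&0xFF) is well defined, but it equals
sizeof(int), so it says nothing about where the value came from.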

  | + * Obviously this doesn't help the real libc <ctype.h> functions

However the macros work, the functions must also work.

 -- sign
  | + * extension will still happen when an (signed char) parameter with a negative
  | + * value is passed to one of them, and then they will be stuck with, at
  | + * minimum, an in-ability to distinguish between 0xFF and -1 (EOF)

A signed char (at least with normal 8-bit char types) cannot hold 0xFF;
that value is out of range. If it has that bit pattern, it is -1, and so
is EOF. There is no need to distinguish those cases. What must not be
done is to map EOF into 0xFF and treat it like that, which is why wildly
inserting (unsigned char) casts without thought, and particularly into
the macro definitions themselves, is wrong.
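
A small sketch of what goes wrong, assuming EOF is -1 and chars are
8 bits. BAD_CTYPE_ARG is a hypothetical macro of the kind being argued
against, with the cast buried in the definition; it is not taken from
any real <ctype.h>.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Hypothetical: cast applied blindly inside the macro definition. */
#define BAD_CTYPE_ARG(c) ((int)(unsigned char)(c))

/* Returns what a legitimate EOF argument turns into after the cast. */
int eof_after_cast(void)
{
    int c = EOF;               /* a perfectly valid ctype argument */

    assert(c < 0);             /* EOF is required to be negative */
    return BAD_CTYPE_ARG(c);   /* 0xFF when EOF == -1 and CHAR_BIT == 8 */
}
```

The caller passed EOF, but the classification would be done on 0xFF,
which in some locales is a real character.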

  | + * However, since any potential EOF value should in theory always be dealt with
  | + * in such a way as to avoid ever calling any of the <ctype.h> functions

If you're of my vintage (and I think you probably are, or close enough)
then the correct calling sequence of all these functions is something along
the lines of
        if (isascii(c) && isupper(c)) c = tolower(c);
That is, one never calls the test macros without calling isascii() first,
and never calls the modification macros without testing the appropriateness
of the arg first.
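
The old idiom can be sketched as a function. isascii() is not part of
ISO C, so IS_ASCII_RANGE below is a hypothetical stand-in that accepts
only 0..0x7F; legacy_tolower and its self-check are likewise
illustrative names, and the sketch assumes EOF is -1.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>   /* EOF */

/* Hypothetical stand-in for isascii(): true only for 0..0x7F. */
#define IS_ASCII_RANGE(c) (((c) & ~0x7F) == 0)

/* Guard the test macro with the range check, and the conversion
 * with the test, as in the one-line idiom above. */
int legacy_tolower(int c)
{
    if (IS_ASCII_RANGE(c) && isupper(c))
        c = tolower(c);
    return c;
}

/* Negative values such as EOF never reach isupper() at all. */
void legacy_tolower_selfcheck(void)
{
    assert(legacy_tolower('Q') == 'q');
    assert(legacy_tolower(EOF) == EOF);
}
```

With the guard in place, a sign-extended char or a stray EOF is
rejected before any table lookup happens.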

Unfortunately, needing to do that has been obsolete for a long time now,
so unless your legacy code is very legacy indeed, it probably doesn't
do any of that.

Given that, with code designed to assume chars are signed (which is a lot
of legacy code, given that PDP-11s and VAXen both operated this way)
storing EOF in a char variable (c = getc()) and then testing it with the
ctype macros/functions was quite common.

I think it would be dangerous to assume that no-one ever tests values that
happen to be EOF.

Much better to get warnings about unsafe uses, and then actually go fix the
legacy code to work properly, than hide problems and introduce bugs.
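
One common shape for such a fix, sketched with a hypothetical
count_upper() helper: the (unsigned char) cast goes at the call site,
where the value is known to have come from a char and so cannot be EOF.
(For values read with getc(), the usual fix is instead to keep the
result in an int and compare it against EOF before any ctype call.)

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>  /* size_t, NULL */

/* Counts the uppercase letters in a NUL-terminated string. */
size_t count_upper(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* *s is a char, never EOF, so the cast is safe here and
         * keeps negative chars from being sign extended. */
        if (isupper((unsigned char)*s))
            n++;
    }
    return n;
}
```

The macros themselves are untouched; only the caller, which knows what
kind of value it holds, decides whether a cast is appropriate.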

  | + * Perhaps compilers would be smart to generate a warning whenever a narrower
  | + * signed parameter will be sign extended (if negative) to widen it to match
  | + * the prototype (or the default parameter conversions).

You mean when a short is widened to int, or int to long?   No thanks.
It is perfectly reasonable to use short values to hold signed integers
and pass them to functions expecting integer parameters, as long as the
range of the values is within the capacity of a short.


Just leave the macros alone, fix the ancient broken (or at least, non-portable)
code, and be done with it.

kre


