NetBSD-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: toupper and warnings



Johnny Billquist <bqt%update.uu.se@localhost> writes:

>> See CAVEATS in ctype(3).
>
> Right. But is gcc really smart enough to understand at compile time if
> something else than -1 is the negative value, and that toupper in fact
> is more limited than what the signature says?
>
> The *signature* of the function is int toupper(int). If you pass a
> char to that, I can't see that there would ever be a warning about any
> problems.

The signature is that it takes an int, but the specification is that if
the value of the int is other than EOF or something representable as
unsigned char (projecting to he implementation, meaning -1 is ok and
0..255 is ok), then you get UB.

The point of UB in specifications is to admit efficient implementations
(or to end arguments about existing implementations).  The NetBSD
implementation uses the value to index an array.   So if you have

   char c, d;
   c = getchar();
   d = toupper(c);

then it is entirely possible to cause an out-of-bounds read.  The
warning is coming from the code in the implementation of ctype -- which
is a macro -- that does this read.  In reading the code that leads to
the warning, it is easy to conclude by inspection that the calling code
is incorrect.  

> The fact that toupper in fact is only defined for -1..255 is hardly
> visible in the signature of the function.

True but I don't tnink that has much to do with this.   We have a
conforming implementation, and the compiler is pointing out a different
kind of UB that results from the function being invoked in a way that
causes UB.


To follow up, adding -Wsystem-headers brought the warning back.   clang
also warns.


To answer Greg Woods, It is not "conservative" to silently accept
UB-causing inputs and turn them into UB array reads.

Another approach would be for NetBSD to adust the code to bounds check
before indexing and do something if the value is out of bounds.  Options
are returning false, returning a random bit, calling abort(3), trying to
send all of the user's files to a random IP addreess, and so on.  All
are allowed by the spec :-)

Probably returning false with an envvar option to abort is best, similar
to how UB from mutex ops is handled.   When we started with abort it
seemed there was a vast amount of incorrect code written and debugged on
other operating systems that allow code specified to hav UB.

Attachment: signature.asc
Description: PGP signature



Home | Main Index | Thread Index | Old Index