tech-userlevel archive


Re: tolower()/islower() and char



In article <20210114111428.GA1437%antioche.eu.org@localhost>,
Manuel Bouyer  <bouyer%antioche.eu.org@localhost> wrote:
>Hello,
>In xentools, we have patches like
>-            if (tolower(*s) != tolower(*se))
>+            if (tolower((unsigned char)*s) != tolower((unsigned char)*se))
>
>(s and se being char*)
>
>This is to fix «array subscript has type 'char' [-Werror=char-subscripts]»
>
>I submitted this to Xen, and a developer asks:
>> Isn't this something that wants changing in your ctype.h instead?
>> the functions (or macros), as per the C standard, ought to accept
>> plain char aiui, and if they use the input as an array subscript,
>> it should be their implementation suitably converting type first.
>
>Any comment about this ? I'm not familiar with these details ...

Reply that the developer needs to "Read The Fine Manual".

The cast is not there merely to silence "array subscript has type
'char'". That warning is a happy accident that we keep by design,
because it flags incorrect code. The argument domain of the ctype(3)
functions, documented in:

    https://pubs.opengroup.org/onlinepubs/9699919799/functions/isalpha.html

is stated as:

    The c argument is an int, the value of which the application
    shall ensure is representable as an unsigned char or equal to
    the value of the macro EOF. If the argument has any other value,
    the behavior is undefined.

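Undefined behavior is bad. Passing a plain "char" to these functions
can lead to undefined behavior: where "char" is signed, a byte with
the high bit set converts to a negative int, which is neither
representable as an unsigned char nor, in general, equal to EOF.
Casting to "unsigned char" in the headers is not the right solution,
and that is spelled out above: "the application shall ensure".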
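
As a rough illustration of that domain, here is a minimal sketch. The
helper name in_ctype_domain() is made up for this example and is not
part of any libc; it only restates the POSIX wording as a predicate.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a libc function): returns nonzero if c is a
 * valid argument for the ctype(3) functions, i.e. it is EOF or it is
 * representable as an unsigned char.
 */
int
in_ctype_domain(int c)
{
	return c == EOF || (c >= 0 && c <= UCHAR_MAX);
}
```

With this, in_ctype_domain((unsigned char)'\xe9') is true everywhere,
while in_ctype_domain('\xe9') is typically false on platforms where
plain char is signed, because the value arrives as a negative int.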

Casting to "unsigned char" at the call site is the simplest fix.
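
For example, a sketch of a caller doing the cast itself, in the spirit
of the xentools patch quoted above. The function name ci_equal() is an
assumption made for illustration; only the tolower() calls with the
cast come from the patch.

```c
#include <ctype.h>

/*
 * Hypothetical helper: case-insensitive equality of two C strings.
 * Each char is cast to unsigned char before it reaches tolower(), so
 * the argument is always within the domain required by ctype(3).
 */
int
ci_equal(const char *s, const char *se)
{
	while (*s != '\0' && *se != '\0') {
		if (tolower((unsigned char)*s) != tolower((unsigned char)*se))
			return 0;
		s++;
		se++;
	}
	/* Equal only if both strings ended at the same position. */
	return *s == *se;
}
```

Here ci_equal("Xen", "xEN") returns 1 and ci_equal("Xen", "Xe")
returns 0, and bytes with the high bit set are handled without
undefined behavior whether plain char is signed or not.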

Best,

christos


