Re: char==int? [was Re: using the interfaces in ctype.h]

To: tech-userlevel%NetBSD.org@localhost
Subject: Re: char==int? [was Re: using the interfaces in ctype.h]
From: Alan Barrett <apb%cequrux.com@localhost>
Date: Tue, 22 Apr 2008 09:24:10 +0200

On Mon, 21 Apr 2008, der Mouse wrote:
> >>>   #define _CTYPE_MASK     ~(UINT_MAX << CHAR_BIT)
> >> Hoho, if sizeof (char) == sizeof (int).
> > I believe that that's impossible.  The C standard doesn't explicitly
> > say it's impossible, but the requirement that gets() be able to
> > return EOF as well as returning any possible character strongly
> > implies that sizeof(int) > sizeof(char).

(When I said gets(), I meant getc().  There are also other functions
with a similar interface.)

> (1) What about freestanding implementations?

Good point.  I suppose my reasoning doesn't apply to them.

> (2) Even for hosted implementations, must getc() (I assume you meant
> getc(), not gets()) be able to return any value a char - well, an
> unsigned char - can assume?  (Basically, may the execution character
> set be smaller than the set of values an unsigned char can assume?)

Hmm, I think you have something there.  The implication that char must
be smaller than int always seemed weak to me, being suggested only as a
side effect of the definition of these library functions.  It's quite
possible that there is no such requirement.

Anybody want to search for "execution character set" in the
C99 standard?  (The "N1124" draft is available from
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf>.)

> (3) Are there constraints on how UCHAR_MAX compares to
> CHAR_MAX-CHAR_MIN?  I'm wondering about non-binary machines.

(I assume you meant SCHAR_MAX-SCHAR_MIN, so we are comparing the range
of unsigned char with the range of signed char.)

I believe that section 6.2.6.2 paragraph 2 allows any signed integer type
to have a range that is much smaller than the range of the corresponding
unsigned type, through "padding bits".  Section 6.2.6.1 paragraph 3
requires that unsigned char not have any padding bits.  I don't
see any prohibition against padding bits for signed char.  So it seems
to me that it would be permissible for an implementation to have 32-bit
unsigned chars, and 8-bit signed chars (with 7 value bits, 1 sign bit,
and 24 padding bits, so that signed and unsigned char use the same
amount of storage, as required by section 6.2.5 paragraph 6).

C89 left the representation of signed types to be
implementation-defined, so something like BCD would have been allowed.
C99 section 6.2.6.2 paragraph 2 requires one of three representations
for signed integer types (and signed char is a signed integer
type): sign and magnitude, two's complement, or one's complement.
Even in a system with 8-bit chars and no padding bits, there are two
possibilities for SCHAR_MIN and SCHAR_MAX:  -128 to +127 in a two's
complement system; or -127 to +127 in a one's complement or sign and
magnitude system.

--apb (Alan Barrett)
