Subject: Re: bin/21645: Localized comments and indent(1)
To: None <netbsd-bugs@netbsd.org>
From: David Laight <david@l8s.co.uk>
List: netbsd-bugs
Date: 06/04/2003 19:53:22
> >  > >Synopsis:       indent(1) doesn't handle non English characters in comments
> > 
> >  
> >  > -		if (*buf_ptr > 040 && *buf_ptr != '*')
> >  > +		if (!iscntrl(*buf_ptr) && *buf_ptr != '*')
> >  
> >  This, and other parts of this patch, are not correct because *buf_ptr
> >  is a signed char but the domain of iscntrl() (and the other isxxx()
> >  functions) is -1..255 not -128..127
> 
> Hmm... This is correct, but all isXXXX() functions declared at
> include/ctype.h as C preprocessor definitions in the following
> manner:
> 
>     #define isXXXX(c)  ((int)((_ctype_ + 1)[(int)(c)] & (CTYPE_FLAGS)))
> 
> As you can see, the given argument will be explicitly converted to
> signed int, which is enough.

The (int) cast only gets rid of a warning generated by some compilers.

The problem is that if 'c' is a signed char, then it is promoted to
a signed integer by preserving the sign, so any byte with the top bit
set becomes a negative value.  This does not DTRT for any of the
isXXX() functions.
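
To make the promotion concrete, here is a minimal sketch.  The helper
names are made up for illustration (they are not from indent(1) or
ctype.h), and the numbers in the comments assume 8-bit chars:

```c
/* Hypothetical helper: the value an isXXX() macro ends up indexing with
 * when it is handed a signed char.  The conversion to int preserves the
 * sign, so byte 0xE9 stored in a signed char arrives as -23. */
int promote_signed(signed char c)
{
    return (int)c;
}

/* Hypothetical helper: the same byte converted to unsigned char first.
 * The result is always in 0..UCHAR_MAX, so 0xE9 arrives as 233. */
int promote_unsigned(signed char c)
{
    return (int)(unsigned char)c;
}
```

Only the second value is inside the -1..255 domain for every possible
byte; the first indexes before the start of the table for anything
other than -1.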

Basically almost every use of these functions is broken - unless
specific (and sometimes ugly) steps are taken.  Options include:
	isXXX((unsigned char)c)
	isXXX(*(unsigned char *)cp)
Trying to make the buffer 'unsigned char' usually has wider consequences
that are not easily overcome.
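
As a rough sketch of the first option applied to a plain 'char' buffer
(count_cntrl is a hypothetical helper written for this mail, not
something in indent(1)):

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the control characters in a plain char
 * buffer.  Each byte is converted to unsigned char before it reaches
 * iscntrl(), so the argument is always within the function's domain,
 * whether or not plain char is signed on this platform. */
size_t count_cntrl(const char *buf, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (iscntrl((unsigned char)buf[i]))
            n++;
    }
    return n;
}
```

The buffer itself stays 'char', so none of the surrounding code needs
to change; only the call site carries the cast.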

It is, perhaps, unfortunate that the pdp11 sign extended byte reads
into its 16bit registers...

	David

-- 
David Laight: david@l8s.co.uk