Subject: What information belongs into man pages?
To: None <tech-userlevel@NetBSD.org>
From: Roland Illig <rillig@NetBSD.org>
List: tech-userlevel
Date: 10/18/2006 22:39:42
--------------040700050207020506090005
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

The recent thread on the isspace(3) function motivated me to write an
explanation of the topic. I would like to include it in the ctype.3 man
page, so that the page looks like the attached one.

Now the questions are:
- What target audience do we have for the man pages?
   - C language lawyers (for whom the explanation wouldn't be necessary)
   - Casual C programmers (who may find the information useful)
   - Beginners (to whom the information is essential)

- Do we want to provide programmer-friendly documentation of this kind,
which tries to prevent bugs and thereby improves code quality?

- Or are the man pages just references, and we don't care much about 
their quality?

Roland

--------------040700050207020506090005
Content-Type: text/plain;
 name="ctype.3"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ctype.3"

.\"	$NetBSD: ctype.3,v 1.14 2006/02/04 18:47:31 wiz Exp $
.\"
.\" Copyright (c) 1991 Regents of the University of California.
.\" All rights reserved.
.\"
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)ctype.3	6.5 (Berkeley) 4/19/91
.\"
.Dd October 18, 2006
.Dt CTYPE 3
.Os
.Sh NAME
.Nm isalpha ,
.Nm isupper ,
.Nm islower ,
.Nm isdigit ,
.Nm isxdigit ,
.Nm isalnum ,
.Nm isspace ,
.Nm ispunct ,
.Nm isprint ,
.Nm isgraph ,
.Nm iscntrl ,
.Nm isblank ,
.Nm isascii ,
.Nm toupper ,
.Nm tolower ,
.Nm toascii
.Nd character classification and mapping functions
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In ctype.h
.Fa int c
.br
.Fn isalpha c
.Fn isupper c
.Fn islower c
.Fn isdigit c
.Fn isxdigit c
.Fn isalnum c
.Fn isspace c
.Fn ispunct c
.Fn isprint c
.Fn isgraph c
.Fn iscntrl c
.Fn isblank c
.Fn isascii c
.Fn toupper c
.Fn tolower c
.Fn toascii c
.Sh DESCRIPTION
The above functions perform character tests and conversions on the integer
.Ar c .
.Pp
See the specific manual pages for more information.
.Sh CAVEATS
The parameter of these functions is of type int, but only a very
restricted subset of the int values is actually valid as an argument.
The argument must either be the value of the macro
.Dv EOF
(which is typically -1)
or be representable as an unsigned char, which typically means the
range [0, 255].
Passing any other value results in undefined behavior.
Passing an expression of type
.Li char
as the argument is wrong, because an expression of type
.Li char
may have a value that is
.Em not
representable as an
.Li unsigned char .
This happens in environments where
.Li char
has the same range as
.Li signed char ,
rather than the same range as
.Li unsigned char .
.Pp
The following code illustrates this, taking
.Fn toupper
as an example for any of the above functions.
.Bd -literal
const char *s = ...;
while (*s != '\\0') {
    /* wrong 1: */ toupper(*s);
    /* wrong 2: */ toupper((int)*s);
    /* wrong 3: */ toupper((unsigned)*s);
    /* correct: */ toupper((unsigned char)*s);
    s++;
}
.Ed
.Pp
The first form is wrong because the value of
.Li *s
may be negative in environments where
.Li char
is a signed data type.
.Pp
The second form is equally wrong because converting a char to an int
preserves negative values.
.Pp
The third form is wrong, too, because converting a small negative number
to an
.Li unsigned int
results in a large unsigned number, as described in the C standard,
section 6.3.1.3, paragraph 2.
.Pp
The fourth form is correct because the character is first converted to an
.Li unsigned char ,
which always results in a valid argument for the
.Fn toupper
function.
After that, it is converted implicitly to an int, preserving the value.
.Sh SEE ALSO
.Xr isalnum 3 ,
.Xr isalpha 3 ,
.Xr isascii 3 ,
.Xr isblank 3 ,
.Xr iscntrl 3 ,
.Xr isdigit 3 ,
.Xr isgraph 3 ,
.Xr islower 3 ,
.Xr isprint 3 ,
.Xr ispunct 3 ,
.Xr isspace 3 ,
.Xr isupper 3 ,
.Xr isxdigit 3 ,
.Xr toascii 3 ,
.Xr tolower 3 ,
.Xr toupper 3 ,
.Xr ascii 7
.Sh STANDARDS
These functions, with the exception of
.Fn isblank ,
.Fn isascii ,
and
.Fn toascii ,
conform to
.St -ansiC .

--------------040700050207020506090005--
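
The four forms discussed in the CAVEATS section can be checked with a
small C sketch. This is only an illustration, not part of the proposed
man page: ctype_arg_ok(), str_toupper() and check_conversions() are
hypothetical helper names, signed char stands in for a platform where
plain char is signed, and long long is assumed to be wider than
unsigned int (true on common platforms).

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>	/* EOF */

/*
 * Hypothetical helper, not part of libc: reports whether v is a valid
 * argument for the <ctype.h> functions, i.e. EOF or a value that is
 * representable as an unsigned char.
 * Assumption: long long is wider than unsigned int.
 */
static int
ctype_arg_ok(long long v)
{
	return v == EOF || (v >= 0 && v <= UCHAR_MAX);
}

/*
 * Hypothetical wrapper: uppercases a string in place, converting each
 * char to unsigned char before it reaches toupper().
 */
static void
str_toupper(char *s)
{
	for (; *s != '\0'; s++)
		*s = (char)toupper((unsigned char)*s);
}

/*
 * Walks through the conversions for one negative value.  signed char
 * is used explicitly to model a platform where plain char is signed.
 */
static void
check_conversions(void)
{
	signed char sc = -28;	/* e.g. 0xE4 in an 8-bit character set */

	/* forms 1 and 2: the value is still -28 after promotion to int */
	assert(!ctype_arg_ok(sc));
	/* form 3: the value becomes UINT_MAX - 27 */
	assert(!ctype_arg_ok((unsigned)sc));
	/* form 4: the value becomes UCHAR_MAX - 27 */
	assert(ctype_arg_ok((unsigned char)sc));
}
```

Calling check_conversions() from a test program exercises all three
assertions. str_toupper() shows how the correct form might be wrapped
once, so that callers cannot forget the cast.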