Subject: Re: lib/19638: isalpha (3) bug
To: Mike Cheponis <mac@Wireless.Com>
From: Dave Sainty <dave@dtsp.co.nz>
List: netbsd-bugs
Date: 01/03/2003 19:56:16

Mike Cheponis writes:

> > Ah, it must have been updated at some stage more recently.
> 
> Ahhh, OK, but what automated script do I run to get notification that
> man pages have been updated?  I run the auto security update check at
> night, for example...

You can join the source-changes mailing list if you want to track
changes at this level.

> Thanks, I looked at the code and figured out it was a cleverly encoded table
> that has bits set if the code has certain properties (isdigit, isupper,etc).
> 
> HOWEVER, the behavior on NetBSD is still wrong, I maintain.  Here's my
> argument:
> 
> Here's the program again:
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <ctype.h>
> 
> int main(){  int v,c;
> 
>   for (c=0; c <= 0x7ffffff; c++)    v = isalpha(c);
> 
>   return 0;
> }

This is hardly a real-world example though!

> 1)  Here are the OSs that this program works without a problem:
> 
> o Digital UNIX V4.0B  (Rev. 564); Tue Dec 14 15:43:30 EST 1999
> o FreeBSD 4.2-RELEASE #2: Sun Mar  4 12:11:05 PST 2001
> o BSDI BSD/OS 4.0 Kernel #6: Thu Jan 21 12:47:23 PST 1999
> 
> I had access to these machines, nothing else special about them.
> 
> 2) Here are the OSs with a problem:
> 
> o NetBSD 1.6
> o Linux version 2.2.14-12smp (root@porky.devel.redhat.com)
>    (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release))
>    #1 SMP Tue Apr 25 12:58:06 EDT 2000
> o Linux 2.0.x - (quite old Linux Suse release)
> o Minix 2.0.3
>
> ==> Argument First: NetBSD's behavior should be like the BSDs, not like
>     Linux.

I think you'll find that they are all behaving the same.  That is,
they are all behaving in an undefined manner.  For some, that doesn't
involve crashing (but silently not crashing is arguably a less
appropriate response than crashing, since it hides the bug).
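
A minimal sketch of why the outcome varies, assuming a table-driven
implementation like the one described in the quoted text above.  This
is not the NetBSD source; the sk_ names, the bit values and the
assumption that EOF is -1 are all made up for illustration.

```c
#include <limits.h>

/* Hypothetical property bits; a real libc uses its own layout. */
#define SK_ALPHA 0x01
#define SK_DIGIT 0x02
#define SK_EOF   (-1)   /* assumed EOF value for this sketch only */

/* One slot for SK_EOF plus one slot per unsigned char value. */
static unsigned char sk_table[1 + UCHAR_MAX + 1];

/* Set a property bit for every character in the given string. */
static void sk_mark(const char *chars, unsigned char bit)
{
    for (; *chars != '\0'; chars++)
        sk_table[1 + (unsigned char)*chars] |= bit;
}

static void sk_init(void)
{
    sk_mark("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", SK_ALPHA);
    sk_mark("0123456789", SK_DIGIT);
}

/*
 * Unchecked lookup: only meaningful for SK_EOF or 0..UCHAR_MAX.
 * Any other argument indexes outside the table, which is undefined
 * behavior; it may crash or may quietly return garbage.
 */
static int sk_isalpha_unchecked(int c)
{
    return sk_table[1 + c] & SK_ALPHA;
}

/* Checked variant: one extra comparison, out-of-range input gives 0. */
static int sk_isalpha_checked(int c)
{
    if (c < SK_EOF || c > UCHAR_MAX)
        return 0;
    return sk_table[1 + c] & SK_ALPHA;
}
```

Whether the unchecked lookup faults depends only on what happens to
sit in memory beyond the table, which would explain why the same test
program "works" on some systems and segfaults on others.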

> ==> Argument Second: Although I'm all for "programming by contract" as a
>     programming discipline, I believe that we need to treat libc as meeting
>     a "higher standard".
>
>     In particular, libc should not violate the Principle of Least
>     Astonishment, which it clearly does here.

Surely it isn't that astonishing that passing arbitrary values in
where a character was expected might cause the program to fail...
That it violates the principle just isn't clear to me at all.

Absolutely, it should be documented, and it was an omission in your
cut of the manual page that it didn't mention the argument limits.
But that's been corrected now...

>     Feeding a routine in libc a perfectly valid int should NOT cause the
>     libc routine to segfault.  That is Bad.

isalpha(c) doesn't take an arbitrary integer; its argument must be a
character value (one representable as an unsigned char) or EOF.  You
aren't feeding in perfectly valid characters...
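
For reference, a short sketch of calling isalpha() within its defined
domain.  The helper name count_alpha is hypothetical; the point is
only the conversion to unsigned char before the call, since plain
char may be signed and a negative value other than EOF is out of
range.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters in a NUL-terminated string. */
static size_t count_alpha(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /*
         * Convert to unsigned char first so the argument is always
         * in the range 0..UCHAR_MAX, even where char is signed.
         */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

A loop counter that runs up to 0x7ffffff, as in the test program,
has no such conversion and leaves the defined range almost
immediately.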

> ==> Argument Third: I don't care if I see bat-out-of-hell performance if
>     the result is wrong.  (The suggested mod, above, I believe would not
>     noticeably slow down any but the most pathological programs.)

The result isn't wrong, it's perfectly valid.  You'll see the same
result in many C library interfaces (try passing "perfectly valid
pointers" into fread() and your program will crash too).

Only pathological programs are going to trip over this, and in every
case that program is buggy and needs to be fixed.  You're abusing the
interface.  On a considerable proportion of systems that will result
in a crash for the ctype.h functions.

Cheers,

Dave