Subject: Re: lib/19638: isalpha (3) bug
To: Dave Sainty , <martin@duskware.de>
From: Mike Cheponis <mac@Wireless.Com>
List: netbsd-bugs
Date: 01/02/2003 21:44:04
On Fri, 3 Jan 2003, Dave Sainty wrote:

> > Interesting.  Here is my man page:
>
> Ah, it must have been updated at some stage more recently.

Ahhh, OK, but what automated script do I run to get notification that
man pages have been updated?  I run the auto security update check at
night, for example...


> > Still, it seems a gross bug to take an "int" argument and then segfault
> > when the routine sees an argument it doesn't like.
> >
> > It's not the "NetBSD Way".


> It might help to know how it's implemented.  These functions all
> reference directly into an array in memory, sized to handle the range
> -1 .. 255.  So it isn't a coding error in isalpha() that causes it to
> seg fault, it simply relies on the calling program to ensure that the
> argument will not fall off the top or the bottom of the array.
>
> Usually the input stream is 8-bit anyway, so in typical code (so long
> as the value is correctly cast as unsigned), it's impossible to exceed
> the acceptable range anyway.

Thanks, I looked at the code and figured out it was a cleverly encoded table
that has bits set if the character code has certain properties (isdigit,
isupper, etc.).
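
As a rough sketch of that kind of table (the sk_ names and the bit layout
here are made up for illustration and are not taken from the NetBSD source;
a real libc table is built differently and covers more properties):

```c
#include <limits.h>

/* Hypothetical property bits, one per character class. */
enum { SK_UPPER = 0x01, SK_LOWER = 0x02, SK_DIGIT = 0x04 };

/* One slot for EOF (-1) plus one slot per unsigned char value. */
static unsigned char sk_table[1 + UCHAR_MAX + 1];

/* Mark every character in the string s with the given property bit. */
static void sk_mark(const char *s, unsigned char bit)
{
    for (; *s != '\0'; s++)
        sk_table[1 + (unsigned char)*s] |= bit;
}

/* Fill the table once before any lookup. */
static void sk_init(void)
{
    sk_mark("ABCDEFGHIJKLMNOPQRSTUVWXYZ", SK_UPPER);
    sk_mark("abcdefghijklmnopqrstuvwxyz", SK_LOWER);
    sk_mark("0123456789", SK_DIGIT);
}

/*
 * Unchecked lookup: only valid for c in -1 .. UCHAR_MAX.
 * Any other value indexes outside sk_table, which is the
 * failure mode discussed in this thread.
 */
static int sk_isalpha(int c)
{
    return (sk_table[1 + c] & (SK_UPPER | SK_LOWER)) != 0;
}
```

The "1 + c" offset is what lets -1 (EOF) land on slot 0; nothing in the
lookup itself guards against values outside that range.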

HOWEVER, the behavior on NetBSD is still wrong, I maintain.  Here's my
argument:

Here's the program again:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int main(void)
{
  int v, c;

  /* Call isalpha() on every value from 0 through 0x7ffffff. */
  for (c = 0; c <= 0x7ffffff; c++)
    v = isalpha(c);

  return 0;
}


1)  Here are the OSs on which this program runs without a problem:


o Digital UNIX V4.0B  (Rev. 564); Tue Dec 14 15:43:30 EST 1999
o FreeBSD 4.2-RELEASE #2: Sun Mar  4 12:11:05 PST 2001
o BSDI BSD/OS 4.0 Kernel #6: Thu Jan 21 12:47:23 PST 1999

These are simply the machines I had access to; there is nothing else special
about them.

2) Here are the OSs with a problem:

o NetBSD 1.6
o Linux version 2.2.14-12smp (root@porky.devel.redhat.com)
   (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release))
   #1 SMP Tue Apr 25 12:58:06 EDT 2000
o Linux 2.0.x - (quite old Linux Suse release)
o Minix 2.0.3


==> Argument First: NetBSD's behavior should be like the other BSDs, not
    like Linux.

==> Argument Second: Although I'm all for "programming by contract" as a
    programming discipline, I believe that we need to treat libc as meeting
    a "higher standard".

    In particular, libc should not violate the Principle of Least Astonishment,
    which it clearly does here.

    Feeding a routine in libc a perfectly valid int should NOT cause the
    libc routine to segfault.  That is Bad.


Suggested fix:

Add the equivalent of this C line to isalpha(c):

  if (c < 0 || c > 0xff) return 0;  // Cannot be alphabetic
  // this "if" could also be nicely encoded as a bit test against 0xffffff00
  // (assuming a 32-bit int)

  // Else, array bound is OK, index into our 256-byte array
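
A minimal sketch of that check, written as a wrapper around the existing
isalpha() rather than as a change inside libc.  The names checked_isalpha
and checked_isalpha_mask are hypothetical, and the 0xff bound assumes 8-bit
chars:

```c
#include <ctype.h>

/* Range check first, then fall through to the normal table lookup. */
static int checked_isalpha(int c)
{
    if (c < 0 || c > 0xff)
        return 0;               /* cannot be alphabetic */
    return isalpha(c) != 0;
}

/*
 * Same check as a single bit test: any bit set above the low 8
 * means c is negative or greater than 0xff.  Converting a negative
 * int to unsigned is well defined and sets the high bits.
 */
static int checked_isalpha_mask(int c)
{
    if (((unsigned int)c & ~0xffu) != 0)
        return 0;
    return isalpha(c) != 0;
}
```

Both versions return 0 for EOF (-1), which matches what isalpha() is
required to return for EOF anyway.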


==> Argument Third: I don't care if I see bat-out-of-hell performance if
    the result is wrong.  (The suggested mod, above, I believe would not
    noticeably slow down any but the most pathological programs.)


Thanks again,

-Mike