Re: lib/34516: size_t should be equivalent to unsigned long

To: gnats-bugs%NetBSD.org@localhost
Subject: Re: lib/34516: size_t should be equivalent to unsigned long
From: David Holland <dholland-bugs%netbsd.org@localhost>
Date: Mon, 25 Feb 2008 02:17:09 +0000

On Sun, Jan 20, 2008 at 10:45:05AM +0000, Christian Biere wrote:
 >  David Holland wrote:
 >  > Subject: Re: lib/34516: size_t should be equivalent to unsigned long
 >  
 >  >  size_t is not, however, the *same* as unsigned long. Code that
 >  >  indiscriminately mixes size_t with unsigned long is no better than
 >  >  code that indiscriminately mixes size_t with unsigned int; it is not
 >  >  portable.
 >  
 >  Code which incorrectly uses unsigned long instead of size_t is
 >  comparatively rare and it's unlikely to happen by accident. Abuse of
 >  int instead of proper use of size_t is wide-spread and the result of
 >  implicit integer promotion is very often int.

But because int is signed, and size_t is unsigned, mixing them will
generate warnings or errors regardless of whether size_t is unsigned
int or unsigned long. So this is a red herring.
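
To make the signedness point concrete, here is a minimal sketch (the
function name is hypothetical, not taken from the report). It writes
out the conversion that a comparison between int and size_t performs
implicitly; the result is the same whether size_t is unsigned int or
unsigned long, assuming size_t is at least as wide as int.

```c
#include <stddef.h>

/* Hypothetical helper: spells out the conversion that "a < b" performs
   implicitly when a is int and b is size_t. The cast is explicit so the
   example compiles without sign-comparison warnings. */
int less_than_after_conversion(int a, size_t b)
{
    /* A negative a becomes a very large unsigned value here. */
    return (size_t)a < b;
}
```

With this, less_than_after_conversion(-1, 1) yields 0, because -1
converts to the largest size_t value before the comparison.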

The real question is which belief is more common: that size_t can be
freely mixed with unsigned int, or that it can be freely mixed with
unsigned long. If anything, I would say far more people believe the
latter. This is partly because it's true (at least as far as size is
concerned) on all common platforms, and partly because it's an
explicit assumption and/or guarantee in Linux-land.

Therefore, where the choice exists, making size_t unsigned int rather
than unsigned long will cause the compiler to reject, if anything,
*more* incorrect code rather than less.

 >  > C does not support newtypes, so there is no way to avoid masking some
 >  > such set of coding errors.
 >  
 >  The conclusion is incorrect.

Hardly. size_t *must* be the same as some unsigned integer type, and
there is therefore no way to persuade the compiler to distinguish it
from that specific type, whatever type that is.

That is, if I have a function foo(size_t *), and size_t is unsigned
long, I can write "unsigned long x; foo(&x);" and the compiler will
not complain. If size_t is unsigned int, I can write "unsigned x;
foo(&x);" and the compiler will not complain.
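
As a rough illustration of this, the sketch below uses C11 _Generic
(which postdates this discussion, so treat it as an assumption about
the toolchain) to report which standard unsigned type size_t is
identical to on the build platform. The macro and function names are
hypothetical; foo() stands in for the function described above.

```c
#include <stddef.h>

/* Hypothetical macro: selects a name based on the exact type of x.
   size_t is a typedef, so it matches whichever branch names the type
   it was defined as. */
#define SIZE_T_UNDERLYING(x) _Generic((x), \
    unsigned int: "unsigned int", \
    unsigned long: "unsigned long", \
    unsigned long long: "unsigned long long", \
    default: "other")

/* Returns the name of the type size_t is identical to here. */
const char *size_t_underlying_name(void)
{
    size_t probe = 0;
    return SIZE_T_UNDERLYING(probe);
}

/* Stand-in for foo(size_t *): stores a value through the pointer. */
void foo(size_t *out)
{
    *out = sizeof(size_t);
}
```

Whatever name size_t_underlying_name() returns is the one type whose
pointers can be passed to foo() without a diagnostic on that platform.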

The best defense, if anything, is to compile your code on multiple
platforms with different definitions of size_t.

Note that mistakes like

   size_t somefunc(void);
   int x = somefunc();  /* loses data on 64-bit platforms */

can't be caught at all with gcc regardless of one's choice of
typedefs. (Catching these without generating tons of bogus warnings or
requiring lots of undesirable typecasts needs some fairly
sophisticated program analysis that's the domain of a program
verifier, not of a compiler.)
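
Since the compiler cannot be relied on here, one option is to check
the range at run time before narrowing. The following is only a
sketch; size_to_int is a hypothetical helper, and it assumes size_t is
at least as wide as int.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts a size_t to int only when the value
   fits. Returns 0 on success and stores the value in *out; returns -1
   and leaves *out untouched when n exceeds INT_MAX.
   Assumes size_t is at least as wide as int. */
int size_to_int(size_t n, int *out)
{
    if (n > (size_t)INT_MAX)
        return -1;
    *out = (int)n;
    return 0;
}
```

A caller would write "if (size_to_int(somefunc(), &x) != 0)" and
handle the overflow case explicitly instead of silently losing data.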

 >  I provided some examples in the initial report.

Your initial report contains code that mixes pointers to int and
pointers to size_t; this generates compiler diagnostics on any
conforming implementation.

Part of the problem appears to be that (some versions of?) gcc 3
silently allows mixing incompatible pointer types. This is (was?) a
fairly serious gcc bug, but it no longer seems to be an issue in gcc 4.

 >  >  Furthermore, the precise expansion of size_t is gcc's choice, because
 >  >  gcc "knows" the type signatures of various standard functions, so we
 >  >  can't or shouldn't change it in NetBSD even if there were a good
 >  >  argument in favor.
 >  
 >  It's not so much GCC's choice because other 32-bit targets don't have
 >  size_t as unsigned int. It's simply a question of editing machine/ansi.h.

It is gcc's choice, because gcc knows what the type signatures of
various standard functions involving size_t are, in terms of real
types. mrg says we can change what gcc believes easily enough, though,
and he'd know, so this part doesn't matter.
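
One visible trace of the compiler's own idea of size_t is the
predefined macro __SIZE_TYPE__, which gcc (and clang) expand to the
type they use internally. The sketch below compares it with the
header's size_t; the typedef and function names are hypothetical, and
the fallback branch exists only so the example compiles with other
compilers.

```c
#include <stddef.h>

/* __SIZE_TYPE__ is predefined by gcc and clang and names the type the
   compiler itself uses for size_t. */
#ifdef __SIZE_TYPE__
typedef __SIZE_TYPE__ compiler_size_t;  /* hypothetical typedef name */
#else
typedef size_t compiler_size_t;         /* fallback for other compilers */
#endif

/* Returns 1 when the compiler's size type and the header's size_t have
   the same width. Equal width alone does not prove they are the same
   type, so this is a necessary check, not a sufficient one. */
int size_types_agree(void)
{
    return sizeof(compiler_size_t) == sizeof(size_t);
}
```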

I remain unconvinced that there's any reason to change, and in fact
there seems to be some reason to think that keeping the current
behavior (size_t == unsigned int) is more desirable than changing
would be.

-- 
David A. Holland
dholland%netbsd.org@localhost

Follow-Ups:
- Re: lib/34516: size_t should be equivalent to unsigned long
  - From: Christos Zoulas

References:
- Re: lib/34516: size_t should be equivalent to unsigned long
  - From: Christian Biere
