Subject: printf(3) portability proposal
To: None <tech-toolchain@netbsd.org>
From: Ross Harvey <ross@ghs.com>
List: tech-toolchain
Date: 07/05/1999 12:45:41
[ more of a draft proposal, actually, it's light on actual details ]

The IPv6 code is the latest example of portability problems involving printf
on size_t and ssize_t values, and the same general problem occurs with
pointers, off_t, and others.

(The problem, if anyone doesn't know, is that these types are int or
unsigned int on some ports, long or unsigned long on others, and long long
on still others.  There is no printf specifier for size_t, et al, so no
single format string prints them without a warning on every port, and that
warning is fatal when -Werror is enabled.)

The historical fix has been to cast values up to the largest common type
that ILP32 and LP64 implementations define.  Because long long == long
on LP64, and long == int on ILP32, IIRC the cast never generates any code
or changes the object in any way other than its type.
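
A minimal sketch of the cast workaround described above.  The helper names
format_size() and format_off() are made up for this example, and it assumes
an ILP32 or LP64 target with a POSIX <sys/types.h>:

```c
#include <sys/types.h>

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: print a size_t by casting up to unsigned long,
 * which is at least as wide as size_t on ILP32 and LP64.  (Assumption:
 * this does not hold on an LLP64/P64 target, where long stays 32 bits
 * while size_t is 64 bits.)
 */
static int
format_size(char *buf, size_t buflen, size_t n)
{
	int len = snprintf(buf, buflen, "%lu", (unsigned long)n);

	assert(len >= 0);
	return len;
}

/*
 * Hypothetical helper: print an off_t by casting up to long long, which
 * is at least 64 bits wide.
 */
static int
format_off(char *buf, size_t buflen, off_t off)
{
	int len = snprintf(buf, buflen, "%lld", (long long)off);

	assert(len >= 0);
	return len;
}
```

Every call site has to repeat both the cast and the matching length
modifier, which is where the built-in knowledge of type sizes creeps in.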

But the casts are kind of a hack: they build in knowledge of how type sizes
happen to line up on present-day architectures, and they _could_ involve
pointless code generation if long long ever became 128 bits somewhere.
They might also bomb in an unwise but conceivable non-ILP32, non-LP64
scenario. (Can't happen, you think?  Leave it to *#$&@!* Microsoft to come
up with P64, which they call LLP64 to make it sound less awful.)

There are two alternatives:

1. C9X drafts contain new printf(3) format specifiers for size_t/ssize_t
   and ptrdiff_t/(equivalent unsigned integer).

2. C9X drafts contain format string macros for various types. These expand
   to string literals holding just the length modifier and conversion
   character, so, e.g. in a C9X world, you could do:
	uintmax_t i = UINTMAX_MAX;
	printf("The largest int value is %020" PRIxMAX "\n", i);
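
For comparison, a sketch of what alternative 1 could look like at a call
site.  It assumes the draft specifiers take the form of the `z` (size_t)
and `t` (ptrdiff_t) length modifiers, and that both the C library and the
compiler's format checking understand them; the function names are made up
for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print a size_t with no cast, using the z modifier. */
static int
format_size_c9x(char *buf, size_t buflen, size_t n)
{
	int len = snprintf(buf, buflen, "%zu", n);

	assert(len >= 0);
	return len;
}

/* Hypothetical helper: print a ptrdiff_t with no cast, using the t modifier. */
static int
format_ptrdiff_c9x(char *buf, size_t buflen, ptrdiff_t d)
{
	int len = snprintf(buf, buflen, "%td", d);

	assert(len >= 0);
	return len;
}
```

No cast appears at the call site, so nothing about the port's type sizes
leaks into the source.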

In order to do #1, both the compiler and library must be hacked.  (I
volunteer, other volunteers welcome.) To do #2, we just need some new header
definitions. (String literal concatenation is a C89 feature, so nothing
else is needed.) I would propose we define format specifier strings for all
the problematic system types.
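
A rough sketch of what such a header definition might look like for size_t.
The macro name PRIuSIZE is invented here, not taken from any draft, and
choosing the string by comparing limits is only an approximation (it assumes
SIZE_MAX from the C9X <stdint.h>); a real per-port header would hard-code
the string, since it knows the exact underlying type:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical macro: length modifier plus conversion character for an
 * unsigned decimal size_t.  Comparing limits cannot tell unsigned int from
 * unsigned long when both are 32 bits, so a port whose size_t is unsigned
 * long on ILP32 would define this as "lu" directly instead.
 */
#if SIZE_MAX == UINT_MAX
#define PRIuSIZE "u"
#elif SIZE_MAX == ULONG_MAX
#define PRIuSIZE "lu"
#else
#define PRIuSIZE "llu"
#endif

/* Hypothetical helper showing the macro spliced into a format string. */
static int
format_size_macro(char *buf, size_t buflen, size_t n)
{
	int len = snprintf(buf, buflen, "size=%" PRIuSIZE, n);

	assert(len >= 0);
	return len;
}
```

The call site stays free of casts, and only the header has to change when a
new port picks a different underlying type.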

I believe both #1 and #2 should happen.  #1 is way more convenient, and
has to be done eventually, by someone, unless the features fall out of C9X.
#2 is more extensible and leverages standards instead of modifying them
with local printf extensions.

	Ross.Harvey@Computer.Org