tech-misc archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: alx-0008 - Standardize strtoi(3) and strtou(3) from NetBSD



Hi Paul,

On Thu, Mar 20, 2025 at 12:03:24AM -0700, Paul Eggert wrote:
> > How do you get a uintmax_t?
> 
> With a different function a2u that comes with a different struct (just like
> strtoi/strtou).

That's a problem.  For fundamental types it's clear what to use, but
say I'm parsing a time_t; I need a macro that calls the signed or the
unsigned variant as needed.

And if I need to declare a variable of a structure type where the
output will be written, I need to know which type it will be.  A union
could work, but then I need to know which member of the union I should
read.  It's not that easy.

> > Also, how do I perform range checks in that call?  I need to specify
> > min and max limits.
> 
> The caller should do the range checks, as this makes C code easier to
> follow, both by the human programmer and the compiler, which can more easily
> use the range information to generate better code.
> 
> For cases like these having a function do the range checks is often a
> mistake - it doesn't significantly increase reliability (on the contrary, it
> can reduce it). But if you prefer a function with range checks, it's easy to
> add bounds arguments.

That's where I wanted to arrive.  That's not true in this case.  I found
several bugs in groff in range checks after strtol(3), because it's
really hard to check for them.  Consider the following code, and focus
only on the range checks (I'll ignore handling other errors).

	int min, max;

	...
	max = cond ? some_value : INT_MAX;
	...

	errno = 0;
	n = strtol(s, NULL, 0);
	if (n > max)
		goto too_high;

It looks reasonable, right?  Well, if max == INT_MAX == LONG_MAX, the
test above will never trigger.  The correct check is the following:


	errno = 0;
	n = strtol(s, NULL, 0);
	if (errno == ERANGE && n == LONG_MAX || n > max)
		goto too_high;
	if (errno == ERANGE && n == LONG_MIN || n < min)
		goto too_low;

Which not many programmers write (at least, I've seen this bug in
several projects.  Since this API does internally perform range checks
and clamping, it should do them fully.

	n = strtoi(s, NULL, 0, min, max, &status);
	if (status == ERANGE && n == max)
		goto too_high;
	if (status == ERANGE && n == min)
		goto too_low;

which is objectively simpler and less error-prone than with strtol(3),
and IMO, more readable.

> > How do you know how much has been parsed on error?
> 
> Good point. I guess we'll need three elements to the structure, then.
> 
> At this point we're bikeshedding but you can have the last word if you like.

Not really.  My point is that an old and time-tested API is better than
a new invention, unless there are significant flaws in it.  strtoi(3)
has passed the test of time, and is free of the many flaws of strtol(3).
I would have designed it differently, but I don't think our concerns are
enough to justify an invention that might be worse in retrospective.

And considering that strtol(3) is really broken, I'd go for strtoi(3).


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

Attachment: signature.asc
Description: PGP signature



Home | Main Index | Thread Index | Old Index