[CC += Paul]

Hi Robert,

On Wed, Jul 24, 2024 at 01:40:02AM GMT, Robert Elz wrote:
> The following reply was made to PR lib/58461; it has been noted by GNATS.
>
> From: Robert Elz <kre%munnari.OZ.AU@localhost>
> To: gnats-bugs%netbsd.org@localhost
> Cc:
> Subject: Re: lib/58461: strtoi(3) is not portable
> Date: Wed, 24 Jul 2024 08:38:28 +0700
>
> The supplied patch is incorrect for NetBSD (and anywhere) as
> when EINVAL is returned from strtoimax() (in the POSIX mandatory case,
> to indicate an invalid base, as opposed to the POSIX optional case,
> to indicate that no digits were converted) the value of *endptr is
> unspecified, and hence cannot be compared to anything (and as it
> happens, on NetBSD it gets set to nptr and so expecting it to remain
> at any initial value is never going to work).

Ughh, while it is true that it is unspecified, I hadn't known any
implementation that didn't leave it untouched.  You're right that this
patch won't work on NetBSD.

This makes my fears come true, actually.  I was wishing that I was wrong
back then...

I documented in the Linux man-pages that it is impossible to portably
detect an invalid base _after_ a call to strtol(3), as you say.  And
thus it only makes sense to validate the base before the call, which
makes the POSIX decision to report EINVAL completely bogus, because a
caller will likely invoke UB if it reads *endptr afterwards (which will
often happen).  And so ISO C's choice of leaving the behavior undefined
on an invalid base was actually better (POSIX probably only codified
implementations' behavior; historic accidents are probably to be blamed
for these discrepancies between strtol(3) implementations).

I was recently hoping that I was wrong and that the only expectable
consequence of POSIX choosing to report EINVAL was that it could be
tested somehow.  And indeed POSIX says something like that in
APPLICATION USAGE.

I see that FreeBSD, OpenBSD, and Bionic libc (Android) also set *endptr
to nptr on an invalid base.
It seems glibc and musl are the weirdos here.  :(

My bad.

Robert, would you mind opening a bug report for POSIX, requesting to
clarify the APPLICATION USAGE paragraph in the following sense?:

	 Since the value of *endptr is unspecified
	 if the value of base is not supported,
	 applications should either ensure that base has a supported value
	 (0 or between 2 and 36) before the call,
	-or check for an [EINVAL] error before examining *endptr.
	+or check for an [EINVAL] error before examining *endptr
	+(but on systems that fail with EINVAL
	+when no conversion could be performed,
	+this is not an option).

or alternatively, for simplicity:

	 Since the value of *endptr is unspecified
	 if the value of base is not supported,
	-applications should either ensure that base has a supported value
	-(0 or between 2 and 36) before the call,
	+applications should ensure that base has a supported value
	+(0 or between 2 and 36) before the call.
	-or check for an [EINVAL] error before examining *endptr.

If you do, please add

	Reported-by: Alejandro Colomar <alx%kernel.org@localhost>

(I don't report it myself because I've always had a lot of trouble
following Austin's processes; I don't know how to report a bug there.)

> I have a different patch, which avoids the POSIX mandatory EINVAL
> completely, by simply validating base before calling strtoimax(),
> hence an EINVAL we do get from that (which now will never happen on
> NetBSD) must be the optional one, so if it happens, we will simply
> ignore that, and continue as if errno==0 had been returned (or more
> correctly for strtoimax(), as if errno had not been altered, which
> is the same thing with the strtoi() code).

Yeah, that was my original approach, which has the downside that if a
system adds support for other bases (e.g., base 1, a.k.a., tally marks)
strtoi(3) would need to be patched because it hardcoded the bases.  But
it is how it is, it seems.

Have a lovely day!
Alex

--
<https://www.alejandro-colomar.es/>
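
For illustration, the validate-first approach described in the quoted
text (check base before calling strtoimax(), then treat any EINVAL that
still comes back as the optional "no digits" report) could look roughly
like the sketch below.  It is not the actual NetBSD patch: the name
sketch_strtoi, the choice to store nptr in *endptr on an invalid base,
the clamping of the return value, and the lo <= hi precondition are all
assumptions made for the example.  The status codes mirror the ones
documented for strtoi(3).

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical name; the interface is modeled on strtoi(3). */
intmax_t
sketch_strtoi(const char *nptr, char **endptr, int base,
    intmax_t lo, intmax_t hi, int *rstatus)
{
	char *e;
	int st, saved_errno;
	intmax_t val;

	assert(lo <= hi);	/* sketch-only precondition (assumption) */

	if (endptr == NULL)
		endptr = &e;
	if (rstatus == NULL)
		rstatus = &st;

	/*
	 * Validate base up front, so strtoimax() is never called with an
	 * unsupported base and *endptr is never left unspecified.
	 */
	if (base != 0 && (base < 2 || base > 36)) {
		*endptr = (char *)nptr;	/* assumption: point at the input */
		*rstatus = EINVAL;
		return (lo > 0) ? lo : (hi < 0) ? hi : 0;
	}

	saved_errno = errno;
	errno = 0;
	val = strtoimax(nptr, endptr, base);
	*rstatus = errno;
	errno = saved_errno;	/* the caller's errno is left untouched */

	/*
	 * base is known to be valid here, so EINVAL can only be the
	 * optional "no digits converted" report; ignore it.
	 */
	if (*rstatus == EINVAL)
		*rstatus = 0;

	/* With a valid base, *endptr == nptr means nothing was converted. */
	if (*rstatus == 0 && *endptr == nptr)
		*rstatus = ECANCELED;

	if (val < lo) {
		if (*rstatus == 0)
			*rstatus = ERANGE;
		return lo;
	}
	if (val > hi) {
		if (*rstatus == 0)
			*rstatus = ERANGE;
		return hi;
	}

	/* Trailing characters after the number. */
	if (*rstatus == 0 && **endptr != '\0')
		*rstatus = ENOTSUP;

	return val;
}
```

The hardcoded range check is exactly the downside mentioned above: if a
libc ever grew support for another base, a wrapper written this way
would reject it until patched.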