tech-userlevel archive


Re: [PATCH] Fix printf(1) for integer larger than INTMAX_MAX



On 2020/10/25 0:26, Robert Elz wrote:
     Date:        Sat, 24 Oct 2020 21:40:44 +0900
     From:        Rin Okuyama <rokuyama.rk%gmail.com@localhost>
     Message-ID:  <e1dae284-3666-c343-ef93-be463a6eec32%gmail.com@localhost>

   | However, this result apparently depends on the width of intmax_t. Is
   | this behavior acceptable under POSIX?

Two things:

This from POSIX (actually from a draft that will, perhaps modified, become
the next revision of the standard):

	If an argument operand cannot be completely converted into an
	internal value appropriate to the corresponding conversion
	specification, a diagnostic message shall be written to standard
	error and the utility shall not exit with a zero exit status, but
	shall continue processing any remaining operands and shall write
	the value accumulated at the time the error was detected to standard
	output.

and, on the same point, somewhat later:

	If an argument cannot be parsed correctly for the corresponding
	conversion specification, the printf utility is required to report
	an error. Thus, overflow and extraneous characters at the end
	of an argument being used for a numeric conversion shall be
	reported as errors.

0xffffc00000000000 is too big for a 64-bit intmax_t; it overflows.
It is a large positive number, not a negative one.

You can print it using %u or %x (even %o if you want), but %d cannot
print it without an error message.
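
To make the signed/unsigned difference concrete, here is a minimal C sketch.
It is not taken from printf(1) or from any patch; the helper names
parse_signed and parse_unsigned are made up for illustration. It only shows
that strtoimax(3) reports ERANGE for this value where strtoumax(3) accepts
it, assuming a 64-bit intmax_t:

```c
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert s as a signed value.
 * Returns false on overflow, on an empty conversion, or on
 * trailing characters.  On overflow strtoimax() clamps the
 * result, so *out then holds INTMAX_MAX or INTMAX_MIN.
 */
bool
parse_signed(const char *s, intmax_t *out)
{
	char *end;

	errno = 0;
	*out = strtoimax(s, &end, 0);	/* base 0 accepts 0x and 0 prefixes */
	return errno != ERANGE && end != s && *end == '\0';
}

/* Hypothetical helper: the same checks, but for an unsigned conversion. */
bool
parse_unsigned(const char *s, uintmax_t *out)
{
	char *end;

	errno = 0;
	*out = strtoumax(s, &end, 0);
	return errno != ERANGE && end != s && *end == '\0';
}
```

With a 64-bit intmax_t, parse_signed("0xffffc00000000000", &v) returns false
and leaves INTMAX_MAX in v, while parse_unsigned() on the same string
succeeds.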

Second, I can find nothing in POSIX (which doesn't mean it isn't there,
it is a large doc) which even hints at what range of integers needs to be
supported by printf(1).   It may even be that strictly we're only supposed
to be able to print an "int" (not a long int, not a long long int, or any
other variety), as the formats for printf(1) don't include any of the
length modifiers that printf(3) supports; all it can print are (signed
or unsigned) integers.   Perhaps anything bigger than 2^31-1 should report
an error.

   | I think intmax_t is 64 bits wide on all platforms that we currently
   | support. But can the test cases (included in the patch above) assume this?

You shouldn't; we'll get something with a 128-bit intmax_t one day.

   | If not, how can we obtain sizeof(intmax_t) from portable shell scripts?

No way I'm aware of.   I'm not even sure that giving a huge integer to
printf, ignoring the error, and looking at what is printed works to
discover what printf can handle.   Current implementations seem to set
the value to the max possible on overflow, but that might actually be
violating the standard.   If we use

	printf '%d\n' 0xffffc0000000000

we get

	1152917106560335872

When one more 0 is added, we get overflow.   A plausible reading of the
standard could be that when that overflow happens, the "value accumulated
at the time the error was detected" could be regarded as that one (115....)
and not 2^63-1, which is what we actually report.   (The 115... value is
what zsh produces; everything else I tested gives the same as us, but for
many of the shells that's not surprising, as they don't all have printf
built in, and are simply executing our /usr/bin/printf).
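
The "value accumulated" reading can be sketched in C as below. This is only
an illustration of that interpretation, not how zsh or our printf is
actually implemented; accumulate_hex is a made-up name, and it handles
unsigned-looking hex input only (no sign, no other bases):

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdbool.h>

/*
 * Hypothetical helper: accumulate the hex digits of s (with an
 * optional 0x prefix) into an intmax_t, one digit at a time.
 * If the next digit would overflow, stop and store the value
 * accumulated so far rather than clamping to INTMAX_MAX.
 * Returns true only if the whole string was converted.
 */
bool
accumulate_hex(const char *s, intmax_t *out)
{
	intmax_t val = 0;

	if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
		s += 2;
	if (!isxdigit((unsigned char)*s)) {
		*out = 0;
		return false;		/* no digits at all */
	}
	for (; isxdigit((unsigned char)*s); s++) {
		int d = isdigit((unsigned char)*s) ? *s - '0' :
		    tolower((unsigned char)*s) - 'a' + 10;

		/* val * 16 + d > INTMAX_MAX, tested without overflowing */
		if (val > (INTMAX_MAX - d) / 16) {
			*out = val;	/* value accumulated before the error */
			return false;
		}
		val = val * 16 + d;
	}
	*out = val;
	return *s == '\0';		/* trailing junk is also an error */
}
```

With a 64-bit intmax_t, feeding this 0xffffc00000000000 fails at the last
digit and leaves 1152917106560335872 in *out, whereas strtoimax(3) would
hand back INTMAX_MAX.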

kre

ps: I will see if I can elicit any answers to the missing parts.   And if you
really want:
	$ printf '%d\n' 0xffffc00000000000
	-70368744177664
use ksh93, that's what it does, but it is unquestionably broken.

I think I understand it well now. Thank you very much for the kind explanation!

I forgot to mention in the previous message: the %[oux] conversions also
fail for numbers of 1<<63 or larger:

$ printf '%x\n' 0xffffc00000000000
printf: 0xffffc00000000000: Result too large or too small
7fffffffffffffff

This is because we convert unsigned integers with strtoimax(3). So I've
updated the patch:

http://www.netbsd.org/~rin/printf_20201025.patch

(1) %[oux] works for numbers of 1<<63 or larger
(2) %d fails with ERANGE for numbers of 1<<63 or larger
    (when sizeof(intmax_t) == 8)
(3) do not assume sizeof(intmax_t) == 8 in ATF tests

Because of (3), (2) is not currently tested in ATF (commented out).
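
One way to check (2) without hard-coding a width is to derive the boundary
value from the type itself. The ATF tests are shell scripts, so the C
fragment below is only a sketch of the idea, not part of the patch; the
helper names format_past_intmax and overflows_intmax are hypothetical:

```c
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: write the decimal form of INTMAX_MAX + 1
 * into buf, whatever the width of intmax_t is.  Returns false if
 * buf is too small.
 */
bool
format_past_intmax(char *buf, size_t len)
{
	uintmax_t v = (uintmax_t)INTMAX_MAX + 1;
	int n = snprintf(buf, len, "%ju", v);

	return n > 0 && (size_t)n < len;
}

/* Hypothetical helper: true if strtoimax() reports ERANGE for s. */
bool
overflows_intmax(const char *s)
{
	errno = 0;
	(void)strtoimax(s, NULL, 10);
	return errno == ERANGE;
}
```

The string produced by format_past_intmax() is the smallest value that a
signed conversion has to reject and an unsigned one has to accept,
independent of sizeof(intmax_t).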

Thanks,
rin

