NetBSD-Bugs archive
Re: bin/47840: awk string comparison of integer constant
The following reply was made to PR bin/47840; it has been noted by GNATS.
From: Valery Ushakov <uwe%stderr.spb.ru@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc:
Subject: Re: bin/47840: awk string comparison of integer constant
Date: Tue, 21 May 2013 03:02:14 +0400
On Mon, May 20, 2013 at 05:30:01 +0000, dholland%eecs.harvard.edu@localhost
wrote:
> Observe the following curious behavior:
>
> macaran% jot 15 1 | awk '{ a[$1] = ($1 < 10); } END { for (k in a) { print k, a[k], (k < 10); }}'
> 2 1 0
[...]
>
> Note that k < 10 is evaluated as a string comparison.
>
> Is this required by some standard? gawk does the same thing, but it
> definitely violates the POLA.
Hmm, it does, indeed, but read the already mentioned
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
more closely and pay attention to the definition of "numeric string".
Expressions in awk
[...]
A string value shall be considered a NUMERIC STRING if it comes from
one of the following:
1. Field variables
2. Input from the getline() function
3. FILENAME
4. ARGV array elements
5. ENVIRON array elements
6. Array elements created by the split() function
7. A command line variable assignment
8. Variable assignment from another numeric string variable
... Whether or not a string is a numeric string shall be relevant
only in contexts where that term is used in this section.
[...]
Comparisons (with the '<', "<=", "!=", "==", '>', and ">="
operators) shall be made numerically if both operands are numeric,
if one is numeric and the other has A STRING VALUE THAT IS A NUMERIC
STRING, or if one is numeric and the other has the uninitialized
value. Otherwise, operands shall be converted to strings as
required and a string comparison shall be made using the
locale-specific collation sequence.
So for (k in a) gives you a k that is a string, but not a numeric
string(!), and so the comparison is done on strings.
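A minimal sketch of the effect and of the usual workaround, assuming any
awk that follows the rules quoted above. The function name mismatches is
made up for illustration, and the array is filled in BEGIN instead of
from jot so that the snippet has no BSD-specific dependency. Adding 0 to
k yields a numeric value, so the second comparison is done on numbers:

```shell
# Hypothetical helper: print the keys for which the string comparison
# (k < 10) and the numeric comparison (k + 0 < 10) disagree.
mismatches() {
    awk 'BEGIN {
        for (i = 1; i <= 15; i++)
            a[i] = 1
        for (k in a)
            # k is a plain string here; k + 0 forces a numeric operand
            if ((k < 10) != (k + 0 < 10))
                print k
    }' | sort -n    # for (k in a) order is unspecified, so sort it
}
mismatches
```

I would expect this to list 2 through 9: as strings they all sort after
"10", while "1" sorts before it and so happens to agree with the numeric
result.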
RATIONALE
[...]
The description for comparisons in the ISO POSIX-2:1993 standard
did not properly describe historical practice because of the way
numeric strings are compared as numbers. The current rules cause
the following code:
if (0 == "000")
print "strange, but true"
else
print "not true"
to do a numeric comparison, causing the if to succeed. It should
be intuitively obvious that this is incorrect behavior, and
indeed, no historical implementation of awk actually behaves this
way.
To fix this problem, the definition of numeric string was enhanced
to include only those values obtained from specific circumstances
(mostly external sources) where it is not possible to determine
unambiguously whether the value is intended to be a string or a
numeric.
Variables that are assigned to a numeric string shall also be
treated as a numeric string. (For example, the notion of a
numeric string can be propagated across assignments.)
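The difference between a numeric string and an ordinary string constant
can be seen in a small sketch like the following (the function name
numeric_string_demo is invented for the example). The field $1 and a
variable assigned from it are numeric strings, so they compare
numerically with 0, while the constant "000" is just a string:

```shell
# Hypothetical demo: feed the text 000 to awk as a field and compare.
numeric_string_demo() {
    echo 000 | awk '{
        x = $1                  # numeric-string status propagates to x
        # field vs 0, assigned variable vs 0, string constant vs 0
        print ($1 == 0), (x == 0), ("000" == 0)
    }'
}
numeric_string_demo
```

Under the rules quoted above this should print "1 1 0": the first two
comparisons are numeric, the last one compares the strings "000" and "0".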
-uwe