NetBSD-Bugs archive


Re: bin/47840: awk string comparison of integer constant



The following reply was made to PR bin/47840; it has been noted by GNATS.

From: Aleksey Cheusov <cheusov%tut.by@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/47840: awk string comparison of integer constant
Date: Mon, 20 May 2013 15:10:54 +0300

 
 On Mon, May 20, 2013 at 8:30 AM, <dholland%eecs.harvard.edu@localhost> wrote:
 
 > >Description:
 >
 > Observe the following curious behavior:
 >
 > macaran% jot 15 1 | awk '{ a[$1] = ($1 < 10); } END { for (k in a) { print k, a[k], (k < 10); }}'
 > 2 1 0
 > 3 1 0
 > 4 1 0
 > 5 1 0
 > 6 1 0
 > 7 1 0
 > 8 1 0
 > 9 1 0
 > 10 0 0
 > 11 0 0
 > 12 0 0
 > 13 0 0
 > 14 0 0
 > 15 0 0
 > 1 1 1
 >
 > Note that k < 10 is evaluated as a string comparison.
 >
 > Is this required by some standard? gawk does the same thing, but it
 > definitely violates the POLA.
 >
 
 POSIX says the following
 http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
 
 "Comparisons (with the '<', "<=", "!=", "==", '>', and ">=" operators)
 shall be made numerically if both operands are numeric, if one is numeric
 and the other has a string value that is a numeric string, or if one is
 numeric and the other has the uninitialized value. Otherwise, operands
 shall be converted to strings as required and a string comparison shall be
 made using the locale-specific collation sequence."
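 
 A minimal sketch of the rule quoted above, not taken from the PR: the
 same value is compared once as a field (a numeric string) and once as a
 plain string. The function name and the variable s are hypothetical,
 chosen only for illustration.

```shell
# Illustrative only: compare the same value as a field and as a plain string.
compare_field_vs_string() {
    # $1 comes from input, so "9" is a numeric string and ($1 < 10) is numeric.
    # s is the result of a concatenation, so it is a plain string and
    # (s < 10) is a string comparison: "9" sorts after "10".
    printf '9\n' | awk '{ s = $1 ""; print ($1 < 10), (s < 10) }'
}

compare_field_vs_string   # prints: 1 0
```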
 
 Unless I am misreading this sentence, the second and third columns in
 your output should contain the same values: in both contexts 10
 definitely has type "numeric", and therefore both k and $1 should be
 converted to numbers before the comparison.
 
 So I think nawk violates POSIX. On the other hand, mawk, gawk and
 Solaris' xpg4/awk all behave the same way as nawk.
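 
 Whatever the standard requires, a script can sidestep the question by
 forcing the subscript to a number with k + 0. The sketch below follows
 the shape of the one-liner in the report and adds a fourth column with
 the forced comparison. It is a workaround under stated assumptions, not
 a fix proposed in this PR: the function name and the sample input are
 hypothetical, and printf replaces jot so it runs where jot is missing.

```shell
# Illustrative only: same shape as the one-liner in the report, with an
# extra column that forces a numeric comparison via k + 0.
numeric_subscript_compare() {
    printf '%s\n' 1 2 9 10 15 |
        awk '{ a[$1] = ($1 < 10) }
             END { for (k in a) print k, a[k], (k < 10), (k + 0 < 10) }' |
        sort -n   # for-in order is unspecified, so sort for stable output
}

numeric_subscript_compare
```

 With the awks named above, the third column should be 1 only for the
 subscript 1, while the fourth column agrees with the second on every
 line.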
 
 


