NetBSD-Bugs archive


bin/42463: Bizarre behavior in awk with invalid numeric constants



>Number:         42463
>Category:       bin
>Synopsis:       Bizarre behavior in awk with invalid numeric constants
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 16 21:10:00 +0000 2009
>Originator:     David A. Holland
>Release:        NetBSD 5.99.22 (20091208)
>Organization:
>Environment:
System: NetBSD tanaqui 5.99.22 NetBSD 5.99.22 (TANAQUI) #31: Tue Dec 8 22:53:35 EST 2009 dholland@tanaqui:/usr/src/sys/arch/i386/compile/TANAQUI i386
Architecture: i386
Machine: i386
>Description:

awk does bizarre things when the program text contains an invalid
numeric constant.

The first result is not so surprising, although one would expect a
syntax error instead (recall that awk doesn't handle hex integer
constants):

   % awk </dev/null 'END { printf "%d\n", 0xblegh }'
   0

This, however, is very strange:

   % awk </dev/null 'END { printf "%c\n", 0xblegh }'
   0

If 0xblegh is a number, that should print a NUL, not a literal zero.
So ok, maybe it's being treated as a string constant, so let's try %s:

   % awk </dev/null 'END { printf "%s\n", 0xblegh }'
   0

...nope. But wait, it gets weirder. Let's try forcing a conversion to
a number:

   % awk </dev/null 'END { printf "%s\n", (0xblegh + 0) }'
   00
   % awk </dev/null 'END { printf "%s\n", (0xblegh + 3) }'
   03
   % awk </dev/null 'END { printf "%s\n", (0xblegh - 5) }'
   0-5

Huh?

gawk also behaves in a similar way:

   % gawk < /dev/null 'END { printf "%d\n", 0xblegh }'
   11
   % gawk < /dev/null 'END { printf "%c\n", 0xblegh }'
   1
   % gawk < /dev/null 'END { printf "%s\n", 0xblegh }'
   11
   % gawk < /dev/null 'END { printf "%s\n", (0xblegh + 0) }'
   110

In fact, modulo gawk treating the number as 11 (0xb) because it
accepts hex constants, the behavior is identical. Furthermore, this
whole thing came to light because of this bug filed on mawk:

   
   http://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg1266528.html

I find this disturbing, especially the way in which + mystically turns
into string concatenation. Is there some strange way in which this
behavior is mandated by the awk specification?
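
One reading that would account for every result above, stated as an
assumption and not checked against the awk sources or the
specification: the lexer takes 0xblegh as the numeric constant 0
followed immediately by an identifier xblegh, an uninitialized variable
whose value is the empty string. Adjacent expressions in awk are
concatenated, and concatenation binds more loosely than + and -, so
(0xblegh + 3) would be 0 concatenated with (xblegh + 3). The sketch
below writes that suspected parse out with an explicit space; the shell
variables r1 to r4 are only there for illustration.

```shell
#!/bin/sh
# Assumption: "0xblegh" is tokenized as the number 0 followed by the
# variable xblegh. Writing "0 xblegh" makes that parse explicit.

# 0 concatenated with the empty string gives the string "0".
r1=$(awk 'END { printf "%s\n", 0 xblegh }' </dev/null)

# Parsed as 0 (xblegh + 3): "0" concatenated with "3".
r2=$(awk 'END { printf "%s\n", (0 xblegh + 3) }' </dev/null)

# Parsed as 0 (xblegh - 5): "0" concatenated with "-5".
r3=$(awk 'END { printf "%s\n", (0 xblegh - 5) }' </dev/null)

# %c given a string prints its first character, here "0".
r4=$(awk 'END { printf "%c\n", 0 xblegh }' </dev/null)

printf '%s\n' "$r1" "$r2" "$r3" "$r4"
# prints 0, 03, 0-5 and 0, one per line
```

If the guess is right, gawk differs only in that it consumes 0xb as the
constant 11 and leaves legh as the uninitialized variable.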

>How-To-Repeat:

as above.

>Fix:

reject invalid numbers up front?


