tech-userlevel archive


Re: bin/39002: harmful AWK extension: non-portable escaped character



> As the awk(1) manual says:

> String constants are quoted " ", with the usual C escapes recognized within.

> and:
This is not an argument. Just for comparison...

#include <stdio.h>

int main (int argc, char **argv)
{
        puts ("\z");

        return 0;
}

1) GNU CC
  main.c:7:8: warning: unknown escape sequence '\z'

2) Intel C
  main.c(7): warning #192: unrecognized character escape sequence
          puts ("\z");
                 ^
3) M$ C
  D:\Interix\home\Cheusov\prjs\main.c(7) : warning C4129: 'z' : unrecognized character escape sequence

> I believe the mistake that triggered all of this was in assuming that
> "gawk" can be used as an interpreter for a portable AWK language
> script.
I personally didn't say this ;-)

>  It cannot.  GAWK in its native mode is not AWK compatible.
Yes. It also doesn't support the {n,m} syntax by default,
but does so with the --posix option.

> In true AWK regular expressions are pure ("a `\' followed by any other
> character (matching that character taken as an ordinary character, as
> if the `\' had not been present)") and they are not cross-contaminated
> by C-like syntax in the way that GAWK's are.

> I don't know if GAWK's so-called "compatibility" mode corrects this
> difference or not.

0 ~>gawk --posix 'BEGIN {print "\z"}'
gawk: warning: escape sequence `\z' treated as plain `z'
z
0 ~>gawk 'BEGIN {print "\z"}'
gawk: warning: escape sequence `\z' treated as plain `z'
z
0 ~>
gawk - 3.1.5

And about {n,m} (another PR):
0 ~>echo aaa | gawk --posix '/a{3}/'
aaa
0 0 ~>echo 'a{3}' | gawk '/a{3}/'
a{3}
0 0 ~>
NetBSD should follow POSIX here.
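
Until every awk agrees on {n,m}, a script can sidestep the question.
The following is only a sketch, assuming a POSIX-style awk is installed
as "awk"; the two function names are hypothetical and exist only for
this example. It spells the repetition out by hand, and uses bracket
expressions when literal braces are meant:

```shell
# Sketch only: assumes a POSIX-style awk is installed as "awk".
# Both function names are hypothetical, chosen for this example.

# Match three consecutive "a" characters without relying on a{3}.
match_three_a() {
    awk '/aaa/'
}

# Match the literal text a{3}; inside a bracket expression a brace
# is an ordinary character, with or without interval support.
match_literal_braces() {
    awk '/a[{]3[}]/'
}

echo aaa    | match_three_a          # prints the input line
echo 'a{3}' | match_literal_braces   # prints the input line
echo aaa    | match_literal_braces   # prints nothing
```

Written this way, the pattern should mean the same thing whether or
not the interpreter treats braces as an interval operator.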

 >> After successfully alienating and antagonizing your audience, don't be
 >> surprised people are not interested in hearing whatever rational
 >> argument you might actually have there.


> Thanks!  :-)
The only thing I really said is that "nawk is the only true awk" is
not an argument. Nothing more and nothing personal.

Just for fun: "SCO awk is the only true awk". See wip/heirloom-awk package.

  0 ~>/usr/pkg/heirloom/bin/awk 'BEGIN {print "\z"}'
  \z
  0 ~>/usr/pkg/heirloom/bin/oawk 'BEGIN {print "\z"}'
  \z
  0 ~>

oawk is older than nawk. Right?
"oawk is the only true AWK". oawk is actually the default awk on
Solaris, and on HP-UX too. So mawk, used mainly on Linux, is not alone.

In practice, \z and similar escapes behave differently across awk
interpreters. Why not make the NetBSD code more portable?  Also note
that this particular extension is absolutely useless: it doesn't give
you any new functionality.
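
The portable spellings cost nothing. As a sketch (assuming a
POSIX-style awk installed as "awk"; the function name is hypothetical),
write a plain "z" when a z is meant, and "\\z" when a backslash
followed by z is meant:

```shell
# Sketch only: assumes a POSIX-style awk is installed as "awk".
# The function name is hypothetical, chosen for this example.
portable_escapes() {
    awk 'BEGIN {
        print "z"      # a plain z, no backslash involved
        print "\\z"    # an escaped backslash, then z
    }'
}

portable_escapes
```

Neither line relies on how an interpreter treats an unknown escape,
so the gawk warning shown above should not be triggered either.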

-- 
Best regards, Aleksey Cheusov.

