Subject: Re: PR/33392 CVS commit: src/dist/nawk
To: None <gnats-bugs@NetBSD.org>
From: Aleksey Cheusov <firstname.lastname@example.org>
Date: 07/03/2006 13:41:08
> | > I think int is too wide. I made it unsigned short.
> | "640k is enough for everyone" ;)
> | Seriously, I often use awk with very large regexps for my work.
> | AFAIR, according to theory, the NFA for a regexp
> | that looks like (a|b)*a(a|b)^n has an equivalent DFA with about 2^n states, so 65536
> | DFA states may correspond to an NFA with only 16 (!!!) terminal
> | symbols. IMHO this kind of internal limit is bad. I have read the
> | NetBSD philosophy, but the reality is that hardware changes fast.
> | My 5-year-old Athlon-800 with 384 MB of RAM is capable of processing
> | DFAs with more than 2^16 states.
> | So I personally would prefer the 'int' type for the states.
> I thought that this is limited by NCHARS+3. I will change it.
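The state blow-up discussed above can be checked with a small sketch. It assumes the textbook regexp (a|b)*a(a|b)^n (the extra 'a' is an assumption about the regexp that was meant) and counts the DFA states reachable by subset construction. Nothing here is taken from nawk; count_dfa_states, fits_in_unsigned_short and MAX_N are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Illustration only, not nawk code.
 * NFA for (a|b)*a(a|b)^n:
 *   state 0      start; loops on 'a' and 'b', and also moves to 1 on 'a'
 *   states 1..n  each moves to the next state on 'a' or 'b'
 *   state n+1    accepting, no outgoing moves
 * A set of NFA states is stored as a bit mask (bit i = state i). */
enum { MAX_N = 18 };   /* arbitrary cap that keeps the tables small */

/* One subset-construction step: the set reached from `set` on one symbol. */
static unsigned long step(unsigned long set, int n, int is_a)
{
    unsigned long chain_mask = ((1UL << n) - 1UL) << 1;  /* bits 1..n */
    unsigned long next = 1UL;            /* state 0 loops on every symbol */

    if (is_a)
        next |= 2UL;                     /* 0 --a--> 1 */
    next |= (set & chain_mask) << 1;     /* i --a|b--> i+1 for 1 <= i <= n */
    return next;
}

/* Number of DFA states reachable from the start set {0}.
 * Returns 0 if memory allocation fails. */
unsigned long count_dfa_states(int n)
{
    assert(n >= 0 && n <= MAX_N);

    unsigned long nsets = 1UL << (n + 2);   /* every possible bit mask */
    unsigned char *seen = calloc(nsets, 1);
    unsigned long *queue = malloc(nsets * sizeof *queue);
    unsigned long head = 0, tail = 0, count = 0;

    if (seen == NULL || queue == NULL) {
        free(seen);
        free(queue);
        return 0;
    }

    seen[1] = 1;                            /* start set {0} */
    queue[tail++] = 1UL;
    while (head < tail) {                   /* breadth-first search */
        unsigned long cur = queue[head++];
        count++;
        for (int is_a = 0; is_a <= 1; is_a++) {
            unsigned long nx = step(cur, n, is_a);
            if (!seen[nx]) {
                seen[nx] = 1;
                queue[tail++] = nx;
            }
        }
    }
    free(seen);
    free(queue);
    return count;
}

/* Would this many states still be indexable by an unsigned short? */
int fits_in_unsigned_short(unsigned long nstates)
{
    return nstates <= USHRT_MAX;
}
```

For this NFA the count doubles with every extra position, so count_dfa_states(15) is 65536, which no longer fits where unsigned short is 16 bits wide.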
The changes you committed to HEAD for this PR seem good to me:
everything works correctly and much faster (for huge regexps)
than gawk, which I have used for years.
1) Do you plan to notify Brian about the bug that was found?
2) Do you plan to add an additional regression test for awk?
> | P.S.
> | I saw the HEAD changes in the awk code and was surprised that
> | lots of snprintf calls were changed to sprintf,
> | and strlcpy to strcpy. Is this really ok?
> They were not done carefully so bugs were introduced and we decided
> to back them out until someone does them carefully.
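To make the concern about sprintf versus snprintf concrete, here is a minimal sketch of the difference. It is not code from nawk; format_assignment is a hypothetical helper, and it relies only on the standard C99 behaviour of snprintf (never write more than the given size, return the length that would have been needed).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustration only, not nawk code.
 * Formats "<name>=<value>" into dst, never writing more than dstsize bytes.
 * Returns 1 if the whole string fit, 0 if it was truncated or snprintf failed.
 * The same call with sprintf would have no size bound at all and would write
 * past the end of dst whenever the result is too long. */
int format_assignment(char *dst, size_t dstsize, const char *name, int value)
{
    assert(dst != NULL && dstsize > 0 && name != NULL);

    int needed = snprintf(dst, dstsize, "%s=%d", name, value);

    /* snprintf reports the untruncated length, so truncation is detectable. */
    return needed >= 0 && (size_t)needed < dstsize;
}
```

A careful conversion has to check that return value at every call site, since silently truncated output can be a bug of its own.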
> | P.P.S.
> | Where is nawk upstream? Who maintains those YYYYMMDD versions?
> From /usr/src/doc/3RDPARTY:
> Package: nawk
> Version: 2005-04-24
> Current Vers: 2005-04-24
> Maintainer: Brian Kernighan <email@example.com> (Lucent Technologies)
> Archive Site: http://cm.bell-labs.com/who/bwk/
> Home Page: http://cm.bell-labs.com/who/bwk/
Best regards, Aleksey Cheusov.