netbsd-bugs: Re: PR/33392 CVS commit: src/dist/nawk

Subject: Re: PR/33392 CVS commit: src/dist/nawk
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,>
From: Aleksey Cheusov <cheusov@tut.by>
List: netbsd-bugs
Date: 06/26/2006 17:40:02

The following reply was made to PR bin/33392; it has been noted by GNATS.

From: Aleksey Cheusov <cheusov@tut.by>
To: gnats-bugs@NetBSD.org
Cc: gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: PR/33392 CVS commit: src/dist/nawk
Date: Mon, 26 Jun 2006 20:38:57 +0300

 >  | >  | Thanks for applying the patch.
 >  | >  | Just a minor note:
 >  | >  | you forgot about xfree(f->gototab[i]) in function 'freefa'.
 >  | >  
 >  | >  Thanks,
 >  | 
 >  | You also forgot about changing the type of the gototab array from uschar
 >  | to int; typing it as uschar is another way of limiting the number of states.
 >  | <       uschar  **gototab;
 >  | >       int     **gototab;
 >  
 >  Thanks,
 >  
 >  I think int is too wide. I made it unsigned short.

 "640k is enough for everyone" ;)
 Seriously, I often use awk with very large regexps for my work.
 AFAIR, according to theory, the NFA for a regexp
 that looks like (a|b)*a(a|b)^n has an equivalent DFA with 2^(n+1) states,
 so 65536 states of a DFA may correspond to an NFA with only 16 (!!!)
 terminal symbols after the leading (a|b)*.  IMHO this kind of internal
 limit is bad. I have read the NetBSD philosophy, but the reality is that
 hardware changes fast.
 My 5-year-old Athlon-800 with 384 MB of RAM is capable of processing
 DFAs with more than 2^16 states.
 So I personally would prefer the 'int' type for the states.
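
 To make the state blow-up concrete, here is a minimal sketch in C. It is
 not code from nawk: count_dfa_states() and states_fit() are hypothetical
 helpers, and the sketch assumes the regexp (a|b)*a(a|b)^n with DFA states
 numbered from 0. It runs the subset construction over bitmasks of NFA
 states and counts the subsets that are actually reached.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of nawk: count the DFA states that subset
 * construction reaches for the regexp (a|b)*a(a|b)^n.
 * NFA states are 0..n+1; bit i of a mask means "NFA state i is active".
 * Returns 0 if memory cannot be allocated.
 */
unsigned long count_dfa_states(unsigned n)
{
    assert(n <= 16);                          /* keep the bitmap small */

    uint32_t last = (uint32_t)1 << (n + 1);   /* accepting NFA state */
    size_t total = (size_t)1 << (n + 2);      /* all possible subsets */
    unsigned char *seen = calloc(total, 1);
    uint32_t *queue = malloc(total * sizeof *queue);
    unsigned long count = 0;
    size_t head = 0, tail = 0;

    if (seen == NULL || queue == NULL) {
        free(seen);
        free(queue);
        return 0;
    }

    seen[1] = 1;                              /* start subset: {0} */
    queue[tail++] = 1;

    while (head < tail) {
        uint32_t s = queue[head++];
        count++;
        for (int is_a = 0; is_a <= 1; is_a++) {
            /* States 1..n advance by one on either symbol. */
            uint32_t next = (s & ~(uint32_t)1 & ~last) << 1;
            if (s & 1) {
                next |= 1;                    /* state 0 loops on a and b */
                if (is_a)
                    next |= 2;                /* state 0 also moves to 1 on a */
            }
            if (!seen[next]) {
                seen[next] = 1;
                queue[tail++] = next;
            }
        }
    }

    free(seen);
    free(queue);
    return count;
}

/* Does a state count fit if states are numbered from 0 up to type_max? */
int states_fit(unsigned long nstates, unsigned long type_max)
{
    return nstates != 0 && nstates - 1 <= type_max;
}
```

 With these assumptions, count_dfa_states(15) gives 65536 states, which only
 just fits a 16-bit index, and count_dfa_states(16) gives 131072, which does
 not; count_dfa_states(8) already gives 512, more than an 8-bit uschar can
 number.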

 P.S.
 I saw the HEAD changes in the awk code and was surprised that
 lots of snprintf calls were changed to sprintf,
 and strlcpy to strcpy. Is this really OK?
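
 For illustration, a small sketch of why the bounded calls matter. The
 helper bounded_copy() is hypothetical and is not taken from the nawk
 sources; it only shows that snprintf stops at the buffer size and reports
 truncation, while sprintf and strcpy have no size argument at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the nawk sources: copy src into dst
 * without writing past dstsize bytes. Returns 1 if src was truncated,
 * 0 if it fit completely.
 */
int bounded_copy(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);

    /* snprintf writes at most dstsize bytes, including the final '\0'. */
    int needed = snprintf(dst, dstsize, "%s", src);

    return needed < 0 || (size_t)needed >= dstsize;
}
```

 With a 4-byte buffer, copying "abcdef" this way leaves "abc" plus the
 terminator and returns 1, where strcpy would write past the end of the
 buffer.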

 P.P.S.
 Where is the nawk upstream? Who maintains those YYYYMMDD versions?

 -- 
 Best regards, Aleksey Cheusov.