netbsd-bugs: Re: PR/33392 CVS commit: src/dist/nawk

Subject: Re: PR/33392 CVS commit: src/dist/nawk
To: None <gnats-bugs@NetBSD.org>
From: Aleksey Cheusov <cheusov@tut.by>
List: netbsd-bugs
Date: 06/26/2006 20:38:57

>  | >  | Thanks for applying the patch.
>  | >  | Just a minor note:
>  | >  | you forgot about xfree(f->gototab[i]) in function 'freefa'.
>  | >  
>  | >  Thanks,
>  | 
>  | You also forgot about changing the type of the gototab array from uschar to int;
>  | typing it as uschar is another way of limiting the number of states.
>  | <       uschar  **gototab;
>  | >       int     **gototab;
>  
>  Thanks,
>  
>  I think int is too wide. I made it unsigned short.

"640k is enough for everyone" ;)
Seriously, I often use awk with very large regexps for my work.
AFAIR, according to theory, the NFA for a regexp
that looks like (a|b)*a(a|b)^(n-1) has an equivalent DFA with 2^n states, so 65536
DFA states may correspond to an NFA with n of only 16 (!!!).
IMHO this kind of internal limit is bad. I have read the
NetBSD philosophy, but the reality is that hardware changes fast.
My 5-year-old Athlon-800 with 384 MB of RAM is capable of processing
DFAs with more than 2^16 states.
So I personally would prefer the 'int' type for the states.
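
To make the state-count argument concrete, here is a minimal sketch of the
subset construction for the textbook language "the n-th symbol from the end
is a", i.e. (a|b)*a(a|b)^(n-1). It is not taken from nawk; the names nfa_step
and count_dfa_states are hypothetical, and the NFA layout described in the
comments is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Assumed NFA for "the n-th symbol from the end is 'a'":
 *   state 0 loops on 'a' and 'b' and also moves to state 1 on 'a';
 *   state i (1 <= i < n) moves to state i+1 on either symbol;
 *   state n is accepting and has no outgoing transitions.
 * A set of NFA states is a bitmask (bit i set = state i is in the set).
 */
static unsigned long
nfa_step(unsigned long set, int is_a, int n)
{
	unsigned long inner = ((1UL << n) - 1) & ~1UL;	/* states 1..n-1 */
	unsigned long next = (set & inner) << 1;	/* i -> i+1 */

	if (set & 1UL) {
		next |= 1UL;			/* 0 -> 0 on any symbol */
		if (is_a)
			next |= 2UL;		/* 0 -> 1 on 'a' */
	}
	return next;
}

/* Count the DFA states reachable from {0} by the subset construction. */
static unsigned long
count_dfa_states(int n)
{
	unsigned long nsets, top = 0, count = 0;
	unsigned char *seen;
	unsigned long *stack;

	assert(n >= 1 && n <= 20);		/* keep the tables small */
	nsets = 1UL << (n + 1);			/* all subsets of states 0..n */
	seen = calloc(nsets, 1);
	stack = malloc(nsets * sizeof(*stack));
	assert(seen != NULL && stack != NULL);

	seen[1] = 1;				/* start set {0} */
	stack[top++] = 1;
	while (top > 0) {
		unsigned long set = stack[--top];
		int sym;

		count++;
		for (sym = 0; sym < 2; sym++) {	/* sym 0 = 'b', sym 1 = 'a' */
			unsigned long next = nfa_step(set, sym, n);

			if (!seen[next]) {
				seen[next] = 1;
				stack[top++] = next;
			}
		}
	}
	free(seen);
	free(stack);
	return count;
}
```

Under these assumptions count_dfa_states(16) returns 65536, which is one more
than the largest value a 16-bit unsigned short can hold.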

P.S.
I saw the HEAD changes in the awk code and was surprised that
many snprintf calls were changed to sprintf,
and strlcpy to strcpy. Is this really OK?
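
For reference, the difference in question can be sketched as below. The helper
bounded_copy is hypothetical and not part of nawk; it only illustrates that
snprintf never writes more than the given buffer size and reports truncation
through its return value, whereas sprintf and strcpy have no size argument
at all.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from nawk): copy src into dst without ever
 * writing more than dstsize bytes, always NUL-terminating the result.
 * Returns 1 if src did not fit and was truncated, 0 otherwise.
 */
static int
bounded_copy(char *dst, size_t dstsize, const char *src)
{
	int n;

	assert(dst != NULL && src != NULL && dstsize > 0);
	/* snprintf returns the length the full output would have had. */
	n = snprintf(dst, dstsize, "%s", src);
	return n < 0 || (size_t)n >= dstsize;
}
```

With a 4-byte buffer, copying "abcdef" this way leaves "abc" in the buffer and
returns 1; an unbounded strcpy of the same string would write past the end.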

P.P.S.
Where is nawk's upstream? Who maintains those YYYYMMDD versions?

-- 
Best regards, Aleksey Cheusov.