Subject: bin/4392: bug in yacc
To: None <gnats-bugs@gnats.netbsd.org>
From: John F. Woods <jfw@jfwhome.funhouse.com>
List: netbsd-bugs
Date: 10/29/1997 23:20:25
>Number:         4392
>Category:       bin
>Synopsis:       yacc -o file.cpp curdles result
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people (Utility Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct 29 20:35:04 1997
>Last-Modified:
>Originator:     John F. Woods
>Organization:
Misanthropes-R-Us
>Release:        netbsd-current sup October 25, 1997
>Environment:
	
System: NetBSD jfwhome.funhouse.com 1.3_ALPHA NetBSD 1.3_ALPHA (JFW) #25: Sat Oct 25 12:18:52 EDT 1997 jfw@jfwhome.funhouse.com:/usr/src/sys/arch/i386/compile/JFW i386


>Description:
(I have also mailed this to the author of Berkeley yacc.)
I have found two minor bugs in yacc.

The first might just be regarded as a misuse, but:  the output from
YACC is close enough to C++ that it can be fed to a C++ compiler
easily enough (especially desirable if the parser includes some C++
code in rules, rather than vanilla C).  However, if you ask yacc to
send the output to a C++ file at the same time as generating a defines
header file, you end up curdling the "C++" output (and not generating
a separate defines file)!

To test this, pick any handy yacc file (test.y) and run

	yacc -d -o test.cxx test.y

You will find (unless this has been fixed since NetBSD and Cygnus
picked it up last) that only test.cxx has been generated, and the
initial part of the file contains the defines!

This behavior is obvious when looking at the code for setting the
defines file name:  it copies output_file_name to defines_file_name,
then IF output_file_name ends in .c, it changes the end of
defines_file_name to .h; otherwise, it leaves it identical to
output_file_name, so the defines and the parser both get written to
the same file!
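
A minimal standalone sketch of the naming logic described above may
make the failure easier to see.  The helper name old_defines_name is
hypothetical (it does not exist in yacc's main.c), and the sketch
allocates room for the terminating NUL so that it is safe to run:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper mirroring the naming logic described in the
 * report.  Returns a malloc'd defines-file name, or NULL if the
 * allocation fails.  The caller frees the result. */
char *old_defines_name(const char *output_file_name)
{
    size_t len;
    char *defines_file_name;

    assert(output_file_name != NULL);
    len = strlen(output_file_name);
    defines_file_name = malloc(len + 1);   /* room for the NUL */
    if (defines_file_name == NULL)
        return NULL;
    strcpy(defines_file_name, output_file_name);

    /* Only a trailing ".c" is rewritten to ".h"; any other name is
     * left as an exact copy of the output file name. */
    if (len >= 2 && strcmp(output_file_name + len - 2, ".c") == 0)
        defines_file_name[len - 1] = 'h';
    return defines_file_name;
}
```

Given "test.c" this yields "test.h", but given "test.cxx" it yields
"test.cxx" again, which is the collision with the output file.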

You might argue that the file name really should end in .c, in which
case if the strcmp fails, there ought to be a warning message and it
should either abort entirely or shut off the -d flag.  I'd argue that
it's convenient to be able to output into a C++ file, and it appears
to work as C++ (they haven't mutated the language far enough yet ;-).
Perhaps if it doesn't recognize the suffix of the output file, the
defines should go to the usual y.tab.h; in the patch below, however, I
chose a middle ground of checking for common C++ extensions, and
shutting off the -d flag with a warning if the extension isn't
recognized.  If you think that's reasonable behavior, then here's the
code to do it.  (Note that my patch always uses .h for a suffix,
rather than the occasionally-used C++ convention of .hxx; my group
just uses .h for C++ headers, so I stuck with it.  Provincial, I know.)
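
The suffix check can also be sketched as a self-contained function,
outside the context of main.c.  This is only an illustration of the
approach, not the patch itself: the name header_name_for and the
table of suffixes are assumptions made for the example, and returning
NULL stands in for "warn and shut off -d":

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd header name ending in ".h"
 * when the output name has a recognized C or C++ suffix, or NULL when
 * the suffix is not recognized (or the allocation fails). */
char *header_name_for(const char *output_file_name)
{
    static const char *const known[] = { ".c", ".C", ".cc", ".cxx", ".cpp" };
    const size_t nknown = sizeof known / sizeof known[0];
    const char *suffix;
    size_t i, stem;
    char *header;

    assert(output_file_name != NULL);
    suffix = strrchr(output_file_name, '.');
    if (suffix == NULL)
        return NULL;                    /* no suffix at all */

    for (i = 0; i < nknown; i++)
        if (strcmp(suffix, known[i]) == 0)
            break;
    if (i == nknown)
        return NULL;                    /* suffix not recognized */

    /* Keep everything up to and including the '.', then append "h". */
    stem = (size_t)(suffix - output_file_name) + 1;
    header = malloc(stem + 2);          /* 'h' plus the NUL */
    if (header == NULL)
        return NULL;
    memcpy(header, output_file_name, stem);
    header[stem] = 'h';
    header[stem + 1] = '\0';
    return header;
}
```

With this sketch, "test.cxx" and "test.cpp" both map to "test.h",
while "test.y" or a name with no dot maps to NULL.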

Bug 2:  while staring at the aforementioned code, I noticed that it
does a MALLOC(strlen(output_file_name)).  It needs to malloc one more
byte, of course, to hold the NUL at the end.  Whoops!  That is also
part of the patch below.
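
For reference, a small sketch of the sizing rule: strlen() counts the
characters before the terminating NUL, so a buffer that will receive
a strcpy() of the string needs strlen() + 1 bytes.  The helper name
copy_file_name is hypothetical and is not part of yacc:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap copy of a file name, sized to include the
 * terminating NUL.  Returns NULL if the allocation fails. */
char *copy_file_name(const char *name)
{
    size_t size;
    char *copy;

    assert(name != NULL);
    size = strlen(name) + 1;    /* strlen() does not count the NUL */
    copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, name, size);   /* copies the characters and the NUL */
    return copy;
}
```

Allocating only strlen(name) bytes, as the original code does, lets
strcpy() write the NUL one byte past the end of the buffer.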

>How-To-Repeat:
for any yacc file test.y:
	yacc -d -o test.cxx test.y

>Fix:

The patch:  (the comments about Windows come from the fact that I
discovered this doing programming on NT (ewwww) but I've generated the
patch using the source that ships with NetBSD.)

*** main.c.orig	Wed Oct 29 22:28:02 1997
--- main.c	Wed Oct 29 22:53:09 1997
***************
*** 369,380 ****
      {
  	if (explicit_file_name)
  	{
! 	    defines_file_name = MALLOC(strlen(output_file_name));
  	    if (defines_file_name == 0)
  		no_space();
  	    strcpy(defines_file_name, output_file_name);
! 	    if (!strcmp(output_file_name + (strlen(output_file_name)-2), ".c"))
! 		defines_file_name [strlen(output_file_name)-1] = 'h';
  	}
  	else
  	{
--- 369,399 ----
      {
  	if (explicit_file_name)
  	{
! 	    char *suffix;
! 	    defines_file_name = MALLOC(strlen(output_file_name)+1);
  	    if (defines_file_name == 0)
  		no_space();
  	    strcpy(defines_file_name, output_file_name);
! 	    /* does the output_file_name have a known suffix */
!             if ((suffix = strrchr(output_file_name,'.')) != 0
!             &&  (!strcmp(suffix,".c") ||   /* good, old-fashioned C */
!                  !strcmp(suffix,".C") ||   /* C++, or C on Windows */
!                  !strcmp(suffix,".cc") ||  /* C++ */
!                  !strcmp(suffix,".cxx") || /* C++ */
!                  !strcmp(suffix,".cpp")))  /* C++ (Windows) */
!             {
!                 strncpy(defines_file_name, output_file_name,
!                         suffix - output_file_name + 1);
!                 defines_file_name[suffix - output_file_name + 1] = 'h';
!                 defines_file_name[suffix - output_file_name + 2] = 0;
!             } else {
!                 fprintf(stderr,"%s: suffix of output file name %s"
!                                " not recognized, no -d file generated.\n",
!                         myname, output_file_name);
!                 dflag = 0;
!                 free(defines_file_name);
!                 defines_file_name = 0;
!             }
  	}
  	else
  	{

>Audit-Trail:
>Unformatted: