Subject: Re: bin/36444: flex generates bad C++ code
To: None <gnats-bugs@NetBSD.org, gnats-admin@netbsd.org,>
From: Christos Zoulas <christos@zoulas.com>
List: netbsd-bugs
Date: 06/05/2007 10:56:33
On Jun 5,  2:30pm, mlelstv@serpens.de (mlelstv@serpens.de) wrote:
-- Subject: bin/36444: flex generates bad C++ code

| >Number:         36444
| >Category:       bin
| >Synopsis:       flex generates bad C++ code
| >Confidential:   no
| >Severity:       serious
| >Priority:       medium
| >Responsible:    bin-bug-people
| >State:          open
| >Class:          sw-bug
| >Submitter-Id:   net
| >Arrival-Date:   Tue Jun 05 14:30:01 +0000 2007
| >Originator:     Michael van Elst
| >Release:        NetBSD 4.0_BETA2
| >Organization:
| -- 
|                                 Michael van Elst
| Internet: mlelstv@serpens.de
|                                 "A potential Snark may lurk in every tree."
| >Environment:
| 	
| 	
| System: NetBSD henery 4.0_BETA2 NetBSD 4.0_BETA2 (HENERY) #1: Sun Jun 3 12:09:36 CEST 2007 mlelstv@henery:/home/netbsd4/obj/home/netbsd4/src/sys/arch/i386/compile/HENERY i386
| Architecture: i386
| Machine: i386
| >Description:
| 
| When compiling a flex source from the net/irrtoolkit-nox11 package,
| I get an error message from the C++ compiler about the ambiguous
| call to an overloaded function.
| 
| >How-To-Repeat:
| 
| Here is a test case that shows the problem:
| 
| -------- snip --------
| %option case-insensitive
| 
| %{
| #include <cstdio>
| #include <cstring>
| %}
| 
| %%
| 
| [A-Z][A-Z0-9]* {
| 	printf("word = %s\n",yytext);
| }
| 
| %%
| 
| class Object {
| public:
| 	char *contents;
| 	unsigned long size;
| 	Object(const char buf[]) {
| 		contents = strdup(buf);
| 		size = strlen(buf);
| 	}
| };
| 
| int length(const char *s)
| {
| 	return strlen(s);
| }
| 
| int main() {
| 	Object *o = new Object("1 Word");
| 	void *p;
| 	p = yy_scan_bytes(o->contents, o->size);
| 	BEGIN(INITIAL);
| }
| 
| extern "C" {
| int yywrap() {
| 	return 1;
| }
| }
| -------- snip --------
| 
| % flex c.l
| % c++ lex.yy.c
| c.l: In function 'int main()':
| c.l:34: error: call of overloaded 'yy_scan_bytes(char*&, long unsigned int&)' is ambiguous
| lex.yy.c:1321: note: candidates are: yy_buffer_state* yy_scan_bytes(const char*, yy_size_t)
| lex.yy.c:1355: note:                 yy_buffer_state* yy_scan_bytes(const char*, int)
| 
| The reason for this is a change in src/usr.bin/lex/flex.skl:1.21
| 
| | Traditional flex uses int instead of yy_size_t for some api functions.
| | Unfortunately this mangles differently in c++, so we get undefined symbols.
| | So we define the old function prototype to keep things happy.
| 
| This creates function duplicates for C (using yy_size_t) and C++ (using int)
| that cause the ambiguity.
| 
| >Fix:
| 
| Reverting the change in flex.skl:1.21 solves the problem.

The problem should be fixed by changing:

 	unsigned long size;
to:
	yy_size_t size;
or:
	int size;
or even:
	size_t size;
	
We could add a few more yy_scan_bytes() functions so that we have explicit
matches for unsigned long and long, but it is not worth the trouble. Passing
a long where an int is expected is not good practice anyway.
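
To illustrate the overload resolution involved, here is a minimal sketch.
The names scan_bytes(), Picked and check_overloads() are hypothetical
stand-ins, not the generated flex code, and it assumes yy_size_t behaves
like size_t. It only reports which of the two prototypes the compiler
selects for each argument type:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the two yy_scan_bytes() prototypes; they
// only report which overload the compiler selected.
enum Picked { PICKED_SIZE_T, PICKED_INT };

// Assumption: yy_size_t is an unsigned type equivalent to size_t.
static Picked scan_bytes(const char *, std::size_t) { return PICKED_SIZE_T; }
static Picked scan_bytes(const char *, int)         { return PICKED_INT; }

static void check_overloads() {
	const char *buf = "1 Word";
	int           n_int  = static_cast<int>(std::strlen(buf));
	std::size_t   n_size = std::strlen(buf);
	unsigned long n_ul   = std::strlen(buf);

	// Exact matches: no conversion is needed, so nothing is ambiguous.
	assert(scan_bytes(buf, n_int)  == PICKED_INT);
	assert(scan_bytes(buf, n_size) == PICKED_SIZE_T);

	// scan_bytes(buf, n_ul) is ambiguous wherever unsigned long and
	// size_t are distinct types (as on i386): both candidates then need
	// an integral conversion of the same rank. An explicit cast picks one.
	assert(scan_bytes(buf, static_cast<std::size_t>(n_ul)) == PICKED_SIZE_T);
}
```

Declaring the size member with one of the types listed above has the same
effect as the cast, without touching the call site.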

christos