Subject: bin/30324: usr.bin/sed is buggy
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: None <pancake@phreaker.net>
List: netbsd-bugs
Date: 05/24/2005 14:49:00
>Number:         30324
>Category:       bin
>Synopsis:       BSD sed wraps the regexp when it's too long.
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue May 24 14:49:00 +0000 2005
>Originator:     pancake@phreaker.net
>Release:        NetBSD 3.99.3
>Organization:
	
>Environment:
	
	
System: NetBSD pl2 3.99.3 NetBSD 3.99.3 (pancake-laptop) #2: Mon Apr 25 15:41:52 CEST 2005 root@pl2:/usr/src/sys/arch/i386/compile/PANCAKE_LAPTOP i386
Architecture: i386
Machine: i386
>Description:
	The BSD sed ((Net|Open|Free)BSD) doesn't show a proper error message
	when a regexp longer than _POSIX2_LINE_MAX is passed as an argument.

	Currently BSD sed emits the "unterminated substitute pattern" message,
	because it cuts the passed argument to _POSIX2_LINE_MAX chars, so
	the truncated regexp is obviously not terminated :)
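
	The failure mode described above can be sketched in isolation. This
	is not the sed source: copy_truncating() and
	substitute_is_terminated() are hypothetical helpers, and the 8-byte
	buffer in the usage note stands in for _POSIX2_LINE_MAX. The check
	is deliberately simplified (it only counts unescaped delimiters).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, silently truncating to
 * cap - 1 bytes. Returns the number of bytes stored (without the NUL).
 */
static size_t
copy_truncating(char *dst, size_t cap, const char *src)
{
	size_t len;

	assert(cap > 0);
	len = strlen(src);
	if (len >= cap)
		len = cap - 1;		/* the tail of src is lost here */
	memcpy(dst, src, len);
	dst[len] = '\0';
	return (len);
}

/*
 * Hypothetical, simplified check: an "s" command is complete only if
 * two more unescaped delimiters follow the opening one.
 */
static int
substitute_is_terminated(const char *cmd)
{
	const char *p;
	char delim;
	int seen = 0;

	if (cmd[0] != 's' || cmd[1] == '\0')
		return (0);
	delim = cmd[1];
	for (p = cmd + 2; *p != '\0'; p++) {
		if (*p == '\\' && p[1] != '\0')
			p++;		/* skip the escaped character */
		else if (*p == delim)
			seen++;
	}
	return (seen >= 2);
}
```

	With an 8-byte buffer, "s,@A@,xyz,g" is stored as "s,@A@,x": the
	command was valid, but the copy has lost its closing delimiter, so
	a parser that only sees the copy reports an unterminated command.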

	GNU sed solves this by using the alloc functions, but after talking
	with jmmv I understand that the intended way is to keep using static
	buffers, following the POSIX standard.

	The fix must check the length of the passed string and report that
	the regexp is too long instead of printing "unterminated substitute
	pattern".
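
	A minimal sketch of that idea, kept separate from the real sed code:
	copy_checked() is a hypothetical helper that keeps the static buffer
	but reports an overflow to the caller instead of silently truncating.
	How the caller reports the error is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into the fixed-size buffer dst.
 * Returns 0 on success, or -1 if src (plus its NUL) does not fit,
 * in which case dst is left as an empty string.
 */
static int
copy_checked(char *dst, size_t cap, const char *src)
{
	size_t len;

	assert(cap > 0);
	len = strlen(src);
	if (len >= cap) {
		dst[0] = '\0';
		return (-1);		/* caller should report "too long" */
	}
	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (0);
}
```

	A caller would test the return value and print its "too long"
	diagnostic when it is -1, so the user never sees a misleading
	message about an unterminated pattern.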

>How-To-Repeat:
	I found the bug using wip/acr with too many exported variables: the
	final ./configure script works fine on GNU systems, but not on BSD
	ones. You can check whether it works with this shell command:

sed -e 's,@MANDIR@,aaaaa,g;s,@INFODIR@,,g;s,@LIBDIR@,,g;s,@LOCALSTATEDIR@,,g;s,@SYSCONFDIR@,/usr/local/etc,g;s,@DATADIR@,/usr/local/share,g;s,@LIBEXECDIR@,/usr/local/libexec,g;s,@SBINDIR@,/usr/local/sbin,g;s,@BINDIR@,/usr/local/bin,g;s,@EPREFIX@,/usr/local,g;s,@PREFIX@,/usr/local,g;s,@SPREFIX@,/usr/local,g;s,@TARGET@,i386-unknown-netbsd,g;s,@HOST@,i386-unknown-netbsd,g;s,@BUILD@,i386-unknown-netbsd,g;s,@INSTALL@,/usr/bin/install,g;s,@INSTALL_PROGRAM@,/usr/bin/install -s,g;s,@INSTALL_DIR@,/usr/bin/install -d,g;s,@INSTALL_SCRIPT@,/usr/bin/install,g;s,@INSTALL_DATA@,/usr/bin/install -m 644,g;s,@PKGNAME@,wistumbler2,g;s,@VERSION@,2.0pre10,g;s,@CONTACT@,pancake <pancake@phreaker.net>,g;s,@CONTACT_NAME@,pancake,g;s,@CONTACT_MAIL@,pancake@phreaker.net,g;s,@CC@,gcc,g;s,@CFLAGS@,,g;s,@LDFLAGS@,,g;s,@HAVE_LANG_C@,1,g;s,@PTHREAD_LIBS@,-lpthread,g;s,@HAVE_PTHREAD@,1,g;s,@HAVE_LIB_PCAP@,1,g;s,@USE_GTK@,1,g;s,@GTK_CFLAGS@,-DXTHREADS -I/usr/pkg/include/gtk-2.0 -I/usr/pkg/lib/gtk-2.0/include -I/usr/pkg/include -I/usr/X11R6/include -I/usr/pkg/include/atk-1.0 -I/usr/pkg/include/pango-1.0 -I/usr/pkg/include/freetype2 -I/usr/pkg/include/glib/glib-2.0 -I/usr/pkg/lib/glib-2.0/include,g;s,@GTK_LDFLAGS@,-Wl\,-R/usr/pkg/lib -L/usr/pkg/lib -lgtk-x11-2.0 -lgdk-x11-2.0 -latk-1.0 -lgdk_pixbuf-2.0 -lm -lpangoxft-1.0 -lpangox-1.0 -lpango-1.0 -lgobject-2.0 -lgmodule-2.0 -lglib-2.0,g;s,@USE_BEEP@,1,g;s,@HAVE_MACHINE_SPEAKER_H@,1,g;s,@HAVE_MACHINE_SPEAKER_H@,0,g;s,@USE_GTK@,1,g;s,@HAVE_LIB_PCAP@,1,g;s,@GTK_CFLAGS@,-DXTHREADS -I/usr/pkg/include/gtk-2.0 -I/usr/pkg/lib/gtk-2.0/include -I/usr/pkg/include -I/usr/X11R6/include -I/usr/pkg/include/atk-1.0 -I/usr/pkg/include/pango-1.0 -I/usr/pkg/include/freetype2 -I/usr/pkg/include/glib/glib-2.0 -I/usr/pkg/lib/glib-2.0/include,g;s,@GTK_LDFLAGS@,-Wl\,-R/usr/pkg/lib -L/usr/pkg/lib -lgtk-x11-2.0 -lgdk-x11-2.0 -latk-1.0 -lgdk_pixbuf-2.0 -lm -lpangoxft-1.0 -lpangox-1.0 -lpango-1.0 -lgobject-2.0 -lgmodule-2.0 -lglib-2.0,g;s,@HOST_CPU@,i386,g;s,@HOST_OS@,netbsd,g;s,@BUILD_CPU@,i386,g;s,@BUILD_OS@,netbsd,g;s,@TARGET_CPU@,i386,g;s,@TARGET_OS@,netbsd,g;s,@CPP@,cpp,g;s,@INSTALL@,/usr/bin/install,g;'

	The sed expression works fine if you drop some chars (try dropping
	the 'a's after @MANDIR@).

	NOTE: This is the sed expression generated by ACR in the latest
	wistumbler2 (not yet released).

>Fix:
	I wrote a simple patch that reports an error instead of returning
	the truncated buffer.

Index: main.c
===================================================================
RCS file: /cvsroot/src/usr.bin/sed/main.c,v
retrieving revision 1.16
diff -u -r1.16 main.c
--- main.c      13 Jul 2004 12:11:06 -0000      1.16
+++ main.c      24 May 2005 14:38:30 -0000
@@ -253,7 +253,7 @@
                        if (n-- <= 1) {
                                *p = '\0';
                                linenum++;
-                               return (buf);
+                               err(FATAL,"regexp too long");
                        }
                        switch (*s) {
                        case '\0':

>Unformatted: