Subject: bin/25084: ksh dumps core
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <jmmv@menta.net>
List: netbsd-bugs
Date: 04/07/2004 12:59:45
>Number:         25084
>Category:       bin
>Synopsis:       ksh dumps core
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 07 11:00:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Julio M. Merino Vidal
>Release:        NetBSD 2.0B
>Organization:
Julio M. Merino Vidal <jmmv@menta.net>
The NetBSD Project - http://www.NetBSD.org/
>Environment:
	
	
System: NetBSD dawn.local 2.0B NetBSD 2.0B (DAWN) #0: Thu Apr 1 15:41:07 CEST 2004 jmmv@dawn.local:/home/build/obj/usr/src/sys/arch/i386/compile/DAWN i386
Architecture: i386
Machine: i386
>Description:
	/bin/ksh often crashes when used in pkgsrc (especially by the buildlink
	wrappers and libtool).  I haven't been able to determine why, but I
	have found a way to reproduce it.

>How-To-Repeat:
	Add this to your /etc/mk.conf:

	SH= /bin/ksh
	.SHELL: name=ksh path=/bin/ksh

	Rebuild and reinstall pkgsrc/devel/libtool-base, so that it uses ksh
	instead of sh.

	Go to pkgsrc/devel/libgnomeui, issue a 'make' and wait for the crash.
	The same should happen in pkgsrc/sysutils/gnome-vfs2 or even in
	pkgsrc/x11/gtk2.

	The backtrace:

	#0  0x481058da in strcmp () from /usr/lib/libc.so.12
	#1  0x0805b658 in error_prefix (fileline=1) at io.c:139
	#2  0x0805b532 in internal_errorf (jump=1, 
	    fmt=0x6c657665 <Address 0x6c657665 out of bounds>) at io.c:114
	#3  0x08061ee1 in aerror (ap=0x8079020, 
	    msg=0x6c657665 <Address 0x6c657665 out of bounds>) at main.c:838
	#4  0x08049fc4 in afree (ptr=0x1, ap=0x8079020) at alloc.c:421
	#5  0x0806ae22 in setstr (vq=0xb, s=0x81cb000 "", error_ok=1) at var.c:378
	#6  0x0805702f in execute (t=0x80d1bb0, flags=0) at exec.c:328
	#7  0x08056e21 in execute (t=0x8104950, flags=0) at exec.c:192
	#8  0x0805704a in execute (t=0x80d11b0, flags=0) at exec.c:329
	#9  0x08056e21 in execute (t=0x81079c0, flags=0) at exec.c:192
	#10 0x08056c75 in execute (t=0x80a2928, flags=0) at exec.c:159
	#11 0x08056e21 in execute (t=0x81b72b8, flags=0) at exec.c:192
	#12 0x08056c75 in execute (t=0x8089100, flags=0) at exec.c:159
	#13 0x080619cc in shell (s=0x80801a8, toplevel=1) at main.c:616
	#14 0x08061255 in main (argc=22, argv=0xbfbfe600) at main.c:429
	#15 0x0804990a in ___start ()
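
	A note on frames #2 and #3: both show the message pointer as
	0x6c657665, although aerror() is called with a string literal.  Read
	as four bytes in i386 (little-endian) memory order, that value is
	printable ASCII, which suggests, but does not prove, that string data
	ended up where a pointer was expected, or that the debugger is showing
	stale stack contents.  The helper below is only an illustration for
	decoding such a value; word_to_ascii() is a made-up name and is not
	part of ksh.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not ksh code): copy the four bytes of a 32-bit
 * value, lowest byte first (i386 memory order), into out as a
 * NUL-terminated string.  out must have room for 5 bytes.
 */
static void
word_to_ascii(uint32_t word, char out[5])
{
	int i;

	for (i = 0; i < 4; i++)
		out[i] = (char)((word >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

	Calling word_to_ascii(0x6c657665, buf) leaves the string "evel" in
	buf.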

	Looking at the code, the following lines in the afree() function in
	alloc.c are the ones that trigger the crash:

	if (dp < &bp->cell[NOBJECT_FIELDS] || dp >= bp->last)
	        aerror(ap, "freeing memory outside of block (corrupted?)");

	The code tries to print the error message (reaching this point is
	itself a bug, because the check has detected an inconsistency), and
	the error routine then crashes in turn.
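
	To make the check concrete, here is a minimal sketch of the same kind
	of range test.  It is not the ksh code: struct block, HDR_CELLS
	(standing in for NOBJECT_FIELDS), BLOCK_CELLS, block_init() and
	ptr_in_block() are all assumptions made for illustration, and the
	real allocator's layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_CELLS	2	/* hypothetical stand-in for NOBJECT_FIELDS */
#define BLOCK_CELLS	16	/* hypothetical block size, in cells */

/* Hypothetical, simplified block: header cells followed by data cells. */
struct block {
	void	*cell[BLOCK_CELLS];
	void	**last;		/* one past the final usable cell */
};

static void
block_init(struct block *bp)
{
	bp->last = &bp->cell[BLOCK_CELLS];
}

/*
 * Return 1 if ptr lies inside the data area of bp, 0 otherwise.
 * Addresses are compared as integers so that the test stays well
 * defined for pointers that do not point into the block at all.
 */
static int
ptr_in_block(const struct block *bp, const void *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t lo = (uintptr_t)&bp->cell[HDR_CELLS];
	uintptr_t hi = (uintptr_t)bp->last;

	return p >= lo && p < hi;
}
```

	With a pointer such as the ptr=0x1 shown in frame #4, a test of this
	shape fails, which is the condition under which afree() calls
	aerror().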

>Fix:
	I tried the pdksh found in pkgsrc, with the same results.

	I also tried OpenBSD's ksh, which worked fine.  They have replaced all
	of ksh's memory management (the alloc.c file) with their own; maybe we
	should use it too.
>Release-Note:
>Audit-Trail:
>Unformatted: