Subject: why does 'ld -n' generate programs that get an immediate 'Abort' on -current?
To: NetBSD Toolchain Technical Discussion List <tech-toolchain@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: tech-toolchain
Date: 07/15/2001 14:16:09

I was porting some old Unix code to NetBSD/i386 last night, and once I
finally got it all to compile I was stunned to find that not only would
it not run, but even GDB couldn't stop it at main()!

Finally this morning I discovered that the old makefile had "-n" in the
bunch of LDFLAGS being passed to the linker.

Even the one-line hello.c dies the same way.  It simply drops out with
"Abort" immediately:

	$ echo 'main(){printf("hello world\n");}' > hello.c
	$ cc -o hello -n hello.c
	$ ./hello
	Abort
	$ gdb ./hello
	GNU gdb 4.17
	Copyright 1998 Free Software Foundation, Inc.
	GDB is free software, covered by the GNU General Public License, and you are
	welcome to change it and/or distribute copies of it under certain conditions.
	Type "show copying" to see the conditions.
	There is absolutely no warranty for GDB.  Type "show warranty" for details.
	This GDB was configured as "i386--netbsd"...(no debugging symbols found)...
	(gdb) break main
	Breakpoint 1 at 0x8048393
	(gdb) run
	Starting program: /work/woods/tmp/./hello 
	
	Program terminated with signal SIGABRT, Aborted.
	The program no longer exists.
	You can't do that without a process to debug
	(gdb) quit

Now that's _really_ weird!

Luckily, though, I discovered right away that this also causes a kernel
built with -DDEBUG to complain:

Jul 15 13:59:28 proven /netbsd: vmcmd[0] = 0x8048080/0xa000 @ 0x80
Jul 15 13:59:28 proven /netbsd: execve: vmcmd 0 failed: 22

$ errno 22
#define EINVAL  22        /* Invalid argument */

Hmmm..... what's that about?  It appears this function-pointer call in
kern_exec.c fails while the VM segments for the new process are being
created:

                error = (*vcp->ev_proc)(p, vcp);

I'm not well enough versed in the VM bits of NetBSD, so that's about
as far as I think I can go without further hints....

Turns out this happens on -current i386 and sparc but not 1.3.3 sparc,
so this is beginning to look like either an ELF or maybe a UVM issue.


Now I'd forgotten all about '-n', but it seems it's still in GNU ld:

       -n     sets  the  text segment to be read only, and NMAGIC
              is written if possible.

But why would this cause an instant abort before main() on NetBSD ELF?
Is, as the syslog message suggests, the -current kernel incapable of
running binaries created with '-n', and if so, why?  Can it be fixed?
Or should it just be documented?  Should I send-pr?  Or should this move
to tech-kern?

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>