Subject: kern/19260: ELF core dumps can be inconsistent
To: None <gnats-bugs@gnats.netbsd.org>
From: None <nathanw@wasabisystems.com>
List: netbsd-bugs
Date: 12/03/2002 19:48:40
>Number:         19260
>Category:       kern
>Synopsis:       ELF core dumps can be inconsistent
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 03 16:50:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     Nathan J. Williams
>Release:        NetBSD 1.6K 2 December 2002
>Organization:
	Wasabi Systems, Inc.
>Environment:
System: NetBSD marvin-the-martian.stuartst.com 1.6K NetBSD 1.6K (MARVIN.MP) #9: Mon Dec 2 10:51:25 EST 2002 nathanw@marvin-the-martian.stuartst.com:/u1/nbsd/src/sys/arch/i386/compile/MARVIN.MP i386
Architecture: i386
Machine: i386
>Description:
	
The ELF core dump code makes three passes over a process's memory map:
one to count the number of sections, one to generate header information
for each section, and one to actually write out the sections.

The third pass, which writes out the sections, can change the set of
entries in the memory map by splitting a large entry. The iteration
performed by uvm_coredump_walkmap() then walks over the newly split-off
entry after walking over the large one, so that data is written twice and
the offsets of all subsequent data no longer match the locations given in
the headers. This wreaks havoc if you actually look at the core dump in a
debugger.
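
The effect can be reproduced outside the kernel with a toy model. The
following is only a sketch under stated assumptions: struct toy_entry,
struct toy_map and all toy_* functions are hypothetical stand-ins, not UVM
code, and the "fault" is simulated by splitting the entry that contains
split_at while it is being "written".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for struct vm_map_entry: a [start, end) range
 * on a circular doubly linked list with a sentinel header.
 */
struct toy_entry {
	struct toy_entry *prev, *next;
	unsigned long start, end;
};

struct toy_map {
	struct toy_entry header;	/* sentinel, like map->header */
};

static void
toy_map_init(struct toy_map *map)
{
	map->header.prev = map->header.next = &map->header;
	map->header.start = map->header.end = 0;
}

/* Link a new entry covering [start, end) right after pos. */
static struct toy_entry *
toy_insert_after(struct toy_entry *pos, unsigned long start,
    unsigned long end)
{
	struct toy_entry *e;

	assert(start < end);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return NULL;
	e->start = start;
	e->end = end;
	e->prev = pos;
	e->next = pos->next;
	pos->next->prev = e;
	pos->next = e;
	return e;
}

/* Append an entry at the tail of the map. */
static struct toy_entry *
toy_map_append(struct toy_map *map, unsigned long start, unsigned long end)
{
	return toy_insert_after(map->header.prev, start, end);
}

/* Analogue of the header pass: the number of bytes the headers promise. */
static unsigned long
toy_header_bytes(const struct toy_map *map)
{
	const struct toy_entry *e;
	unsigned long total = 0;

	for (e = map->header.next; e != &map->header; e = e->next)
		total += e->end - e->start;
	return total;
}

/*
 * Analogue of the write pass with the unpatched iteration. Returns the
 * number of bytes "written", or 0 if an allocation fails.
 */
static unsigned long
toy_write_pass(struct toy_map *map, unsigned long split_at)
{
	struct toy_entry *e;
	unsigned long written = 0;

	for (e = map->header.next; e != &map->header; e = e->next) {
		unsigned long start = e->start, end = e->end;

		/* Simulated fault mid-write: clip e to [start, split_at). */
		if (start < split_at && split_at < end) {
			if (toy_insert_after(e, split_at, end) == NULL)
				return 0;
			e->end = split_at;
		}
		written += end - start;	/* the full original range goes out */
	}
	return written;
}

/* Release every entry and leave the map empty. */
static void
toy_map_free(struct toy_map *map)
{
	while (map->header.next != &map->header) {
		struct toy_entry *e = map->header.next;

		map->header.next = e->next;
		e->next->prev = &map->header;
		free(e);
	}
}
```

With entries [0x1000,0x3000), [0x4000,0x8000) and [0x9000,0xa000) and a
split at 0x6000, toy_header_bytes() reports 0x7000 bytes while
toy_write_pass() writes 0x9000: the split-off tail [0x6000,0x8000) goes
out a second time, and everything after it lands 0x2000 bytes later than
the headers say.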

The path for splitting the map entry is (on a local filesystem):

coredump_writesegs_elf32 -> vn_rdwr -> VOP_WRITE -> ffs_write -> i386_copyin ->
 -trap- -> trap -> uvm_fault -> uvmfault_amapcopy -> uvm_map_clip_end.


>How-To-Repeat:

Install xmms from pkgsrc, and send it SIGABRT. Note that the core dump
is much larger (probably about 2M larger) than the headers indicate it
should be, and that data after the split point isn't where the headers
say it is.

>Fix:

Passing FALSE instead of TRUE for the "canchunk" parameter from
uvmfault_amapcopy() to amap_copy() avoids the problem but probably has
other side effects. It also doesn't avoid any other ways the map could
change (are there any?).

Changing the iteration of uvm_coredump_walkmap() slightly to get the next
entry before performing the possibly-splitting function call, as in the
following patch, also appears to avoid the problem, although I'm not 100%
sure what else can happen to the list of map entries during the write process.

Index: uvm_glue.c
===================================================================
RCS file: /cvsroot/syssrc/sys/uvm/uvm_glue.c,v
retrieving revision 1.44.2.19
diff -u -r1.44.2.19 uvm_glue.c
--- uvm_glue.c	2002/10/18 03:42:16	1.44.2.19
+++ uvm_glue.c	2002/12/04 00:46:27
@@ -702,14 +702,14 @@
 	struct uvm_coredump_state state;
 	struct vmspace *vm = p->p_vmspace;
 	struct vm_map *map = &vm->vm_map;
-	struct vm_map_entry *entry;
+	struct vm_map_entry *entry, *next;
 	vaddr_t maxstack;
 	int error;
 
 	maxstack = trunc_page(USRSTACK - ctob(vm->vm_ssize));
 
-	for (entry = map->header.next; entry != &map->header;
-	     entry = entry->next) {  
+	for (entry = map->header.next, next = entry->next;
+	     entry != &map->header; entry = next, next = entry->next) {  
 		/* Should never happen for a user process. */
 		if (UVM_ET_ISSUBMAP(entry))
 			panic("uvm_coredump_walkmap: user process with "
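
The idea behind the patch can be checked in isolation with a small
userland model. This is only a sketch, not kernel code: struct range,
struct range_list, struct walk_state and the functions below are
hypothetical names, and the split during the "write" is simulated rather
than caused by a real fault.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for a map entry covering [start, end). */
struct range {
	struct range *prev, *next;
	unsigned long start, end;
};

/* Circular list with a sentinel header and a fixed pool of entries. */
struct range_list {
	struct range header;
	struct range pool[POOL_SIZE];
	size_t used;
};

struct walk_state {
	unsigned long split_at;	/* address at which the "fault" splits */
	unsigned long bytes;	/* bytes "written" so far */
	int visits;		/* entries handed to the callback */
};

static void
list_init(struct range_list *l)
{
	l->header.prev = l->header.next = &l->header;
	l->header.start = l->header.end = 0;
	l->used = 0;
}

/* Link a new range right after pos; returns NULL if the pool is full. */
static struct range *
insert_after(struct range_list *l, struct range *pos, unsigned long start,
    unsigned long end)
{
	struct range *r;

	assert(start < end);
	if (l->used == POOL_SIZE)
		return NULL;
	r = &l->pool[l->used++];
	r->start = start;
	r->end = end;
	r->prev = pos;
	r->next = pos->next;
	pos->next->prev = r;
	pos->next = r;
	return r;
}

/* "Write" one range; split it mid-write if it contains split_at. */
static int
write_and_split(struct range_list *l, struct range *r,
    struct walk_state *st)
{
	unsigned long start = r->start, end = r->end;

	if (start < st->split_at && st->split_at < end) {
		if (insert_after(l, r, st->split_at, end) == NULL)
			return -1;
		r->end = st->split_at;
	}
	st->bytes += end - start;
	st->visits++;
	return 0;
}

/* Walk in the style of the patched loop: fetch next before the call. */
static int
walk_saving_next(struct range_list *l, struct walk_state *st)
{
	struct range *r, *next;
	int error;

	for (r = l->header.next, next = r->next; r != &l->header;
	    r = next, next = r->next) {
		error = write_and_split(l, r, st);
		if (error)
			return error;
	}
	return 0;
}
```

With three ranges and a split inside the second one, the walk visits
exactly three entries and accounts for each byte once, even though the
list holds four entries afterwards. The saved pointer only protects
against an entry being inserted directly after the current one; if the
call could remove or free the saved entry, this loop would follow a stale
pointer, which is the open question raised above.
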
>Release-Note:
>Audit-Trail:
>Unformatted: