Subject: kern/26803: sigexit() has no barrier for other LWPs
To: None <>
From: None <>
List: netbsd-bugs
Date: 08/29/2004 15:16:52
>Number:         26803
>Category:       kern
>Synopsis:       sigexit() has no barrier for other LWPs
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Aug 29 22:16:00 UTC 2004
>Originator:     Jason R Thorpe
>Release:        NetBSD 2.0G
        -- Jason R. Thorpe <>
System: NetBSD 2.0G NetBSD 2.0G (YEAH-BABY-XP) #26: Thu Jul 15 08:26:49 PDT 2004 i386
Architecture: i386
Machine: i386
	sigexit() has a flaw for multi-threaded programs: while it
	sets a userret hook to suspend other LWPs, it doesn't wait
	for them to actually suspend.

	This means that other LWPs in the process that are
	sleeping in the kernel may wake up and modify the process's
	address space while the core dump is taking place.
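
	To illustrate the kind of barrier that is missing, here is a
	minimal userspace sketch using pthreads. It is an analogy only,
	not NetBSD kernel code: struct proc_model, worker(),
	suspend_all_and_snapshot() and run_model() are all hypothetical
	names made up for this example. The "dumper" sets a suspend
	request (standing in for the userret hook) and then blocks until
	every worker has acknowledged that it is parked, which is the
	step sigexit() currently skips.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

#define NWORKERS 4

/* Hypothetical userspace stand-in for per-process state (not kernel code). */
struct proc_model {
	pthread_mutex_t lock;
	pthread_cond_t  all_parked;  /* signalled when the last worker parks */
	pthread_cond_t  resume;      /* broadcast when the "dump" is finished */
	bool            suspend_req; /* analogue of the suspend hook being set */
	bool            dump_done;
	int             nparked;     /* workers that have actually stopped */
	long            shared;      /* stands in for the address space */
};

static struct proc_model pm = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.all_parked = PTHREAD_COND_INITIALIZER,
	.resume = PTHREAD_COND_INITIALIZER,
};

/* Worker: keep mutating shared state until asked to suspend, then park. */
static void *worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&pm.lock);
	while (!pm.suspend_req) {
		pm.shared++;
		pthread_mutex_unlock(&pm.lock);
		sched_yield();
		pthread_mutex_lock(&pm.lock);
	}
	/* Acknowledge the suspension; the last worker wakes the dumper. */
	if (++pm.nparked == NWORKERS)
		pthread_cond_signal(&pm.all_parked);
	while (!pm.dump_done)
		pthread_cond_wait(&pm.resume, &pm.lock);
	pthread_mutex_unlock(&pm.lock);
	return NULL;
}

/*
 * Request suspension, then wait until every worker has parked.
 * Returns how much the shared state changed during the "dump" window.
 */
static long suspend_all_and_snapshot(void)
{
	long before, after;

	pthread_mutex_lock(&pm.lock);
	pm.suspend_req = true;
	/* The barrier: setting the flag alone is not enough. */
	while (pm.nparked < NWORKERS)
		pthread_cond_wait(&pm.all_parked, &pm.lock);
	before = pm.shared;
	pthread_mutex_unlock(&pm.lock);

	/* The core dump would happen here, with the lock released. */

	pthread_mutex_lock(&pm.lock);
	after = pm.shared;
	pm.dump_done = true;
	pthread_cond_broadcast(&pm.resume);
	pthread_mutex_unlock(&pm.lock);
	return after - before;
}

/* Start the workers, run one suspend/dump cycle, and reap the workers.
 * The model state is never reset, so call this only once per process. */
long run_model(void)
{
	pthread_t tid[NWORKERS];
	long drift;
	int i, rc;

	for (i = 0; i < NWORKERS; i++) {
		rc = pthread_create(&tid[i], NULL, worker, NULL);
		assert(rc == 0);
		(void)rc;
	}
	drift = suspend_all_and_snapshot();
	for (i = 0; i < NWORKERS; i++)
		pthread_join(tid[i], NULL);
	return drift;
}
```

	Calling run_model() from a main() should report no change to
	the shared counter during the dump window. If the wait on
	all_parked is removed, workers can still be running when the
	snapshot is taken, which is the userspace equivalent of the
	race described above.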

	Another issue (which even has an XXX in the code) is that
	other LWPs that might be running in userspace on other
	processors don't get jolted into the kernel to suspend
	themselves; there is simply no code to do this.

	I believe the lack of a barrier has something to do with
	corrupted core files being dumped by a multi-threaded
	application I am working with that performs a lot of
	mmap / write (and thus modifies the process's VM map and
	sleeps a lot while doing so).

	I will work on a simple test case to show the problematic