NetBSD-Bugs archive


kern/56952: UVM deadlock in madvise vs. munmap

>Number:         56952
>Category:       kern
>Synopsis:       UVM deadlock in madvise vs. munmap
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 03 20:20:00 +0000 2022
>Originator:     David A. Holland
>Release:        NetBSD 9.99.97 (20220602)
System: NetBSD valkyrie 9.99.97 NetBSD 9.99.97 (VALKYRIE_LOCKDEBUG) #1: Wed Jun 22 23:56:00 EDT 2022  dholland@valkyrie:/usr/src/sys/arch/amd64/compile/VALKYRIE_LOCKDEBUG amd64
Architecture: x86_64
Machine: amd64

I have hit a deadlock a few times while running some database stress
tests, and today I caught it with UVM_PAGE_TRKOWN enabled.

The dead state is as follows:

Thread 1 is in madvise(MADV_DONTNEED) and is holding a read lock on
the process's map. It is waiting in putpages to chuck one of the pages.

Thread 2 is in uvm_fault_internal; it is holding the page and trying
to get a read lock on the map.

Thread 3 is in munmap; it is waiting for a write lock on the map. The
pending writer blocks thread 2's new read lock, and that converts this
into a deadlock.

(This is all in one process.)

Taylor constructed the following narrative for how it got this way
(any transcription errors are my fault):

<Riastradh> Presumably you have an object foo which is mapped at
   0xdeadbee000 in the address space
<Riastradh> 1. Someone tried to read from page 0xdeadbef000, say,
   which is the range [0x1000, 0x2000) in foo.
<Riastradh> They consulted the map, which resolved the address to that
   range in foo.
<Riastradh> They released the map lock, then allocated a page and
   punched it into foo, and they want to reacquire the map lock to
   punch it into the pmap.
<Riastradh> 2. Someone else tried to madvise(MADV_DONTNEED) some
   range, say [0xdeadbee000, 0xdeadbf6000), in foo, and chuck all the
   pages.
<Riastradh> Took the map read lock to see that 0xdeadbef000 is mapped to
   foo@0x1000, entered genfs_io_chuck_all_the_pages or whatever, and
   then started waiting for the page that (1) allocated.
<Riastradh> Except I got the order wrong again and this last player
   actually started first, but whatever.
<Riastradh> 3. At the same time, someone else tried to unmap
   0xdeadbef000, which requires taking a _write_ lock.
<Riastradh> which threw a wrench in the whole thing
<Riastradh> So, one obvious possibility is: make uvm_map_clean drop
   the map lock while doing genfs_io_chuck_all_the_pages.
<Riastradh> (pgo_put)
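
The shape of that suggestion, as untested pseudocode only (the real
change would have to revalidate the entry after relocking, since the
map can change while unlocked; names follow NetBSD UVM conventions):

```c
	/*
	 * Sketch: in uvm_map_clean(), for each entry being flushed,
	 * drop the map lock around the pgo_put call so a concurrent
	 * fault can take its read lock and release the page.
	 */
	uobj = entry->object.uvm_obj;
	offset = entry->offset + (start - entry->start);
	size = MIN(end, entry->end) - start;

	vm_map_unlock_read(map);		/* drop map lock first */
	rw_enter(uobj->vmobjlock, RW_WRITER);
	error = (uobj->pgops->pgo_put)(uobj, offset, offset + size, flags);
	vm_map_lock_read(map);			/* reacquire and revalidate */
```

With the map unlocked during pgo_put, thread 2 in the scenario above
can acquire its read lock and unbusy the page, letting thread 1's flush
complete and thread 3's writer proceed.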



