NetBSD-Bugs archive
Re: kern/45718: processes sometimes get stuck and spin in vm_map
The following reply was made to PR kern/45718; it has been noted by GNATS.
From: Taylor R Campbell <campbell+netbsd%mumble.net@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc:
Subject: Re: kern/45718: processes sometimes get stuck and spin in vm_map
Date: Sun, 9 Dec 2012 06:13:43 +0000
Every once in a while I still find processes stuck like this. Tonight
I investigated closer with the help of crash(8). Here are some clues
I've gathered:
- The stuck processes are all sleeping in the cv_timedwait in
uvm_map_prepare, deep inside exec.
- I can still exec new processes, so the problem is *not* simply kva
exhaustion causing the uvm_km_alloc in exec_pool_alloc to hang
indefinitely waiting for kva -- all the requests for kva are for the
same size, so if I can make new requests, the old ones should be
serviceable too.
- Examination of the lwp structures and their l_timeout_ch members
reveals that the callouts for cv_timedwait are firing (the callouts'
c_time values keep changing), so it's not that the sleepq mechanism is
stuck or anything.
- Examination of kernel_map itself reveals that UVM_MAP_WANTVA is
persistently set.
So it looks like something freed up kva, but failed to signal to the
waiters that the kva was freed up.
I looked around for a race condition surrounding map->flags, but
although there is a wacky locking dance surrounding vm_maps, I didn't
see anything obvious there: uvm_unmap_remove looks like it does the
right thing to signal to the waiters. However, I suspect that
uvm_map_replace and uvm_map_extract can free up space in the map, and
neither of them signals UVM_MAP_WANTVA waiters.
There may be other places in uvm_map.c that free up space -- it's huge
and I haven't gone through it all. Is it plausible that at least
uvm_map_replace and uvm_map_extract, and perhaps other parts of
uvm_map.c as well, need to signal UVM_MAP_WANTVA waiters?
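To make the suspected failure mode concrete, here is a toy,
single-threaded model in C. It is only a sketch of the hypothesis, not
the uvm code: struct toy_map, TOY_MAP_WANTVA and every toy_* function
are made-up names, the wakeups counter merely stands in for a condvar
broadcast, and the waiter's loop condition (keep sleeping while the flag
is set) is an assumption about how the wait loop behaves, not something
taken from uvm_map.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, named after the UVM_MAP_WANTVA discussed above. */
#define TOY_MAP_WANTVA 0x01u

/* Toy stand-in for a map; not the real struct vm_map. */
struct toy_map {
	size_t   free_va;  /* bytes of address space currently free */
	unsigned flags;    /* TOY_MAP_WANTVA while somebody is waiting */
	unsigned wakeups;  /* broadcasts sent so far (stand-in for a condvar) */
};

/* Try to allocate; on failure, record that a waiter wants space. */
bool
toy_alloc(struct toy_map *map, size_t size)
{
	if (map->free_va >= size) {
		map->free_va -= size;
		return true;
	}
	map->flags |= TOY_MAP_WANTVA;
	return false;
}

/* A freeing path that remembers to notify the waiters. */
void
toy_free_signal(struct toy_map *map, size_t size)
{
	map->free_va += size;
	if (map->flags & TOY_MAP_WANTVA) {
		map->flags &= ~TOY_MAP_WANTVA;
		map->wakeups++;
	}
}

/* A freeing path that returns space but never notifies anyone. */
void
toy_free_silent(struct toy_map *map, size_t size)
{
	map->free_va += size;
}

/*
 * Assumed waiter loop condition: a timed wait that expires just
 * re-checks this and goes back to sleep while the flag is still set.
 */
bool
toy_waiter_keeps_sleeping(const struct toy_map *map)
{
	return (map->flags & TOY_MAP_WANTVA) != 0;
}
```

In this model, a failed toy_alloc followed by toy_free_silent leaves
free_va large enough for the request, yet the flag stays set and
wakeups stays at zero, so a waiter that re-checks the flag on every
timeout never makes progress. Only a later toy_free_signal clears the
flag.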
Caveat: I am currently using a kernel from back in March, because
something broke related to drm in more recent kernels, but uvm hasn't
changed much lately, so I suspect the problem is still here.