tech-kern archive
Re: Xen dom0 freeze after domU exit
Emmanuel Dreyfus <manu%netbsd.org@localhost> wrote:
> I follow up on my previous report: On XEN3_DOM0 kernel I very often freeze the
> kernel after a domU exits.
Enabling LOCKDEBUG turns the freeze into a panic:
Mutex error: mutex_vector_enter,520: spin lock held
lock address : 0xffffa0000102b5c8 type : spin
initialized : 0xffffffff8023a407
shared holds : 0 exclusive: 1
shares wanted: 0 exclusive: 0
current cpu : 0 last held: 0
current lwp : 0xffffa00000e005a0 last held: 0xffffa00000e005a0
last locked* : 0xffffffff8023b8a3 unlocked : 0xffffffff807c1074
owner field : 0x0000000000010600 wait/spin: 0/1
panic: LOCKDEBUG: Mutex error: mutex_vector_enter,520: spin lock held
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0xffffffff80205b75 cs 0xe030 rflags 0x246 cr2
0xffffa0002ccc8318 ilevel 0x8 rsp 0xffffa0002ccf7c30
curlwp 0xffffa00000e005a0 pid 0.39 lowest kstack 0xffffa0002ccf42c0
Stopped in pid 0.39 (system) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
snprintf() at netbsd:snprintf
lockdebug_more() at netbsd:lockdebug_more
mutex_enter() at netbsd:mutex_enter+0x32e
evcnt_detach() at netbsd:evcnt_detach+0x18
event_remove_handler() at netbsd:event_remove_handler+0x132
xbdback_disconnect() at netbsd:xbdback_disconnect+0x42
xbdback_frontend_changed() at netbsd:xbdback_frontend_changed+0x2ef
xenwatch_thread() at netbsd:xenwatch_thread+0xd9
Here is what happens: in xbdback_disconnect(), we take xbdi->xbdi_lock,
which is a spin lock (initialized with IPL_BIO):
mutex_enter(&xbdi->xbdi_lock);
Then, with that lock held, we call event_remove_handler(), which calls
evcnt_detach().
evcnt_detach() then tries to acquire a sleep lock (initialized with
IPL_NONE):
mutex_enter(&evcnt_lock);
Hence we perform an operation that may sleep while holding a spin lock,
which mutex(9) forbids:
> LWPs that own spin mutexes may not sleep, and therefore must
> not try to acquire adaptive mutexes or other sleep locks.
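
The rule quoted above can be illustrated with a toy userland model. This is only a sketch: the toy_* names are hypothetical, it is not NetBSD code, and it does not reproduce how LOCKDEBUG is actually implemented. It only shows the ordering constraint: once a spin mutex is owned, a request for an adaptive (sleep) mutex is refused.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userland model, not NetBSD code: every toy_* name is hypothetical. */
enum toy_lock_type { TOY_SPIN, TOY_ADAPTIVE };

struct toy_mutex {
	enum toy_lock_type type;
	bool held;
};

/* Number of spin mutexes the (single) caller currently owns. */
static int toy_spin_held;

/*
 * Try to take the lock.  Returns false, without taking it, when the
 * caller already owns a spin mutex and asks for an adaptive one.
 * That is the case mutex(9) forbids.
 */
static bool
toy_mutex_enter(struct toy_mutex *m)
{
	if (m->type == TOY_ADAPTIVE && toy_spin_held > 0)
		return false;
	assert(!m->held);	/* no recursive locking in this model */
	m->held = true;
	if (m->type == TOY_SPIN)
		toy_spin_held++;
	return true;
}

/* Release a lock previously taken with toy_mutex_enter(). */
static void
toy_mutex_exit(struct toy_mutex *m)
{
	assert(m->held);
	m->held = false;
	if (m->type == TOY_SPIN)
		toy_spin_held--;
}
```

In this model, entering a TOY_SPIN mutex and then a TOY_ADAPTIVE one makes the second toy_mutex_enter() return false, which mirrors the xbdi_lock then evcnt_lock sequence described above. Releasing the spin mutex first lets the adaptive one be taken.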
It seems the xbdi->xbdi_lock usage needs to change, but how? As I
understand it, xbdi->xbdi_evtchn does not change during the xbdi lifetime,
hence we could make the change below, which lets me destroy a Xen domU
without the LOCKDEBUG panic "spin lock held":
--- sys/arch/xen/xen/xbdback_xenbus.c.orig
+++ sys/arch/xen/xen/xbdback_xenbus.c
@@ -674,16 +675,17 @@
 static void
 xbdback_disconnect(struct xbdback_instance *xbdi)
 {
+	hypervisor_mask_event(xbdi->xbdi_evtchn);
+	event_remove_handler(xbdi->xbdi_evtchn, xbdback_evthandler,
+	    xbdi);
+
 	mutex_enter(&xbdi->xbdi_lock);
 	if (xbdi->xbdi_status == DISCONNECTED) {
 		mutex_exit(&xbdi->xbdi_lock);
 		return;
 	}
-	hypervisor_mask_event(xbdi->xbdi_evtchn);
-	event_remove_handler(xbdi->xbdi_evtchn, xbdback_evthandler,
-	    xbdi);
 	/* signal thread that we want to disconnect, then wait for it */
 	xbdi->xbdi_status = DISCONNECTING;
 	cv_signal(&xbdi->xbdi_cv);
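
The effect of the reordering can be sketched in a small userland model. Everything below is hypothetical (the toy_* names, the violations counter) and only stands in for the kernel code: toy_remove_handler() plays the role of event_remove_handler(), which may sleep, and spin_held plays the role of holding xbdi->xbdi_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userland model; every identifier here is hypothetical. */
struct toy_instance {
	bool spin_held;		/* stands in for owning xbdi->xbdi_lock */
	bool handler_present;	/* stands in for the registered event handler */
	int violations;		/* may-sleep calls made with the spin lock held */
};

/* Stand-in for event_remove_handler(): may sleep, so count misuse. */
static void
toy_remove_handler(struct toy_instance *xi)
{
	if (xi->spin_held)
		xi->violations++;
	xi->handler_present = false;
}

/* Ordering before the patch: handler removed under the spin lock. */
static void
toy_disconnect_old(struct toy_instance *xi)
{
	assert(!xi->spin_held);
	xi->spin_held = true;
	toy_remove_handler(xi);
	xi->spin_held = false;
}

/* Ordering after the patch: handler removed first, then the lock. */
static void
toy_disconnect_new(struct toy_instance *xi)
{
	assert(!xi->spin_held);
	toy_remove_handler(xi);
	xi->spin_held = true;
	/* the status update and cv_signal() would happen here */
	xi->spin_held = false;
}
```

Calling toy_disconnect_old() on an instance records one violation, while toy_disconnect_new() records none. The sketch does not model the DISCONNECTED early return. With the patch as written, the mask and handler removal now run before that check.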
Opinions?
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu%netbsd.org@localhost