NetBSD-Bugs archive


Re: port-xen/55207: netbsd domU does not migrate properly from one xen host to another



The following reply was made to PR port-xen/55207; it has been noted by GNATS.

From: Jaromír Doleček <jaromir.dolecek%gmail.com@localhost>
To: "gnats-bugs%NetBSD.org@localhost" <gnats-bugs%netbsd.org@localhost>
Cc: 
Subject: Re: port-xen/55207: netbsd domU does not migrate properly from one
 xen host to another
Date: Mon, 11 May 2020 22:06:40 +0200

 FYI: I investigated this a bit with -current. One way to trigger what
 seems to be a similar (if not the same) problem is to simply run xl save:
 
 asus-beast# xl save avx2 /mnt/zz-save
 Saving to /mnt/zz-save new xl format (info 0x3/0x0/1291)
 xc: info: Saving domain 4, type x86 PV
 libxl: error: libxl_dom_suspend.c:262:domain_suspend_common_pvcontrol_suspending: Domain 4:guest didn't acknowledge suspend, cancelling request
 xc: error: Domain has not been suspended: shutdown 0, reason 255: Internal error
 xc: error: Save failed (0 = Undefined error: 0): Internal error
 libxl: error: libxl_stream_write.c:350:libxl__xc_domain_save_done: Domain 4:saving domain: domain did not respond to suspend request: Undefined error: 0
 Failed to save domain, resuming domain
 xc: error: Dom 4 not suspended: (shutdown 0, reason 255): Internal error
 libxl: error: libxl_dom_suspend.c:472:libxl__domain_resume: Domain 4:xc_domain_resume failed: Invalid argument
 
 With a LOCKDEBUG kernel, this triggers the following panic in the DomU:
 
 login: [ 342.0400600] xenbus_shutdown_handler: xenbus_rm 13
 [ 342.2800364] Flushing disk caches: 1 done
 [ 342.2900473] Mutex error: mutex_vector_enter,514: spin lock held
 
 [ 342.2900473] lock address : 0xffff9e80020121d0 type     :               spin
 [ 342.2900473] initialized  : 0xffffffff802133f9
 [ 342.2900473] shared holds :                  0 exclusive:                  1
 [ 342.2900473] shares wanted:                  0 exclusive:                  0
 [ 342.2900473] relevant cpu :                  0 last held:                  0
 [ 342.2900473] relevant lwp : 0xffff9e8002577b80 last held: 0xffff9e8002577b80
 [ 342.2900473] last locked* : 0xffffffff80212370 unlocked : 0xffffffff80213b96
 [ 342.2900473] owner field  : 0x0000000000010600 wait/spin:                0/1
 
 [ 342.2900473] panic: LOCKDEBUG: Mutex error: mutex_vector_enter,514: spin lock held
 [ 342.2900473] cpu0: Begin traceback...
 [ 342.2900473] vpanic() at netbsd:vpanic+0x146
 [ 342.2900473] snprintf() at netbsd:snprintf
 [ 342.2900473] lockdebug_more() at netbsd:lockdebug_more
 [ 342.2900473] mutex_enter() at netbsd:mutex_enter+0x342
 [ 342.2900473] event_remove_handler() at netbsd:event_remove_handler+0x26
 [ 342.2900473] xbd_xenbus_suspend() at netbsd:xbd_xenbus_suspend+0x91
 [ 342.2900473] device_pmf_driver_suspend() at netbsd:device_pmf_driver_suspend+0x46
 [ 342.2900473] pmf_device_suspend_locked() at netbsd:pmf_device_suspend_locked+0xeb
 [ 342.2900473] pmf_device_suspend() at netbsd:pmf_device_suspend+0x45
 [ 342.2900473] pmf_system_suspend() at netbsd:pmf_system_suspend+0xba
 [ 342.2900473] sysctl_xen_suspend() at netbsd:sysctl_xen_suspend+0xf1
 [ 342.2900473] sysctl_dispatch() at netbsd:sysctl_dispatch+0xa3
 [ 342.2900473] sys___sysctl() at netbsd:sys___sysctl+0xc5
 [ 342.2900473] syscall() at netbsd:syscall+0x9c
 [ 342.2900473] --- syscall (number 202) ---
 
 Without LOCKDEBUG, trying to acquire the same spin lock a second time
 would simply deadlock, which seems to match the original report.
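 
 To illustrate the difference, here is a minimal userspace sketch of a
 non-recursive spin lock with owner tracking. It is not NetBSD's kmutex
 implementation; every toy_* name is hypothetical, and the bounded spin
 count exists only so that the demonstration terminates instead of
 hanging. In the lock dump above, "relevant lwp" and "last held" are the
 same LWP, which is the situation the sketch models.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive spin lock (hypothetical; not NetBSD's kmutex). */
struct toy_spin {
	atomic_int owner;	/* 0 = free, otherwise the holder's id */
};

enum toy_result {
	TOY_ACQUIRED,		/* lock taken */
	TOY_SELF_DEADLOCK,	/* debug check: caller already holds the lock */
	TOY_GAVE_UP		/* spin budget exhausted (stands in for a hang) */
};

/*
 * Try to take the lock for `self` (must be nonzero).
 * With `debug` set, re-acquisition by the current holder is reported
 * immediately, loosely analogous to what a debugging kernel can do.
 * Without it, the caller just spins; a real spin lock would spin
 * forever, here the loop is capped at `max_spins` iterations.
 */
static enum toy_result
toy_spin_enter(struct toy_spin *l, int self, bool debug,
    unsigned long max_spins)
{
	assert(self != 0);

	if (debug && atomic_load(&l->owner) == self)
		return TOY_SELF_DEADLOCK;

	for (unsigned long i = 0; i < max_spins; i++) {
		int expected = 0;
		if (atomic_compare_exchange_strong(&l->owner, &expected, self))
			return TOY_ACQUIRED;
	}
	return TOY_GAVE_UP;
}

/* Release the lock; only the holder may do so. */
static void
toy_spin_exit(struct toy_spin *l, int self)
{
	assert(atomic_load(&l->owner) == self);
	atomic_store(&l->owner, 0);
}

/*
 * Hypothetical call pattern: one thread takes the lock and then, while
 * still holding it, tries to take it again. Returns the result of the
 * second attempt.
 */
static enum toy_result
toy_nested_enter(bool debug)
{
	struct toy_spin lock;
	const int self = 1;
	enum toy_result r;

	atomic_init(&lock.owner, 0);

	r = toy_spin_enter(&lock, self, debug, 1000);
	if (r != TOY_ACQUIRED)
		return r;

	r = toy_spin_enter(&lock, self, debug, 1000);	/* nested attempt */
	toy_spin_exit(&lock, self);
	return r;
}
```

 Calling toy_nested_enter(true) reports the self-deadlock right away,
 while toy_nested_enter(false) burns through its spin budget and gives
 up, which is the bounded stand-in for a guest that never acknowledges
 the suspend request.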
 
 Continuing investigation.
 

