A new-ish remote Xen server (that had never been rebooted, sigh) crashed and failed to reboot yesterday.  It seemed to get its network configured, but unfortunately it didn't seem to be starting sshd.  Worse, when an on-site person watched it boot, it cleared the screen quickly after showing the boot messages but never presented a login prompt, so there was no chance of really seeing how far it got (at least not without watching a video of the screen and pausing just before it blanked).  Unfortunately the swap partition had not yet been enabled, so no crash dump from the original crash either.

It's running NetBSD-7.1 (amd64) release with Xen-4.8:

# uname -a
NetBSD xentral 7.1 NetBSD 7.1 (XEN3_DOM0.201703111743Z) amd64
# pkg_info -c xentools\* | sed 1q
Information for xentools48-4.8.0:
# pkg_info -c xenkern\* | sed 1q
Information for xenkernel48-4.8.0nb1:

It would still boot a GENERIC kernel and nothing seemed wrong from there, so after some staring at logs I decided to comment out the xendomains list and try booting the Xen dom0 kernel again.  This time it booted fine, as expected, so I tried starting the first domU:

# xl create /usr/pkg/etc/xen/b2.conf
Parsing config from /usr/pkg/etc/xen/b2.conf
load: 0.06 cmd: sh 1767 [pipe_rd] 0.00u 0.00s 0% 1432k
load: 0.06 cmd: sh 1767 [pipe_rd] 0.00u 0.00s 0% 1432k
load: 0.04 cmd: sh 1767 [pipe_rd] 0.00u 0.00s 0% 1432k
libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /usr/pkg/etc/xen/scripts/block /local/domain/0/backend/vbd/1/0 because of timeout
libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /usr/pkg/etc/xen/scripts/block /local/domain/0/backend/vbd/1/4 because of timeout
libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /usr/pkg/etc/xen/scripts/block /local/domain/0/backend/vbd/1/5 because of timeout
libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /usr/pkg/etc/xen/scripts/block /local/domain/0/backend/vbd/1/6 because of timeout
libxl: error: libxl_create.c:1255:domcreate_launch_dm: unable to add disk devices
xenstore-read: couldn't read path /local/domain/0/backend/vbd/1/6/vnd
load: 0.02 cmd: xenstore-read 1646 [tstile] 0.00u 0.00s 0% 1200k

Oh, oh, there's my problem!  Starting a domU deadlocks in the kernel!

First things first: add "sshd" to the "# REQUIRE:" line in the xendomains rc script (/etc/rc.d/xendomains)!  This was the main difficulty that required an on-site person to help reboot -- there wasn't even a login prompt on the console when it deadlocked.  Both local and remote access MUST be enabled before trying to start any domU!!!

Now, back to debugging.  Here's the config for the sticky domU:

# cat /usr/pkg/etc/xen/b2.conf | grep -v '^#'
name = "b2"
uuid = "96e60a02-9fc7-11e7-9f0d-d4ae528b9511"
kernel = "/netbsd-7.99.34-XEN3_DOMU"
extra = "root=xbd0a"
memory = 10000
maxmem = 16000
vcpus = 8
vif = [ 'bridge=bridge0' ]
disk = [
        'format=raw, vdev=0x0, access=rw, target=/dev/mapper/vg0-lv10',
        'format=raw, vdev=0x1, access=rw, target=/dev/mapper/vg1-lv11',
        'format=raw, vdev=0x2, access=rw, target=/dev/mapper/vg0-lv12',
        'format=raw, vdev=0x3, access=rw, target=/dev/mapper/vg0-lv13',
        'format=raw, vdev=0x4, access=rw, target=/dev/mapper/vg0-lv14',
        'format=raw, vdev=0x5, access=rw, target=/dev/mapper/vg1-lv15',
        'format=raw, vdev=0x6, access=ro, devtype=cdrom, target=/images/NetBSD-7.99.34-amd64.iso'
]

OK, the error message suggested something wrong with vbd#6, so why the sudden problem with /images?
# ls -l /images
total 8639360
-r--r--r-- 1 woods users 715941888 Sep 19 20:26 NetBSD-5.2_STABLE-amd64.iso
-rw-r--r-- 1 woods users 391266304 Mar 11 2017 NetBSD-7.1-amd64.iso
-r--r--r-- 1 woods users 3313895424 Sep 20 00:12 NetBSD-7.99.34-amd64.iso

Hmmm...  I forget, is /images a filesystem?

# df

[[ twiddle thumbs for a moment ... then hit <Ctrl-T> (how do linuxoids do without it!?!?!?!) ]]

load: 0.01 cmd: df 2245 [tstile] 0.00u 0.00s 0% 1204k

Oh oh, that IS a nasty deadlock for sure!

I googled and read some old email messages and bug reports....  Then I ran "crash":

crash> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
386 > 1 7 0 0 ffffa00005ce04c0 crash
4397 1 3 0 80 ffffa00005c9f8c0 ksh pause
3213 1 3 0 80 ffffa00005c63060 ksh pause
1828 1 2 0 0 ffffa00005c63480 sshd
2795 1 3 0 80 ffffa00005bcb020 sshd select
2865 1 3 0 0 ffffa00005c638a0 ls tstile
2471 1 3 0 80 ffffa00005b59780 ksh pause
1464 1 3 0 80 ffffa00005c16880 sshd select
2598 1 3 0 80 ffffa00005c16460 sshd select
2245 1 3 0 1000000 ffffa000050f4ac0 df tstile
416 1 3 0 80 ffffa00005bb9840 ksh pause
1827 1 3 0 80 ffffa00005b59360 sshd select
1576 1 3 0 80 ffffa00004a00600 sshd select
1292 1 3 0 0 ffffa00005bb9420 xl tstile
1785 1 3 0 1000000 ffffa00005b26340 xl tstile
1264 1 3 0 80 ffffa00005b59ba0 ksh pause
905 1 3 0 0 ffffa00005bb9000 xenstore-read tstile
907 1 3 0 0 ffffa00005bcb860 xenstore-read tstile
1646 1 3 0 0 ffffa00005bcb440 xenstore-read tstile
906 1 3 0 80 ffffa00005c16040 xenstore-read xbrd
849 2 3 0 80 ffffa000059e1320 xl netio
849 1 3 0 80 ffffa00005b26b80 xl select
1373 1 3 0 80 ffffa000059e1b60 ksh pause
1219 1 3 0 80 ffffa00005b26760 ksh pause
1159 1 3 0 80 ffffa00005953b40 sshd select
322 1 3 0 80 ffffa000049eb1c0 sshd select
703 1 3 0 80 ffffa000049eba00 ksh pause
762 1 3 0 80 ffffa00004a1f620 sshd select
732 1 3 0 80 ffffa00004a001e0 sshd select
693 1 3 0 80 ffffa000059e1740 getty ttyraw
699 1 3 0 80 ffffa000059442e0 getty ttyraw
596 1 3 0 80 ffffa00004a00a20 getty ttyraw
531 1 3 0 80 ffffa000049eb5e0 getty ttyraw
630 1 3 0 80 ffffa00004870120 getty ttyraw
658 1 3 0 80 ffffa00005944700 cron nanoslp
644 1 3 0 80 ffffa000050c7aa0 inetd kqueue
289 1 3 0 80 ffffa00005953720 qmgr kqueue
290 1 3 0 80 ffffa00004e4fa80 pickup kqueue
702 1 3 0 80 ffffa00005953300 master kqueue
508 1 3 0 80 ffffa00005944b20 unbound kqueue
498 2 3 0 80 ffffa000050f46a0 xenconsoled netio
498 1 3 0 80 ffffa00004a1fa40 xenconsoled select
474 1 3 0 80 ffffa000053aa2c0 xenstored select
478 1 3 0 80 ffffa000053aab00 tset ttyraw
462 1 3 0 80 ffffa0000530b2a0 ksh pipe_rd
454 1 3 0 80 ffffa0000530b6c0 conserver select
445 1 3 0 80 ffffa0000530bae0 conserver select
307 2 3 0 80 ffffa000050f4280 sshguard nanoslp
307 1 3 0 80 ffffa00004a5a220 sshguard pipe_rd
421 1 3 0 80 ffffa000050c7680 sshd select
415 1 3 0 80 ffffa000050c7260 rwhod select
405 1 3 0 80 ffffa00004e4f660 powerd kqueue
411 1 3 0 80 ffffa00004e4f240 ntpd pause
372 1 3 0 80 ffffa00004a1f200 xenwatchdogd nanoslp
342 1 3 0 80 ffffa00004a5aa60 ipmon select
152 1 3 0 80 ffffa00004a5a640 syslogd kqueue
1 1 3 0 80 ffffa00004881160 init wait
0 53 3 0 200 ffffa00003ed04c0 ipmi0 ipmi0
0 52 3 0 200 ffffa000053aa6e0 xen_balloon xen_balloon
0 51 3 0 200 ffffa00004899180 physiod physiod
0 50 3 0 200 ffffa000049cd1a0 aiodoned aiodoned
0 49 3 0 200 ffffa000049cd5c0 ioflush syncer
0 48 3 0 200 ffffa000049cd9e0 pgdaemon pgdaemon
0 45 3 0 200 ffffa00004870540 usb7 usbevt
0 44 3 0 200 ffffa00004870960 usb6 usbevt
0 43 3 0 200 ffffa0000486f100 usb5 usbevt
0 42 3 0 200 ffffa0000486f520 usb4 usbevt
0 41 3 0 200 ffffa0000486f940 usb3 usbevt
0 40 3 0 200 ffffa000048200e0 usb2 usbevt
0 39 3 0 200 ffffa00004820500 usb1 usbevt
0 38 3 0 200 ffffa000048995a0 usb0 usbevt
0 37 3 0 200 ffffa000048999c0 cryptoret crypto_w
0 36 3 0 200 ffffa00004881580 unpgc unpgc
0 35 3 0 200 ffffa000048819a0 vmem_rehash vmem_rehash
0 34 3 0 200 ffffa00004871140 xenbus rdst
0 33 3 0 200 ffffa00004871560 xenwatch evtsq
0 32 3 0 200 ffffa00004871980 mfi0 mfi0
0 23 3 0 200 ffffa00004820920 usbtask-dr usbtsk
0 22 3 0 200 ffffa000044d50c0 usbtask-hc usbtsk
0 21 3 0 200 ffffa000044d54e0 scsibus0 sccomp
0 20 3 0 200 ffffa000044d5900 bnx1 bnx1
0 19 3 0 200 ffffa00003ed00a0 bnx0 bnx0
0 17 3 0 200 ffffa00003ed08e0 ipmi ipmipoll
0 16 3 0 200 ffffa00003eb0080 sysmon smtaskq
0 15 3 0 200 ffffa00003eb04a0 pmfsuspend pmfsuspend
0 14 3 0 200 ffffa00003eb08c0 pmfevent pmfevent
0 13 3 0 200 ffffa00003a13060 sopendfree sopendfr
0 12 3 0 200 ffffa00003a13480 nfssilly nfssilly
0 11 3 0 200 ffffa00003a138a0 cachegc cachegc
0 10 3 0 200 ffffa00003a11040 vrele vrele
0 9 3 0 200 ffffa00003a11460 vdrain vdrain
0 8 3 0 200 ffffa00003a11880 modunload mod_unld
0 7 3 0 200 ffffa00003a08020 xcall/0 xcall
0 6 1 0 200 ffffa00003a08440 softser/0
0 5 1 0 200 ffffa00003a08860 softclk/0
0 4 1 0 200 ffffa00003a06000 softbio/0
0 3 1 0 200 ffffa00003a06420 softnet/0
0 2 1 0 201 ffffa00003a06840 idle/0
0 1 3 0 200 ffffffff80c69860 swapper uvm

crash> ps | fgrep tstile
2865 1 3 0 0 ffffa00005c638a0 ls tstile
2245 1 3 0 1000000 ffffa000050f4ac0 df tstile
1292 1 3 0 0 ffffa00005bb9420 xl tstile
1785 1 3 0 1000000 ffffa00005b26340 xl tstile
905 1 3 0 0 ffffa00005bb9000 xenstore-read tstile
907 1 3 0 0 ffffa00005bcb860 xenstore-read tstile
1646 1 3 0 0 ffffa00005bcb440 xenstore-read tstile

crash> t/a ffffa00005bcb440
trace: pid 1646 lid 1 at 0xffffa00065156c90
sleepq_block() at sleepq_block+0x8e
turnstile_block() at turnstile_block+0x2c3
rw_vector_enter() at rw_vector_enter+0x11f
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x37
vn_lock() at vn_lock+0x11
vn_write() at vn_write+0x9b
dofilewrite() at dofilewrite+0x97
sys_write() at sys_write+0x5f
syscall() at syscall+0x9a
--- syscall (number 4) ---
7f7ff703c1fa:

crash> t/a ffffa00005bcb860
trace: pid 907 lid 1 at 0xffffa0006514f988
sleepq_block() at sleepq_block+0x8e
turnstile_block() at turnstile_block+0x2c3
rw_vector_enter() at rw_vector_enter+0x11f
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x37
vn_lock() at vn_lock+0x11
lookup_once() at lookup_once+0x1f4
namei_tryemulroot() at namei_tryemulroot+0x612
namei() at namei+0x43
vn_open() at vn_open+0x94
do_open() at do_open+0x111
do_sys_openat() at do_sys_openat+0x68
sys_open() at sys_open+0x24
syscall() at syscall+0x9a
--- syscall (number 5) ---
7f7ff703c43a:

crash> t/a ffffa00005bb9000
trace: pid 905 lid 1 at 0xffffa0006514c988
sleepq_block() at sleepq_block+0x8e
turnstile_block() at turnstile_block+0x2c3
rw_vector_enter() at rw_vector_enter+0x11f
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x37
vn_lock() at vn_lock+0x11
lookup_once() at lookup_once+0x1f4
namei_tryemulroot() at namei_tryemulroot+0x612
namei() at namei+0x43
vn_open() at vn_open+0x94
do_open() at do_open+0x111
do_sys_openat() at do_sys_openat+0x68
sys_open() at sys_open+0x24
syscall() at syscall+0x9a
--- syscall (number 5) ---
7f7ff703c43a:

crash> t/a ffffa00005bb9420
trace: pid 1292 lid 1 at 0xffffa00065149960
sleepq_block() at sleepq_block+0x8e
turnstile_block() at turnstile_block+0x2c3
rw_vector_enter() at rw_vector_enter+0x11f
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x37
vn_lock() at vn_lock+0x11
kernfs_root() at kernfs_root+0x41
lookup_once() at lookup_once+0x26e
namei_tryemulroot() at namei_tryemulroot+0x612
namei() at namei+0x43
vn_open() at vn_open+0x94
do_open() at do_open+0x111
do_sys_openat() at do_sys_openat+0x68
sys_open() at sys_open+0x24
syscall() at syscall+0x9a
--- syscall (number 5) ---
7f7ff603c43a:

crash> t/a ffffa00005c16040
trace: pid 906 lid 1 at 0xffffa00065172d08
sleepq_block() at sleepq_block+0x8e
xenbus_dev_read() at xenbus_dev_read+0x47
kernfs_try_fileop() at kernfs_try_fileop+0x56
VOP_READ() at VOP_READ+0x3c
vn_read() at vn_read+0x94
dofileread() at dofileread+0x90
sys_read() at sys_read+0x5f
syscall() at syscall+0x9a
--- syscall (number 3) ---
7f7ff703c3ba:

It looks like one of these processes is holding a lock that all the others are waiting on.  I'm guessing xenbus_dev_read() should not hold any locks when it goes to sleep waiting for that read to complete.

I thought the problem might have something to do with the last domU disk being on the root filesystem, since the other commands that were hanging seemed to be hanging on the root filesystem.  However, after a reboot and after re-configuring the .iso files to be on a separate filesystem, it still hangs, so I think it instead has to do with something more central in the Xen and VFS code.

# crash
Crash version 7.1, image version 7.1.
Output from a running system is unreliable.
crash> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
1457 > 1 7 0 0 ffffa00005b6b360 crash
1033 1 3 0 80 ffffa00005b6b780 ksh pause
1313 1 3 0 0 ffffa00005afa340 xl tstile
1671 1 3 0 0 ffffa00005bd4860 xenstore-write tstile
1129 1 3 0 80 ffffa00005bea460 xenstore-read xbrd
1122 1 3 0 0 ffffa00005bc9840 xenstore-read tstile
1003 2 3 0 80 ffffa00005955300 xl netio
1003 1 3 0 80 ffffa00005afa760 xl select
642 1 3 0 80 ffffa00005afab80 ksh pause
331 1 3 0 80 ffffa000049eb1c0 ksh pause
1134 1 3 0 80 ffffa000049eba00 sshd select
328 1 3 0 80 ffffa0000597f740 sshd select
767 1 3 0 80 ffffa00004a1f620 ksh pause
590 1 2 0 0 ffffa00004a001e0 sshd
440 1 3 0 80 ffffa00004a00600 sshd select
693 1 3 0 80 ffffa0000597f320 getty ttyraw
475 1 3 0 80 ffffa000050bf260 getty ttyraw
756 1 3 0 80 ffffa00004a00a20 getty ttyraw
691 1 3 0 80 ffffa000049eb5e0 getty ttyraw
662 1 3 0 80 ffffa00004870120 getty ttyraw
594 1 3 0 80 ffffa00005955720 cron nanoslp
804 1 3 0 80 ffffa00005955b40 inetd kqueue
698 1 3 0 80 ffffa00004a1f200 qmgr kqueue
701 1 3 0 80 ffffa000057492e0 pickup kqueue
638 1 3 0 80 ffffa0000597fb60 master kqueue
516 1 3 0 80 ffffa00005749700 unbound kqueue
434 2 3 0 80 ffffa000054bd6c0 xenconsoled netio
434 1 3 0 80 ffffa00004a1fa40 xenconsoled select
479 1 3 0 80 ffffa000054ed6e0 tset ttyraw
380 1 3 0 80 ffffa000054ed2c0 xenstored select
463 1 3 0 80 ffffa000054bd2a0 ksh pipe_rd
474 1 3 0 80 ffffa000050e1280 conserver select
441 1 3 0 80 ffffa000054bdae0 conserver select
438 2 3 0 80 ffffa000054edb00 sshguard nanoslp
438 1 3 0 80 ffffa00004efb660 sshguard pipe_rd
451 1 3 0 80 ffffa000050e1ac0 sshd select
414 1 3 0 80 ffffa000050bf680 rwhod select
403 1 3 0 80 ffffa00004efb240 powerd kqueue
405 1 3 0 80 ffffa000050bfaa0 ntpd pause
365 1 3 0 80 ffffa00004a5a220 xenwatchdogd nanoslp
358 1 3 0 80 ffffa00004a5aa60 ipmon select
152 1 3 0 80 ffffa00004a5a640 syslogd kqueue
1 1 3 0 80 ffffa00004881160 init wait
0 54 5 0 200 ffffa00003ed04c0 (zombie)
0 53 3 0 200 ffffa00005749b20 xen_balloon xen_balloon
0 52 3 0 200 ffffa00004efba80 ipmi0 ipmi0
0 51 3 0 200 ffffa00004899180 physiod physiod
0 50 3 0 200 ffffa000049cd1a0 aiodoned aiodoned
0 49 3 0 200 ffffa000049cd5c0 ioflush syncer
0 48 3 0 200 ffffa000049cd9e0 pgdaemon pgdaemon
0 45 3 0 200 ffffa00004870540 usb7 usbevt
0 44 3 0 200 ffffa00004870960 usb6 usbevt
0 43 3 0 200 ffffa0000486f100 usb5 usbevt
0 42 3 0 200 ffffa0000486f520 usb4 usbevt
0 41 3 0 200 ffffa0000486f940 usb3 usbevt
0 40 3 0 200 ffffa000048200e0 usb2 usbevt
0 39 3 0 200 ffffa00004820500 usb1 usbevt
0 38 3 0 200 ffffa000048995a0 usb0 usbevt
0 37 3 0 200 ffffa000048999c0 cryptoret crypto_w
0 36 3 0 200 ffffa00004881580 unpgc unpgc
0 35 3 0 200 ffffa000048819a0 vmem_rehash vmem_rehash
0 34 3 0 200 ffffa00004871140 xenbus rdst
0 33 3 0 200 ffffa00004871560 xenwatch evtsq
0 32 3 0 200 ffffa00004871980 mfi0 mfi0
0 23 3 0 200 ffffa00004820920 usbtask-dr usbtsk
0 22 3 0 200 ffffa000044d50c0 usbtask-hc usbtsk
0 21 3 0 200 ffffa000044d54e0 scsibus0 sccomp
0 20 3 0 200 ffffa000044d5900 bnx1 bnx1
0 19 3 0 200 ffffa00003ed00a0 bnx0 bnx0
0 17 3 0 200 ffffa00003ed08e0 ipmi ipmipoll
0 16 3 0 200 ffffa00003eb0080 sysmon smtaskq
0 15 3 0 200 ffffa00003eb04a0 pmfsuspend pmfsuspend
0 14 3 0 200 ffffa00003eb08c0 pmfevent pmfevent
0 13 3 0 200 ffffa00003a13060 sopendfree sopendfr
0 12 3 0 200 ffffa00003a13480 nfssilly nfssilly
0 11 3 0 200 ffffa00003a138a0 cachegc cachegc
0 10 3 0 200 ffffa00003a11040 vrele vrele
0 9 3 0 200 ffffa00003a11460 vdrain vdrain
0 8 3 0 200 ffffa00003a11880 modunload mod_unld
0 7 3 0 200 ffffa00003a08020 xcall/0 xcall
0 6 1 0 200 ffffa00003a08440 softser/0
0 5 1 0 200 ffffa00003a08860 softclk/0
0 4 1 0 200 ffffa00003a06000 softbio/0
0 3 1 0 200 ffffa00003a06420 softnet/0
0 2 1 0 201 ffffa00003a06840 idle/0
0 1 3 0 200 ffffffff80c69860 swapper uvm

crash> t/a ffffa00005bea460
trace: pid 1129 lid 1 at 0xffffa000651c9d08
sleepq_block() at sleepq_block+0x8e
xenbus_dev_read() at xenbus_dev_read+0x47
kernfs_try_fileop() at kernfs_try_fileop+0x56
VOP_READ() at VOP_READ+0x3c
vn_read() at vn_read+0x94
dofileread() at dofileread+0x90
sys_read() at sys_read+0x5f
syscall() at syscall+0x9a
--- syscall (number 3) ---
7f7ff703c3ba:

crash> t/a ffffa00005bd4860
trace: pid 1671 lid 1 at 0xffffa000651b5a38
sleepq_block() at sleepq_block+0x8e
turnstile_block() at turnstile_block+0x2c3
rw_vector_enter() at rw_vector_enter+0x11f
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x37
vn_lock() at vn_lock+0x11
lookup_once() at lookup_once+0x1f4
namei_tryemulroot() at namei_tryemulroot+0x612
namei() at namei+0x43
fd_nameiat.isra.0() at fd_nameiat.isra.0+0x54
do_sys_statat() at do_sys_statat+0x72
sys___stat50() at sys___stat50+0x28
syscall() at syscall+0x9a
--- syscall (number 439) ---
7f7ff710870a:

crash> t/a ffffa00005bc9840
trace: pid 1122 lid 1 at 0xffffa000651a0c90
sleepq_block() at sleepq_block+0x8e
turnstile_block() at turnstile_block+0x2c3
rw_vector_enter() at rw_vector_enter+0x11f
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x37
vn_lock() at vn_lock+0x11
vn_write() at vn_write+0x9b
dofilewrite() at dofilewrite+0x97
sys_write() at sys_write+0x5f
syscall() at syscall+0x9a
--- syscall (number 4) ---
7f7ff703c1fa:

I'm guessing the real problem is this xenbus_dev_read() going to sleep with a lock held.  I don't see where that lock is taken, though.  I'm not sure why others haven't encountered this deadlock yet.  I'm also not sure why it didn't happen when these VMs were first configured.
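To make the suspected failure mode concrete, here is a minimal userland sketch of the pattern the traces above suggest.  This is NOT the actual xenbus_dev.c code and all the names in it are made up: a "reader" goes to sleep waiting for data while still holding a lock (standing in for the vnode lock that vn_read() takes before calling VOP_READ()), and the only thread that could ever produce that data has to take the same lock first, so it, and everything queued behind it, piles up just like the tstile waiters above:

/* deadlock.c -- illustration only; compile with: cc -pthread deadlock.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t vnode_lock = PTHREAD_MUTEX_INITIALIZER;  /* "the vnode lock" */
static pthread_mutex_t data_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  data_cv    = PTHREAD_COND_INITIALIZER;
static bool            data_ready = false;

static void *
reader(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&vnode_lock);        /* vn_read() has locked the vnode... */
        pthread_mutex_lock(&data_lock);
        while (!data_ready)                     /* ...and now the read sleeps ("xbrd")... */
                pthread_cond_wait(&data_cv, &data_lock);  /* ...with vnode_lock still held */
        pthread_mutex_unlock(&data_lock);
        pthread_mutex_unlock(&vnode_lock);
        return NULL;
}

static void *
writer(void *arg)
{
        (void)arg;
        sleep(1);                               /* let the reader get there first */
        pthread_mutex_lock(&vnode_lock);        /* the would-be producer needs the same
                                                 * lock: it blocks here forever -- this is
                                                 * the "tstile" pile-up */
        pthread_mutex_lock(&data_lock);
        data_ready = true;
        pthread_cond_signal(&data_cv);
        pthread_mutex_unlock(&data_lock);
        pthread_mutex_unlock(&vnode_lock);
        return NULL;
}

int
main(void)
{
        pthread_t r, w;

        pthread_create(&r, NULL, reader, NULL);
        pthread_create(&w, NULL, writer, NULL);
        pthread_join(r, NULL);                  /* hangs by design, like xl does */
        pthread_join(w, NULL);
        printf("not reached\n");
        return 0;
}

Run it and it hangs, by design.  The obvious fix shape is to never sleep while the outer lock is held (drop it, or arrange the wait so it isn't needed), which is presumably roughly what the newer xenbus_dev.c arranges (see below).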
Timing must be everything, I guess, though it's been 110% reproducible on my setup, and across many reboots.

Entirely removing the .iso file from the device config still caused some kind of problem, but this time it didn't deadlock and require a reboot:

# xl create /usr/pkg/etc/xen/b2.conf
Parsing config from /usr/pkg/etc/xen/b2.conf
libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /usr/pkg/etc/xen/scripts/block /local/domain/0/backend/vbd/1/0 because of timeout
libxl: error: libxl_aoutils.c:539:async_exec_timeout: killing execution of /usr/pkg/etc/xen/scripts/block /local/domain/0/backend/vbd/1/5 because of timeout
libxl: error: libxl_create.c:1255:domcreate_launch_dm: unable to add disk devices
libxl: error: libxl.c:1575:libxl__destroy_domid: non-existant domain 1
libxl: error: libxl.c:1534:domain_destroy_callback: unable to destroy guest with domid 1
libxl: error: libxl.c:1463:domain_destroy_cb: destruction of domain 1 failed

What's odd is that the two other VMs on this machine now start normally.  This problem domU happened to be first in the xendomains list, so it was really causing a major headache, but it's the least important of them, so now I can debug it in relative calm.  What's even more odd is that now that the other VMs are running, this problem one has started A-OK and without a peep.

HOWEVER!  I see in src-current that there have been "recent" changes to xenbus_dev.c, and they get rid of the errant tsleep() call.  I'd say these changes, and any related to them, are critical fixes that should be considered for pull-up to netbsd-7.

Thanks!

-- 
Greg A. Woods <gwoods%acm.org@localhost>

+1 250 762-7675                          RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>    Avoncote Farms <woods%avoncote.ca@localhost>