NetBSD-Bugs archive


port-xen/53965: XEN DomU fails to poweroff in new HEAD kernels



>Number:         53965
>Category:       port-xen
>Synopsis:       XEN DomU fails to poweroff in new HEAD kernels
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-xen-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Feb 09 14:20:00 +0000 2019
>Originator:     Robert Elz
>Release:        NetBSD 8.99.30
>Organization:
>Environment:
System: NetBSD jinx.noi.kre.to 8.99.30 NetBSD 8.99.30 (1.1-20190114) #9: Mon Jan 14 13:29:08 ICT 2019 kre%onyx.coe.psu.ac.th@localhost:/usr/obj/testing/kernels/amd64/JINX amd64
Architecture: x86_64
Machine: amd64
	however the architecture and machine are the right type,
	assuming you stick a XEN Dom0 on there, and then run a DomU
	(PV mode) on top of that.

>Description:
	Sometime in the recent past, my XEN DomU kernels lost the
	ability to power off the (virtual) machine (and cause the
	XEN hypervisor to let go.)

	It was working with 8.99.32 (of some vintage, the last kernel
	update I did was probably a couple of weeks ago) and failed
	with the first update to an 8.99.34 kernel I attempted (though
	the first time I just assumed I left the "-p" off the shutdown
	command by accident, and didn't think any more of it.)

	Note I am running a very old (by current standards) Xen kernel
	and Dom0 system.  However they are (and have been) working well,
	and, apart from my use of it for testing up to date current
	kernels (and particularly shells), this is a production system,
	so I'm in no great hurry to update it.

	pkg_info on the Dom0 tells me:

	onyx$ pkg_info | grep xen
	xentools42-4.2.5nb15 Userland Tools for Xen 4.2.x
	xenkernel42-4.2.2   Xen 4.2.x Kernel

	(with some other noise that just happened to match.)

	The kernel I am running is not GENERIC, in fact it contains
	almost nothing that is not absolutely required for a Xen DomU
	(so very few drivers, file systems etc).   For the test shown
	below I simply made a new (up to the minute) kernel and userland,
	and booted that, then more or less immediately (after I noticed
	the build had finished and it was running) did the commands
	below.   Nothing has changed in the kernel config in a long time.

	The date/time in the uname output (kernel build time) is
	UTC+0700, the running kernel does not have any TZ configured
	and the dates shown there are (as indicated) simply UTC.

	13:51:52 UTC is about 10 mins later than 20:41:06 UTC+0700.
	There was a cvs update done immediately before the (-u) build
	(which did not take very long.)
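
	For anyone wanting to check the time arithmetic above, here is a
	minimal sketch.  It only converts the kernel build timestamp from
	the uname output (UTC+0700) to UTC and subtracts it from the
	shutdown timestamp in the log below; the variable names are made
	up for illustration.

```python
from datetime import datetime, timedelta, timezone

# ICT is UTC+0700, as stated in the report.
ICT = timezone(timedelta(hours=7), "ICT")

# Kernel build time from the uname output (local time, ICT).
kernel_built = datetime(2019, 2, 9, 20, 41, 6, tzinfo=ICT)

# Shutdown time from the DomU's syslog line (UTC, no TZ configured).
shutdown_logged = datetime(2019, 2, 9, 13, 51, 52, tzinfo=timezone.utc)

# Normalise the build time to UTC, then take the difference.
kernel_built_utc = kernel_built.astimezone(timezone.utc)
gap = shutdown_logged - kernel_built_utc

print("kernel built (UTC):", kernel_built_utc.isoformat())
print("build to shutdown: ", gap)
```

	The gap is a little under 11 minutes, which matches the "about
	10 mins" above.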

	===> build.sh started:    Sat Feb  9 20:07:25 ICT 2019
	===================  DONE: Sat Feb 9 20:45:32 ICT 2019


>How-To-Repeat:

netbsd# uname -a
NetBSD netbsd.noi.kre.to 8.99.34 NetBSD 8.99.34 (MUNNARI-DomU) #408: Sat Feb  9 20:41:06 ICT 2019  kre%onyx.coe.psu.ac.th@localhost:/usr/obj/testing/kernels/amd64/MUNNARI-DomU amd64
netbsd# shutdown -p now
Shutdown NOW!
shutdown: [pid 421]
netbsd# wall: You have write permission turned off; no reply possible
                                                                               
*** FINAL System shutdown message from root%netbsd.noi.kre.to@localhost ***            
System going down IMMEDIATELY                                                  
                                                                               
                                                                               
Feb  9 13:51:52 netbsd shutdown: poweroff by root: 

System shutdown time has arrived

About to run shutdown hooks...
Stopping cron.
Stopping inetd.
Saved entropy to /var/db/entropy-file.
Sat Feb  9 13:51:54 UTC 2019

Done running shutdown hooks.
Feb  9 13:51:59 netbsd syslogd[178]: Exiting on signal 15
[ 386.2701065] syncing disks... done

[ 386.2900919] The operating system has halted.
[ 386.2900919] Please press any key to reboot.

after which it is possible to simply exit the console and
"xl destroy ..." without problems.

>Fix:
	??

	If no-one can easily spot which change might have caused
	this, I can bisect and test with just the cost of the time
	to do the cvs updates and builds.
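
	A rough sketch of how such a bisection by date could be driven.
	Everything here is an assumption made for illustration: the
	function name, the one-day granularity, the known-good date, and
	the stand-in test.  In practice the test would be "cvs update -D
	<date>, build, boot the DomU, run shutdown -p now, and see whether
	the domain goes away".

```python
from datetime import date, timedelta
from typing import Callable


def first_bad_date(good: date, bad: date,
                   is_bad: Callable[[date], bool]) -> date:
    """Return the earliest date whose checkout shows the failure.

    Assumes `good` is known to work, `bad` is known to fail, and that
    once the failure appears it stays in every later checkout.
    """
    while (bad - good).days > 1:
        # Pick the date halfway between the known-good and known-bad ends.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid      # failure already present: look earlier
        else:
            good = mid     # still working: look later
    return bad


# Hypothetical dates, purely for demonstration: the report only says
# the last working kernel was built "probably a couple of weeks ago".
known_good = date(2019, 1, 26)
known_bad = date(2019, 2, 9)

# Stand-in for the real build-and-poweroff test; pretends the
# breaking change was committed on 2019-02-03.
pretend_break = date(2019, 2, 3)
tested = []


def fake_test(d: date) -> bool:
    tested.append(d)
    return d >= pretend_break


result = first_bad_date(known_good, known_bad, fake_test)
print("first failing date:", result, "after", len(tested), "builds")
```

	With a two week window this needs only three or four builds,
	after which the commits on the identified day can be checked
	individually.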

>Unformatted:
 	All this is from the system where I am doing the send-pr
 	and is unrelated to the system with the problem...

