NetBSD-Bugs archive

port-xen/57560: xennetback_xenbus.c rev 1.109 broke guests, causes DOM0 reboot



>Number:         57560
>Category:       port-xen
>Synopsis:       xennetback_xenbus.c rev 1.109 broke guests, causes DOM0 reboot
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    port-xen-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 04 01:55:00 +0000 2023
>Originator:     brad%anduin.eldar.org@localhost
>Release:        NetBSD 10.0_BETA
>Organization:
	eldar.org
>Environment:
System: NetBSD 10.0_BETA or -current as of 2023-08-03, XEN3_DOM0 kernel
Architecture: x86_64
Machine: amd64
>Description:

The setup is a DOM0 running Xen 4.15.3 built from pkgsrc 2022Q3, on a
32GB box with an Intel Xeon E-2224 CPU.

There have been a small number of non-whitespace updates to the Xen
network backend in -current:

http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/xen/xen/xennetback_xenbus.c?rev=1.109&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/xen/xen/xennetback_xenbus.c?rev=1.110&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/xen/xen/xennetback_xenbus.c?rev=1.111&content-type=text/x-cvsweb-markup&only_with_tag=MAIN

These all appear in NetBSD 10.0_BETA as:

http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/xen/xen/xennetback_xenbus.c?rev=1.108.4.1&content-type=text/x-cvsweb-markup&only_with_tag=netbsd-10

Starting at 1.109, when ANY Xen guest starts to use its network
interfaces, the DOM0 described above locks up briefly and then reboots.
It does not enter DDB and prints no panic message.  On a NetBSD guest
this happens when /etc/rc.d/network runs, and on an Arch Linux guest
when systemd-networkd (or equivalent) runs.  The DOMU guest kernel
probes the network device just fine.

I cannot attach a serial console to the DOM0, so it is possible that
the hypervisor is panicking and rebooting with some sort of message
output, rather than rebooting spontaneously.  Reverting to 1.108 of
xennetback_xenbus.c allows the guests to start up and run as before.
I did not try 1.110 and 1.111 independently of 1.109.

If others hit this problem, it may prevent them from updating the
kernel on their DOM0 system, even if a newer Xen works with revision
1.109 or later.

>How-To-Repeat:

Run a recent -current (I used 10.99.6) or NetBSD 10.0_BETA as a DOM0
on Xen 4.15.3 from pkgsrc 2022Q3, then start any guest that brings up
its network interface.

>Fix:

I worked around the problem by reverting xennetback_xenbus.c to
revision 1.108, but judging from the commit messages, some of the
later changes are probably desirable.


