Port-xen archive


Re: Problems booting netbsd DOM0



At Mon, 10 May 2021 19:42:42 +0200, oskar%fessel.org@localhost wrote:
Subject: Re: Problems booting netbsd DOM0
>
> maybe this is a little bit better:
> 3... 2... 1...
> (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
> (XEN) Freed 612kB init memory
> (XEN) emul-priv-op.c:1173:d0v0 WRMSR 0x00000277 val 0x0007010600070106 unimplemented
> (XEN) d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
> (XEN) domain_crash_sync called from entry.S: fault at ffff82d040318336 x86_64/entry.S#create_bounce_frame+0x15d/0x177
> (XEN) Domain 0 (vcpu#0) crashed on cpu#13:
> (XEN) ----[ Xen-4.15.0nb0 x86_64 debug=y Not tainted ]----
> (XEN) CPU:  13
> (XEN) RIP: e033:[<ffffffff80226e1a>]
> (XEN) RFLAGS: 0000000000000202 EM: 1 CONTEXT: pv guest (d0v0)
> (XEN) rax: 0000000000070106 rbx: 00000000013a1000 rcx: 0000000000000277
> (XEN) rdx: 0000000000070106 rsi: deadbeefdeadf00d rdi: ffffffff80e29cc0
> (XEN) rbp: ffffffff8139ff00 rsp: ffffffff8139fe70 r8:  0000000000000a58
> (XEN) r9:  ffffffff80fc3f80 r10: 0000000000000014 r11: ffffffff80e29d14
> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: ffffffff8139b000
> (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 0000000000042660
> (XEN) cr3: 0000001031366000 cr2: 0000000000000000
> (XEN) fsb: 0000000000000000 gsb: ffffffff80e29cc0 gss: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
> (XEN) Guest stack trace from rsp=ffffffff8139fe70:
> (XEN) 0000000000000277 ffffffff80e29d14 0000000000000000 ffffffff80226e1a
> (XEN) 000000010000e030 0000000000010002 ffffffff8139feb8 000000000000e02b
> (XEN) ffffffff8139ff00 ffffffff80243c3e 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 00000000756e6547 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff8023b0b1
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> (XEN) 0000000000000000 0000000000000000
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.

For history:

I was seeing very much the same thing when trying to boot an older custom
XEN3_DOM0 kernel under any Xen newer than 4.13 (i.e. 4.15 and 4.18).  I
could boot 9.3 and 10.0_RC2 kernels, so I knew the problem had to be
in NetBSD and that there was a fix somewhere!  This is what it looked
like under Xen 4.18:

(XEN) [2023-12-26 18:50:24.294] *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) [2023-12-26 18:50:24.294] Freed 644kB init memory
(XEN) [2023-12-26 18:50:24.296] d0v0 Unhandled: vec 13, #GP[0000]
(XEN) [2023-12-26 18:50:24.296] domain_crash_sync called from entry.S: fault at ffff82d040203b58 x86_64/entry.S#create_bounce_frame+0x14f/0x167
(XEN) [2023-12-26 18:50:24.296] Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) [2023-12-26 18:50:24.296] ----[ Xen-4.18.0_20231116nb0  x86_64  debug=n  Not tainted ]----
(XEN) [2023-12-26 18:50:24.296] CPU:    0
(XEN) [2023-12-26 18:50:24.296] RIP:    e033:[<ffffffff8022722a>]
(XEN) [2023-12-26 18:50:24.296] RFLAGS: 0000000000000202   EM: 1   CONTEXT: pv guest (d0v0)
(XEN) [2023-12-26 18:50:24.296] rax: 0000000000070106   rbx: 00000000019fc000   rcx: 0000000000000277
(XEN) [2023-12-26 18:50:24.296] rdx: 0000000000070106   rsi: ffffffff819faeac   rdi: ffffffff80e7dcc0
(XEN) [2023-12-26 18:50:24.296] rbp: ffffffff819faf00   rsp: ffffffff819fae70   r8:  0000000000000a58
(XEN) [2023-12-26 18:50:24.296] r9:  ffffffff8101bfe0   r10: 0000000000000018   r11: ffffffff80e7dd08
(XEN) [2023-12-26 18:50:24.296] r12: 0000000000000000   r13: 0000000000000000   r14: ffffffff819f6000
(XEN) [2023-12-26 18:50:24.296] r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000026e0
(XEN) [2023-12-26 18:50:24.296] cr3: 00000008699c1000   cr2: 0000000000000000
(XEN) [2023-12-26 18:50:24.296] fsb: 0000000000000000   gsb: ffffffff80e7dcc0   gss: 0000000000000000
(XEN) [2023-12-26 18:50:24.296] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) [2023-12-26 18:50:24.296] Guest stack trace from rsp=ffffffff819fae70:
(XEN) [2023-12-26 18:50:24.296]    0000000000000277 ffffffff80e7dd08 0000000000000000 ffffffff8022722a
(XEN) [2023-12-26 18:50:24.296]    000000010000e030 0000000000010002 ffffffff819faeb8 000000000000e02b
(XEN) [2023-12-26 18:50:24.296]    ffffffff819faf00 ffffffff80243c3e 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 00000000756e6547 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 ffffffff8023b0b1
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296]    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) [2023-12-26 18:50:24.296] Hardware Dom0 crashed: rebooting machine in 5 seconds.


With Xen 4.15 the messages were indeed slightly different, as Oskar
showed:

(XEN) [2023-12-24 20:45:41.007] Xen is relinquishing VGA console.
(XEN) [2023-12-24 20:45:41.009] *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
(XEN) [2023-12-24 20:45:41.010] Freed 640kB init memory
(XEN) [2023-12-24 20:45:41.012] d0v0 Unhandled general protection fault fault/trap [#13, ec=0000]
(XEN) [2023-12-24 20:45:41.012] domain_crash_sync called from entry.S: fault at ffff82d0402fe4a8 x86_64/entry.S#create_bounce_frame+0x14f/0x167
(XEN) [2023-12-24 20:45:41.012] Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) [2023-12-24 20:45:41.012] ----[ Xen-4.15.5nb2  x86_64  debug=n  Not tainted ]----

It would be nice to have a GDB function that would do a stack unwind on
the stack trace shown by Xen.  Perhaps it exists?  How would one feed it
the stack trace, though?  Isn't there a Xen option to dump the dom0?  If
that could work then crash(8) might help....
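
Short of a real unwinder, one rough approach is to pull out of the Xen
dump only those words that fall inside the kernel's text range and then
ask GDB about each one.  The following is a minimal sketch under stated
assumptions, not an existing tool: text_candidates() is a hypothetical
helper, and the [lo, hi) text bounds are assumed to come from the symbol
table of the exact kernel that crashed.  It does not unwind anything; a
word that matches may be stale stack data rather than a return address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan one line of a Xen register/stack dump and copy every 16-digit
 * lowercase hex word whose value lies in [lo, hi) into out[].
 * Returns the number of words stored (at most max).
 * Hypothetical helper: lo and hi are the caller's idea of the kernel
 * text range; nothing here knows about NetBSD's real layout.
 */
size_t
text_candidates(const char *line, uint64_t lo, uint64_t hi,
    uint64_t *out, size_t max)
{
	static const char hexdigits[] = "0123456789abcdef";
	size_t n = 0;

	assert(line != NULL && out != NULL);
	for (const char *p = line; *p != '\0' && n < max; ) {
		size_t len = strspn(p, hexdigits);

		if (len == 0) {		/* not a hex digit, move on */
			p++;
			continue;
		}
		if (len == 16) {	/* looks like a full 64-bit word */
			uint64_t v = strtoull(p, NULL, 16);

			if (v >= lo && v < hi)
				out[n++] = v;
		}
		p += len;		/* skip the whole run of digits */
	}
	return n;
}
```

Each address collected this way could then be given to GDB's
"info symbol" or "info line" commands, with GDB loaded on a matching
kernel image that still has its symbols.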

Anyway....

Here is the fix in NetBSD that avoids the "vec 13"/#13 general
protection fault during early initialisation.  I found it by manually
going through all of the seemingly Xen-related pullups between 9.2 and
9.3:

----------------------------
revision 1.410
date: 2021-04-17 11:03:21 -0700;  author: bouyer;  state: Exp;  lines: +4 -2;  commitid: 2MAaVeu5WEBkOFPC;
Make pat_init() a NOOP on XENPV; it causes a trap with Xen 4.15
----------------------------
Index: pmap.c
===================================================================
RCS file: /cvs/master/m-NetBSD/main/src/sys/arch/x86/x86/pmap.c,v
retrieving revision 1.409
retrieving revision 1.410
diff -u -u -r1.409 -r1.410
--- pmap.c	6 Feb 2021 21:24:19 -0000	1.409
+++ pmap.c	17 Apr 2021 18:03:21 -0000	1.410
@@ -1,4 +1,4 @@
-/*	$NetBSD: pmap.c,v 1.409 2021/02/06 21:24:19 jdolecek Exp $	*/
+/*	$NetBSD: pmap.c,v 1.410 2021/04/17 18:03:21 bouyer Exp $	*/

 /*
  * Copyright (c) 2008, 2010, 2016, 2017, 2019, 2020 The NetBSD Foundation, Inc.
@@ -130,7 +130,7 @@
  */

 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: pmap.c,v 1.409 2021/02/06 21:24:19 jdolecek Exp $");
+__KERNEL_RCSID(0, "$NetBSD: pmap.c,v 1.410 2021/04/17 18:03:21 bouyer Exp $");

 #include "opt_user_ldt.h"
 #include "opt_lockdebug.h"
@@ -915,6 +915,7 @@
 void
 pat_init(struct cpu_info *ci)
 {
+#ifndef XENPV
 	uint64_t pat;

 	if (!(ci->ci_feat_val[0] & CPUID_PAT))
@@ -928,6 +929,7 @@

 	wrmsr(MSR_CR_PAT, pat);
 	cpu_pat_enabled = true;
+#endif
 }

 static pt_entry_t
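
The register dumps above line up with this fix.  WRMSR takes the MSR
number in %ecx and the value in %edx:%eax, and the dumps show rcx = 0x277
(the PAT MSR, MSR_CR_PAT in the diff) with rax and rdx both 0x70106.
That matches the "WRMSR 0x00000277 val 0x0007010600070106" line printed
by the debug=y Xen 4.15.0 build.  As a minimal sketch for decoding such a
value, the C helpers below rebuild it from the two registers and name
each of the eight PAT entries.  The function names are made up for
illustration and come from neither NetBSD nor Xen; the memory-type
encodings are those given in the Intel SDM.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* WRMSR writes EDX:EAX to the MSR selected by ECX. */
uint64_t
wrmsr_value(uint32_t edx, uint32_t eax)
{
	return ((uint64_t)edx << 32) | eax;
}

/* Memory-type code held in PAT entry idx (0..7); one byte per entry,
 * of which only the low three bits are defined. */
unsigned
pat_entry_type(uint64_t pat, unsigned idx)
{
	assert(idx < 8);
	return (unsigned)((pat >> (idx * 8)) & 0x7);
}

/* Human-readable name for a PAT memory-type code. */
const char *
pat_type_name(unsigned type)
{
	switch (type) {
	case 0: return "UC";	/* uncacheable */
	case 1: return "WC";	/* write combining */
	case 4: return "WT";	/* write through */
	case 5: return "WP";	/* write protected */
	case 6: return "WB";	/* write back */
	case 7: return "UC-";	/* uncached, overridable by MTRRs */
	default: return "reserved";
	}
}

/* Print the raw value followed by all eight decoded entries. */
void
print_pat(uint64_t pat)
{
	printf("PAT = 0x%016llx\n", (unsigned long long)pat);
	for (unsigned i = 0; i < 8; i++)
		printf("  PA%u = %s\n", i,
		    pat_type_name(pat_entry_type(pat, i)));
}
```

Calling print_pat(wrmsr_value(0x00070106, 0x00070106)) lists WB, WC,
UC-, UC for entries 0 to 3 and the same again for entries 4 to 7.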


--
					Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>



