NetBSD-Users archive


Re: Xen 4.18.5_20250521nb0 not ELF binary (Was: Re: EFI and Xen)



On 5/26/2025 2:06 AM, Manuel Bouyer wrote:
> On Sun, May 25, 2025 at 11:40:12PM -0400, Chuck Zmudzinski wrote:
>> On 5/25/2025 11:25 AM, Manuel Bouyer wrote:
>> > On Sun, May 25, 2025 at 08:54:37AM -0400, Chuck Zmudzinski wrote:
>> >> No difference with the larger conring_size, unfortunately.
>> > 
>> > Can you try a -current kernel, or a very recent kernel from the netbsd-10
>> > branch? There have been fixes in the last few days that could help with
>> > very recent CPUs.
>> > 
>> 
>> No change with the latest -current or -10 from daily builds. Still need to disable
>> smt in Xen to enumerate all the vcpus.
> 
> Well, I'm out of ideas then, sorry. The next step would be to use the Xen
> debug tools to see where the hang occurs.
> 

Using a Fedora build of Xen 4.18.4 instead of pkgsrc Xen gets further
into the boot of dom0. It still crashes, but with the Fedora version
of Xen I am getting some crash traces from dom0 that might help.

Also, with the Fedora version of Xen, it is not necessary to set cet=no-ibt
to prevent Xen from crashing, and with the Fedora version of Xen, Xen does
not spam the console ring buffer with many messages like:

(XEN) altcall iommu_enable_x2apic+0x1e/0x70 dest intel_iommu_enable_eim has no endbr64

So at the least I think we can try to build Xen in pkgsrc so that it behaves
more like the Fedora version of Xen on this newer Intel CPU. At the moment,
the Fedora version works better for debugging NetBSD dom0 than the pkgsrc
version of Xen.

When using the serial console for Xen, the dom0 crash comes from the PCI serial
card used by Xen. I had set console=com2 for dom0, and maybe it will get past
that crash if I set console=xencons for dom0 instead, but I ran out of time
before I could try it today.
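
As a side note, the com2= value from the Xen command line (quoted in the log
further down) can be split into its fields to see which IRQ Xen itself is
using for the card. This is only a minimal sketch: the field order (baud,
data/parity/stop bits, I/O base, IRQ, port BDF) is an assumption based on a
reading of the Xen command line documentation, and parse_xen_com is a
hypothetical helper name, not part of any Xen or NetBSD tool.

```python
# Sketch: split a Xen "comN=" value into named fields.
# Assumed field order (not verified against this Xen build):
# baud, data/parity/stop, I/O base, IRQ, PCI bus:dev.func of the port.
FIELDS = ("baud", "dps", "io_base", "irq", "port_bdf")


def parse_xen_com(value):
    """Return a dict of the comma-separated fields in a Xen comN= value."""
    parts = value.split(",")
    # zip() stops at the shorter sequence, so missing trailing fields
    # are simply absent from the result.
    return dict(zip(FIELDS, parts))


# Value copied from the Xen command line in the log.
opts = parse_xen_com("9600,8n1,0x40c0,16,1:0.0")
print(opts["irq"])       # IRQ that Xen uses for its own console
print(opts["port_bdf"])  # PCI location of the serial card
```

With the value from this boot, the IRQ field is 16, the same physical IRQ
that dom0 fails to bind in the panic below. That is at least consistent with
dom0 trying to attach the port that Xen is already using.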

I also tried booting with the VGA console for both the Fedora version of Xen
and dom0. It crashed as well, and I saw a crash trace from dom0 on the VGA
console. I did not have time to investigate it further, but the crash appeared
to happen when dom0 was trying to attach the NVMe PCI device.

Starting tomorrow, I will be able to spend more time on this problem.

I was able to capture the messages when dom0 was using the serial console.
dom0 crashed when trying to attach the serial console that the Fedora
version of Xen was using. Here are the details:

 Xen 4.18.4
(XEN) Xen version 4.18.4 (root@) (gcc (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3)) debug=n Sun May 25 15:14:09 EDT 2025
(XEN) Latest ChangeSet: 
(XEN) build-id: d28f5ab2a49a656ab8a071e53296793651dc0783
(XEN) Bootloader: NetBSD/x86 EFI Boot (x64), Revision 1.2 (Mon Dec 16 13:08:11 UTC 2024) (from NetBSD 10.1)
(XEN) Command line: dom0_mem=2G com2=9600,8n1,0x40c0,16,1:0.0 console=com2 conring_size=256k pv-l1tf=false

snip ...

[   1.0000000] vcpu13 at hypervisor0

[   1.0000000] vcpu13: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu13: node 0, package 0, core 0, smt 0

[   1.0000000] vcpu14 at hypervisor0

[   1.0000000] vcpu14: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu14: node 0, package 0, core 0, smt 0

[   1.0000000] vcpu15 at hypervisor0

[   1.0000000] vcpu15: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu15: node 0, package 0, core 0, smt 0

[   1.0000000] vcpu16 at hypervisor0

[   1.0000000] vcpu16: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu16: node 0, package 0, core 0, smt 0

[   1.0000000] vcpu17 at hypervisor0

[   1.0000000] vcpu17: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu17: node 0, package 0, core 0, smt 0

[   1.0000000] vcpu18 at hypervisor0

[   1.0000000] vcpu18: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu18: node 0, package 0, core 0, smt 0

[   1.0000000] vcpu19 at hypervisor0

[   1.0000000] vcpu19: Intel(R) Core(TM) i5-14500, id 0xb06f2

[   1.0000000] vcpu19: node 0, package 0, core 0, smt 0

[   1.0000000] xenbus0 at hypervisor0: Xen Virtual Bus Interface

[   1.0000000] xencons0 at hypervisor0: Xen Virtual Console Driver

[   1.0000000] acpi0 at hypervisor0: Intel ACPICA 20221020

[   1.0000000] ACPI: Dynamic OEM Table Load:

[   1.0000000] ACPI: SSDT 0xFFFFD480049D2808 000394 (v02 PmRef  Cpu0Cst  00003001 INTL 20200717)

[   1.0000000] ACPI: Dynamic OEM Table Load:

[   1.0000000] ACPI: SSDT 0xFFFFD480049D6008 000581 (v02 PmRef  Cpu0Ist  00003000 INTL 20200717)

[   1.0000000] ACPI: Dynamic OEM Table Load:

[   1.0000000] ACPI: SSDT 0xFFFFD48003FC8E48 0001AB (v02 PmRef  Cpu0Psd  00003000 INTL 20200717)

[   1.0000000] acpi0: fixed power button present

[   1.0000030] hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)

[   1.0000030] acpiec0 at acpi0 (H_EC, PNP0C09-1): not present

[   1.0000030] attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0

[   1.0000030] acpivga0 at acpi0 (GFX0): ACPI Display Adapter

[   1.0000030] acpiout0 at acpivga0 (DD01, 0x0001): ACPI Display Output Device

[   1.0000030] acpiout1 at acpivga0 (DD02, 0x0002): ACPI Display Output Device

[   1.0000030] acpiout2 at acpivga0 (DD03, 0x0003): ACPI Display Output Device

[   1.0000030] acpiout3 at acpivga0 (DD04, 0x0004): ACPI Display Output Device

[   1.0000030] acpiout4 at acpivga0 (DD05, 0x0005): ACPI Display Output Device

[   1.0000030] acpiout5 at acpivga0 (DD06, 0x0006): ACPI Display Output Device

[   1.0000030] acpiout6 at acpivga0 (DD07, 0x0007): ACPI Display Output Device

[   1.0000030] acpiout7 at acpivga0 (DD08, 0x0008): ACPI Display Output Device

[   1.0000030] acpiout8 at acpivga0 (DD09, 0x0009): ACPI Display Output Device

[   1.0000030] acpiout9 at acpivga0 (DD0A, 0x000a): ACPI Display Output Device

[   1.0000030] acpiout10 at acpivga0 (DD0B, 0x000b): ACPI Display Output Device

[   1.0000030] acpiout11 at acpivga0 (DD0C, 0x000c): ACPI Display Output Device

[   1.0000030] acpiout12 at acpivga0 (DD0D, 0x000d): ACPI Display Output Device

[   1.0000030] acpiout13 at acpivga0 (DD0E, 0x000e): ACPI Display Output Device

[   1.0000030] acpiout14 at acpivga0 (DD0F, 0x000f): ACPI Display Output Device

[   1.0000030] acpiout15 at acpivga0 (DD1F, 0x001f): ACPI Display Output Device

[   1.0000030] acpiout16 at acpivga0 (DD2F, 0x001f): ACPI Display Output Device

[   1.0000030] LNK0 (SONY362A) at acpi0 not configured

[   1.0000030] AWAC (ACPI000E) at acpi0 not configured

[   1.0000030] acpibut0 at acpi0 (SLPB, PNP0C0E): ACPI Sleep Button

[   1.0000030] acpiwmi0 at acpi0 (WFDE, PNP0C14-DSarDev): ACPI WMI Interface

[   1.0000030] acpiwmibus at acpiwmi0 not configured

[   1.0000030] acpiwmi1 at acpi0 (WFTE, PNP0C14-TestDev): ACPI WMI Interface

[   1.0000030] acpiwmibus at acpiwmi1 not configured

[   1.0000030] PEPD (INT33A1) at acpi0 not configured

[   1.0000030] acpibut1 at acpi0 (PWRB, PNP0C0C): ACPI Power Button

[   1.0000030] TPM (MSFT0101) at acpi0 not configured

[   1.0000030] ACPI: Enabled 4 GPEs in block 00 to 7F

[   1.0000030] pci0 at hypervisor0 bus 0: configuration mode 1

[   1.0000030] pchb0 at pci0 dev 0 function 0: Intel Raptor Lake (S,6+8) Host (rev. 0x02)

[   1.0000030] ppb0 at pci0 dev 1 function 0: Intel Alder Lake PCIe G5 Root Port 0 (x16) (rev. 0x02)

[   1.0000030] ppb0: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x16 @ 32.0GT/s

[   1.0000030] ppb0: link is x1 @ 2.5GT/s

[   1.0000030] pci1 at ppb0 bus 1

[   1.0000030] puc0 at pci1 dev 0 function 0: Nanjing QinHeng Electronics CH382 (com, com)

[   1.0000030] com2 at puc0 port 0 (16850-compatible): panic: Failed to bind physical IRQ 16



[   1.0000030] cpu0: Begin traceback...

[   1.0000030] vpanic() at netbsd:vpanic+0x177

[   1.0000030] panic() at netbsd:panic+0x3c

[   1.0000030] bind_pirq_to_evtch() at netbsd:bind_pirq_to_evtch+0xa8

[   1.0000030] intr_establish_xname() at netbsd:intr_establish_xname+0xe2

[   1.0000030] pci_intr_establish_xname_internal() at netbsd:pci_intr_establish_xname_internal+0xde

[   1.0000030] com_puc_attach() at netbsd:com_puc_attach+0x192

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_found() at netbsd:config_found+0x110

[   1.0000030] puc_attach() at netbsd:puc_attach+0x3a1

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_found() at netbsd:config_found+0x110

[   1.0000030] pci_probe_device1() at netbsd:pci_probe_device1+0x66c

[   1.0000030] pci_enumerate_bus1() at netbsd:pci_enumerate_bus1+0x1c8

[   1.0000030] pciattach() at netbsd:pciattach+0x18e

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_found() at netbsd:config_found+0x110

[   1.0000030] ppbattach() at netbsd:ppbattach+0x21f

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_found() at netbsd:config_found+0x110

[   1.0000030] pci_probe_device1() at netbsd:pci_probe_device1+0x66c

[   1.0000030] pci_enumerate_bus1() at netbsd:pci_enumerate_bus1+0x1c8

[   1.0000030] pciattach() at netbsd:pciattach+0x18e

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_found() at netbsd:config_found+0x110

[   1.0000030] mp_pci_scan() at netbsd:mp_pci_scan+0xd6

[   1.0000030] hypervisor_attach() at netbsd:hypervisor_attach+0x589

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_found() at netbsd:config_found+0x110

[   1.0000030] xen_mainbus_attach() at netbsd:xen_mainbus_attach+0x9c

[   1.0000030] config_attach_internal() at netbsd:config_attach_internal+0x20c

[   1.0000030] config_rootfound() at netbsd:config_rootfound+0x6e

[   1.0000030] cpu_configure() at netbsd:cpu_configure+0x25

[   1.0000030] main() at netbsd:main+0x2f1

[   1.0000030] cpu0: End traceback...

[   1.0000030] fatal breakpoint trap in supervisor mode

[   1.0000030] trap type 1 code 0 rip 0xffffffff802378c5 cs 0xe030 rflags 0x202 cr2 0 ilevel 0x8 rsp 0xffffffff817efd20

[   1.0000030] curlwp 0xffffffff81012740 pid 0.0 lowest kstack 0xffffffff817ec2c0

Stopped in pid 0.0 (system) at  netbsd:breakpoint+0x5:  leave

breakpoint() at netbsd:breakpoint+0x5

vpanic() at netbsd:vpanic+0x177

panic() at netbsd:panic+0x3c

bind_pirq_to_evtch() at netbsd:bind_pirq_to_evtch+0xa8

intr_establish_xname() at netbsd:intr_establish_xname+0xe2

pci_intr_establish_xname_internal() at netbsd:pci_intr_establish_xname_internal+0xde

com_puc_attach() at netbsd:com_puc_attach+0x192

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_found() at netbsd:config_found+0x110

puc_attach() at netbsd:puc_attach+0x3a1

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_found() at netbsd:config_found+0x110

pci_probe_device1() at netbsd:pci_probe_device1+0x66c

pci_enumerate_bus1() at netbsd:pci_enumerate_bus1+0x1c8

pciattach() at netbsd:pciattach+0x18e

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_found() at netbsd:config_found+0x110

ppbattach() at netbsd:ppbattach+0x21f

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_found() at netbsd:config_found+0x110

pci_probe_device1() at netbsd:pci_probe_device1+0x66c

pci_enumerate_bus1() at netbsd:pci_enumerate_bus1+0x1c8

pciattach() at netbsd:pciattach+0x18e

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_found() at netbsd:config_found+0x110

mp_pci_scan() at netbsd:mp_pci_scan+0xd6

hypervisor_attach() at netbsd:hypervisor_attach+0x589

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_found() at netbsd:config_found+0x110

xen_mainbus_attach() at netbsd:xen_mainbus_attach+0x9c

config_attach_internal() at netbsd:config_attach_internal+0x20c

config_rootfound() at netbsd:config_rootfound+0x6e

cpu_configure() at netbsd:cpu_configure+0x25

main() at netbsd:main+0x2f1

ds          ffff

es          0

fs          180

gs          fcd0

rdi         8

rsi         1

rbp         ffffffff817efd20

rbx         ffffd48004ccc984

rdx         1

rcx         1

rax         1

r8          7

r9          75

r10         0

r11         fffffffe

r12         ffffffff80dc8e68    ostype+0x1401

r13         ffffffff817efd68

r14         104

r15         10

rip         ffffffff802378c5    breakpoint+0x5

cs          e030

rflags      202

rsp         ffffffff817efd20

ss          e02b

netbsd:breakpoint+0x5:  leave

db{0}>
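
To make the call path in the traceback above easier to read, the frames can
be pulled out of a captured log and printed outermost caller first. This is a
minimal sketch, assuming the log was saved as plain text. The call_chain
helper and the regular expression are made up for illustration, and the
sample below is an abbreviated copy of the frames above, not the full trace.

```python
import re

# Matches ddb frames such as "panic() at netbsd:panic+0x3c".
FRAME_RE = re.compile(r"(\w+)\(\) at netbsd:\w+\+")


def call_chain(log_text):
    """Return function names from the first traceback, outermost caller first."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(m.group(1))
            if m.group(1) == "main":
                break  # main() is the last frame of the first traceback
    frames.reverse()
    return frames


# Abbreviated sample: several intermediate frames are left out.
sample = """\
[   1.0000030] vpanic() at netbsd:vpanic+0x177
[   1.0000030] panic() at netbsd:panic+0x3c
[   1.0000030] bind_pirq_to_evtch() at netbsd:bind_pirq_to_evtch+0xa8
[   1.0000030] intr_establish_xname() at netbsd:intr_establish_xname+0xe2
[   1.0000030] com_puc_attach() at netbsd:com_puc_attach+0x192
[   1.0000030] main() at netbsd:main+0x2f1
"""
print(" -> ".join(call_chain(sample)))
```

Read this way, the chain ends in com_puc_attach, intr_establish_xname and
bind_pirq_to_evtch, so the panic happens while the com port on the puc card
is setting up its interrupt.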

