NetBSD-Bugs archive


port-xen/56063: Xen boot fails with "heap full"



>Number:         56063
>Category:       port-xen
>Synopsis:       Xen boot fails with "heap full"
>Confidential:   no
>Severity:       critical
>Priority:       low
>Responsible:    port-xen-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 18 14:20:00 +0000 2021
>Originator:     Andreas Gustafsson
>Release:        NetBSD 8.2, 9.0, 9.1
>Organization:
  
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:

When I try to boot any NetBSD release newer than 8.1 as a Xen 4.11
dom0 on an HP DL360 G7 server, the boot fails with a "heap full"
message.  For example, with 9.1:

  >> NetBSD/x86 BIOS Boot, Revision 5.11 (Sun Oct 18 19:24:30 UTC 2020) (from NetBSD 9.1)
  >> Memory: 637/3668992 k

       1. Xen
       2. Boot normally
       3. Boot single user
       4. Drop to boot prompt

  Choose an option; RETURN for default; SPACE to stop countdown.
  Option 1 will be chosen in 0 seconds. 4 seconds. 3 seconds. 2 seconds. 1 seconds. 0 seconds. 0 seconds.
  |/-\|/-\|/-\|/-\2666632|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|+1339256/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|=0x3d20ec
  /-\|/-\|/-\|/-\|Loading /netbsd-XEN3_DOM0.gz /-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/-\|/heap full (0x6a800+32768)

The boot blocks are the freshly installed ones from the OS version in question.
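
For reference, the numbers in the "heap full (0x6a800+32768)" message can be
worked out as follows.  This is only a sketch: it assumes the first number
is the address at which the allocation would start and the second is the
requested size in bytes, and that the "637" in the "Memory:" line is the
base memory size in KiB.  The helper name parse_heap_full is made up for
this illustration.

```python
import re

def parse_heap_full(message):
    """Extract (start, size) from a 'heap full (0xADDR+SIZE)' message.

    Assumption: ADDR is the start address of the failed allocation
    and SIZE is the requested size in bytes.
    """
    m = re.search(r"heap full \((0x[0-9a-fA-F]+)\+(\d+)\)", message)
    if m is None:
        raise ValueError("no 'heap full' message found")
    return int(m.group(1), 16), int(m.group(2))

top, size = parse_heap_full("heap full (0x6a800+32768)")
end = top + size                 # where the request would have ended
base_memory_end = 637 * 1024     # from "Memory: 637/3668992 k", assumed KiB

print(f"request: {size} bytes at {top:#x}, ending at {end:#x}")
print(f"end of base memory: {base_memory_end:#x}")
```

Under this reading, the failing 32 KiB request would have ended at 0x72800,
which is still below the end of base memory at 0x9f400.  Where the boot
loader's heap limit actually lies is not something I have verified.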

I bisected this on the -8 branch, and was able to narrow down the time
when the bug appeared on the branch to the interval that started with
the pullup

  2019.09.18.16.30.33 martin src/sys/arch/x86/acpi/acpi_machdep.c 1.18.6.1

and ended with

  2019.10.04.11.34.18 martin src/sys/arch/i386/stand/pxeboot/start_pxe.S 1.6.48.1

I am unable to narrow it down further because the pullup at
2019.09.18.16.30.33 broke the build, and when the build was fixed,
PXE booting (which my automated test relies on) was broken until fixed
at 2019.10.04.11.34.18.

I understand that others have been able to boot the versions that are
not working for me, so this is probably hardware or firmware dependent
in some way.  For example, given the nature of the first pullup, it
could have something to do with the contents of the ACPI tables.

I am marking this critical because the system doesn't even boot, yet
low priority because I just happened to run into it in the course of
unrelated testing and don't have any actual plans to run Xen on the
machine in question for any purpose other than to demonstrate that it
doesn't work.  If the bug impacts you, feel free to update the
priority accordingly.

>How-To-Repeat:

>Fix:


