NetBSD-Bugs archive

kern/58118: NetBSD/i386 Xen PV guest panic "fpudna from userland"



>Number:         58118
>Category:       kern
>Synopsis:       NetBSD/i386 Xen PV guest panic "fpudna from userland"
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Apr 05 17:30:01 +0000 2024
>Originator:     brad%anduin.eldar.org@localhost
>Release:        NetBSD 10.0
>Organization:
	Eldar.org
>Environment:
System: NetBSD meriadoc.nat.eldar.org 10.0 NetBSD 10.0 (XEN3PAE_DOMU) #0: Sat Mar 30 22:44:50 EDT 2024  brad%samwise.nat.eldar.org@localhost:/lhome/NetBSD_10_branch_20240328/i386/OBJ/sys/arch/i386/compile/XEN3PAE_DOMU i386
Architecture: x86_64
Machine: i386
>Description:

An i386 Xen PV+PVSHIM guest, newly updated to the 10.0 release (self-built
release, using the XEN3_DOMU i386 PAE kernel), is panicking with the
following:

[ 89%] Building CXX object Tests/CMakeLib/CMakeFiles/CMakeLibTests.dir/testRange.cxx.o
[ 89%] Building CXX object Tests/CMakeLib/CMakeFiles/CMakeLibTests.dir/testOptional.cxx.o
[ 89%] Building CXX object Tests/CMakeLib/CMakeFiles/CMakeLibTests.dir/testString.cxx.o
[ 89%] Building CXX object Tests/CMakeLib/CMakeFiles/CMakeLibTests.dir/testStringAlgorithms.cxx.o
[ 89%] Building CXX object Tests/CMakeLib/CMakeFiles/CMakeLibTests.dir/testSystemTools.cxx.o
[ 19585.9365064] panic: fpudna from userland, ip 0xbbe74f, trapframe 0xdefc3fa8
[ 19585.9365064] cpu0: Begin traceback...
[ 19585.9365064] vpanic(c0554d08,defc3f8c,defc3f9c,c01322bb,c0554d08,c0554d2c,bbe74f,defc3fa8,1a61000,bf7fcdcc) at netbsd:vpanic+0x18e
[ 19585.9365064] panic(c0554d08,c0554d2c,bbe74f,defc3fa8,1a61000,bf7fcdcc,c0102f9e,defc3fa8,bb4900b3,bf7f00ab) at netbsd:panic+0x18
[ 19585.9365064] fpudna(defc3fa8,bb4900b3,bf7f00ab,bf7f001f,bf7f001f,b9e08b80,b8574d40,bf7fcdcc,1a61000,0) at netbsd:fpudna+0x3b
[ 19585.9365064] cpu0: End traceback...

[ 19585.9365064] dumping to dev 142,9 offset 8
[ 19585.9365064] dump uvm_fault(0xc07423e0, 0xfe4ef000, 2) -> 0xe
[ 19585.9365064] fatal page fault in supervisor mode
[ 19585.9365064] trap type 6 code 0x2 eip 0xc012be9d cs 0x9 eflags 0x10202 cr2 0xfe4effff ilevel 0x8 esp 0xc0614b00
[ 19585.9365064] curlwp 0xc749a680 pid 12729 lid 12729 lowest kstack 0xdefc22c0
[ 19585.9365064] Skipping crash dump on recursive panic
[ 19585.9365064] panic: trap
[ 19585.9365064] cpu0: Begin traceback...
[ 19585.9365064] vpanic(c0554a1f,defc3dbc,defc3e78,c0130c71,c0554a1f,defc3e84,defc3e84,31b9,defc22c0,10202) at netbsd:vpanic+0x18e
[ 19585.9365064] panic(c0554a1f,defc3e84,defc3e84,31b9,defc22c0,10202,fe4effff,8,c0614b00,fe4effff) at netbsd:panic+0x18
[ 19585.9365064] trap() at netbsd:trap+0xcc1
[ 19585.9365064] --- trap (number 6) ---
[ 19585.9365064] dodumpsys(b9e08b80,104,0,c012e92d,8,0,5,0,0,1) at netbsd:dodumpsys+0x44d
[ 19585.9365064] dumpsys(104,0,c749a680,c0554cb6,5,defc3f70,c040235c,104,0,0) at netbsd:dumpsys+0x14
[ 19585.9365064] kern_reboot(104,0,0,0,c0747d00,c056b196,b8574d40,defc3f80,c0402418,c0554d08) at netbsd:kern_reboot+0x78
[ 19585.9365064] vpanic(c0554d08,defc3f8c,defc3f9c,c01322bb,c0554d08,c0554d2c,bbe74f,defc3fa8,1a61000,bf7fcdcc) at netbsd:vpanic+0x19c
[ 19585.9365064] panic(c0554d08,c0554d2c,bbe74f,defc3fa8,1a61000,bf7fcdcc,c0102f9e,defc3fa8,bb4900b3,bf7f00ab) at netbsd:panic+0x18
[ 19585.9365064] fpudna(defc3fa8,bb4900b3,bf7f00ab,bf7f001f,bf7f001f,b9e08b80,b8574d40,bf7fcdcc,1a61000,0) at netbsd:fpudna+0x3b
[ 19585.9365064] cpu0: End traceback...
[ 19585.9365064] rebooting...

The system is a Xen NetBSD/i386 build guest and was compiling some
pkgsrc 2024Q1 packages, in particular, working on cmake on the way to
doing emacs 29 (I believe).  This was the second time that this panic
was noted.  The first time was early in the morning, perhaps during
the daily cron runs.

The guest has 1 vcpu and is running in PV+PVSHIM mode with 4GB of
memory.  When the guest was a 9.x system, it ran fine.  The system is
running a self-built 10.0 release from 2024-03-28.

>How-To-Repeat:

Not completely sure...  the system had built quite a number of
packages before the panic, so it might be uptime related (e.g. a memory
leak).  cmake does require quite a bit of resources, so it could be
resource related.  But it is probably something else.

If someone finds it useful, I can set the system to enter DDB on panic
and poke around (with a recipe of instructions, optimally).
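
A minimal sketch of that setting, assuming the running kernel was built
with options DDB (not verified here for XEN3PAE_DOMU).  The line below
goes in /etc/sysctl.conf; the same value can be set at runtime with
"sysctl -w ddb.onpanic=1".

```
# /etc/sysctl.conf fragment: enter the in-kernel debugger on panic
# instead of dumping and rebooting (assumes options DDB in the kernel)
ddb.onpanic=1
```

Once at the db prompt, commands such as "bt", "show registers" and "ps"
are the usual starting points.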

I fully expect that it will panic again once I restart the builds.

(BTW - the build was restarted and it made it past the point in cmake
where it panicked, so I think it can be said that building cmake does
not reproduce this on its own)

[One thing that can't be used as a workaround with i386 guests is
running them in pure PVH mode with the GENERIC kernel.  The guest
will boot but in my experience will hang if there is significant disk
activity (untarring a set or two will usually trip it for me).  I
opened a PR about that topic some time ago.  I may try this again
anyway, if I get another panic soon.]

The DOM0 that this guest is running on runs Xen 4.15 on NetBSD
9.3_STABLE/amd64.

>Fix:

Don't know, but I hope a fix will come along as I have several Xen
i386 guests that are important to me and it would be great if they
could run 10.x.


