NetBSD-Bugs archive


kern/60312: Frequent kernel panics with no apparent cause



>Number:         60312
>Category:       kern
>Synopsis:       Frequent kernel panics with no apparent cause
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jun 06 21:10:00 +0000 2026
>Originator:     Eirik Øverby
>Release:        11-prerelease (beta, rc)
>Organization:
The Floppy Museum
>Environment:
NetBSD ppro2 11.0_RC2 NetBSD 11.0_RC2 (PPRO_OverDrive) #7: Mon May 25 17:10:12 CEST 2026  ltning@motherfucker:/usr/home/ltning/github/NetBSD_clean/obj_i386/sys/arch/i386/compile/PPRO_OverDrive i386
>Description:
Note: This has been observed across several builds of 11, most frequently and most recently RC2 and RC4.

The machine is a dual Pentium II OverDrive (Pentium Pro socket, Deschutes core with MMX), Intel HX (Triton II) chipset. Before raising this PR, I ran a 4-day memtest86 burn-in to rule out faulty RAM, and increased CPU and chipset cooling to rule out heat problems. On 10.0 and 10.1 the system was stable for well over a year. As soon as it was upgraded to 11.0_RC2, these kernel panics started happening frequently.

The system runs X with the DRM driver and windowmaker. pcictl list output below.

## The crashes
Crashes can happen at any time, even under low load. Anecdotal evidence suggests that high load, such as significant disk I/O or network traffic, increases the risk. Examples of situations that are often, but not always, interrupted by a panic and reboot include:
 - sysupgrade (during download or install phases)
 - publishing fediverse posts on the local snac+nginx
 - scp/rsync-ing kernel dumps to a different system
 - invoking an editor on a file (no other load at the same time)

Local interactive use is not a trigger. It has not been determined whether X must be running for the panics to occur.
Swapping is not required for the issue to occur.

I have 6 kernel dumps available; see https://anduin.net/~ltning/netbsd/crash/ .
File listing with descriptions:

netbsd.2
netbsd.2.core
One of the first crashes, in sched_enqueue. 11.0_RC2.

netbsd.3
netbsd.3.core
Corrupt dump; from 11.0_RC4.

netbsd.4
netbsd.4.core
Crash in pmap_sync_pv.isra.0. 11.0_RC4.
This is the only crash that has repeated (see also netbsd.7).

netbsd.5
netbsd.5.core
Crash in vflushbuf. 11.0_RC4.

netbsd.6
netbsd.6.core
Another corrupted dump. 11.0_RC2.
System panicked: pmap_zap_ptp: PTE_PVLIST with pv-untracked page va = 0xbb528000pa = 0x4e87d000(0x4e87d)

netbsd.7
netbsd.7.cfg
netbsd.7.core
netbsd.7.fullkernel
Crash in pmap_sync_pv.isra.0. 11.0_RC2.
Included full kernel (with embedded config; .fullkernel file).
This is the currently running kernel.
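
For anyone reading the panic string recorded under netbsd.6 above, here is a minimal sketch that splits it into its numeric fields. It assumes 4 KiB pages (a shift of 12) and assumes the parenthesized value is the physical page frame number; both are my reading of the message, not something confirmed against the kernel source.

```python
import re

PAGE_SHIFT = 12  # assumption: 4 KiB pages on i386

# Panic string copied verbatim from the netbsd.6 entry above.
msg = ("pmap_zap_ptp: PTE_PVLIST with pv-untracked page "
       "va = 0xbb528000pa = 0x4e87d000(0x4e87d)")

# The message has no separator between the va value and "pa", so match on
# hex digits only; "p" is not a hex digit and ends the first number.
m = re.search(r"va = (0x[0-9a-f]+)pa = (0x[0-9a-f]+)\((0x[0-9a-f]+)\)", msg)
va, pa, pfn = (int(g, 16) for g in m.groups())

print(f"va  = {va:#010x}")
print(f"pa  = {pa:#010x} ({pa / 2**20:.1f} MiB)")
# Assumed relationship: the third value is pa shifted right by PAGE_SHIFT.
print(f"pfn = {pfn:#x}, matches pa >> {PAGE_SHIFT}: {pa >> PAGE_SHIFT == pfn}")
```

This only restates what the message already contains; it says nothing about why the page was untracked.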
>How-To-Repeat:
No precise reproduction steps known.
>Fix:
No known fix. I plan to downgrade to 10.1 and bisect, but on this hardware and with the current failures it is unclear whether I can do this without breaking the installation.



