NetBSD-Bugs archive


Re: kern/59339: heartbeat watchdog fires since 10.99.14



The following reply was made to PR kern/59339; it has been noted by GNATS.

From: Patrick Welche <prlw1%welche.eu@localhost>
To: gnats-bugs%netbsd.org@localhost, matthew green <mrg%eterna23.net@localhost>
Cc: 
Subject: Re: kern/59339: heartbeat watchdog fires since 10.99.14
Date: Tue, 22 Apr 2025 10:29:26 +0100

 On Mon, Apr 21, 2025 at 10:00:02PM +0000, matthew green via gnats wrote:
 > The following reply was made to PR kern/59339; it has been noted by GNATS.
 > 
 > From: matthew green <mrg%eterna23.net@localhost>
 > To: gnats-bugs%netbsd.org@localhost, prlw1%cam.ac.uk@localhost
 > Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
 >     netbsd-bugs%netbsd.org@localhost
 > Subject: re: kern/59339: heartbeat watchdog fires since 10.99.14
 > Date: Tue, 22 Apr 2025 07:54:55 +1000
 > 
 >  > System panicked: cpu0: softints stuck for 16 seconds
 >  
 >  this means cpu0 is locked up, and some other cpu detected it and
 >  crashed.  the stack below is not the interesting cpu, but you
 >  found the relevant LWPs to inspect:
 >  
 >  > crash> bt
 >  > end() at 0
 >  > kern_reboot() at kern_reboot+0x93
 >  > vpanic() at vpanic+0x16b
 >  > panic() at vprintf
 >  > heartbeat() at heartbeat+0x1f2
 >  > hardclock() at hardclock+0x9c
 >  > Xresume_lapic_ltimer() at Xresume_lapic_ltimer+0x1e
 >  > --- interrupt ---
 >  > mutex_spin_exit() at mutex_spin_exit+0x5a
 >  > callout_softclock() at callout_softclock+0xad
 >  > softint_dispatch() at softint_dispatch+0x8f
 >  > crash> ps
 >  > PID     LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
 >  > 2917 > 2917 7   0   8060000   ffff8052e4a14000                tar
 >  > 0    >    5 7   0       200   ffff8055abee1c00          softclk/0
 >  
 >  can you do "bt/a ffff8052e4a14000" and "bt/a ffff8055abee1c00"?
 
 tar:
 crash> bt/a ffff8052e4a14000
 crash: _kvm_kvatop(7f7fffffe068)
 crash: kvm_read(0x7f7fffffe068, 8): invalid translation (invalid level 4 PDE)
 trace: pid 2917 lid 2917
 
 softclk:
 crash> bt/a ffff8055abee1c00
 trace: pid 0 lid 5 at 0xffffc404a4608f90
 callout_softclock() at callout_softclock+0xad
 softint_dispatch() at softint_dispatch+0x8f
 
 >  or with the other crash, any process on the cpu reported (always
 >  cpu0, i think?) with the ">" state like above (ie, running.)
 
 cron:
 crash> bt/a ffff81f02e1a7800
 trace: pid 1266 lid 1266 at 0xffff9704b6f54e90
 sys___nanosleep50() at sys___nanosleep50+0x46
 syscall() at syscall+0x95
 --- syscall (number 430) ---
 syscall+0x95:
 
 X:
 crash> bt/a ffff81f026788c00
 trace: pid 992 lid 992 at 0xffff9704b65847a0
 uvm_pglistalloc() at uvm_pglistalloc+0x9de
 _bus_dmamem_alloc_range.constprop.0() at _bus_dmamem_alloc_range.constprop.0+0x64
 bus_dmamem_alloc() at bus_dmamem_alloc+0x78
 i915_gem_object_get_pages_internal() at i915_gem_object_get_pages_internal+0xad
 ____i915_gem_object_get_pages() at ____i915_gem_object_get_pages+0x1f
 __i915_gem_object_get_pages() at __i915_gem_object_get_pages+0x45
 pool_active() at pool_active+0x68
 i915_active_acquire() at i915_active_acquire+0x7d
 intel_engine_get_pool() at intel_engine_get_pool+0x223
 i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0xc71
 i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x1be
 drm_ioctl() at drm_ioctl+0x241
 drm_ioctl_shim() at drm_ioctl_shim+0x30
 sys_ioctl() at sys_ioctl+0x4ae
 syscall() at syscall+0x95
 --- syscall (number 54) ---
 syscall+0x95:
 
 xdm:
 crash> bt/a ffff81f027510400
 trace: pid 857 lid 857 at 0x8
 _KERNEL_OPT_DDB_HISTORY_SIZE() at _KERNEL_OPT_DDB_HISTORY_SIZE+0x46
 crash: _kvm_kvatop(10)
 crash: kvm_read(0x10, 8): invalid translation (invalid level 4 PDE)
 
 softclk:
 crash> bt/a ffff81f020f56400
 trace: pid 0 lid 27 at 0x680692b5
 __LARGE_PAGE_SIZE() at 6aa46bf
 crash: _kvm_kvatop(680692bd)
 crash: kvm_read(0x680692bd, 8): invalid translation (invalid level 4 PDE)
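
 The manual step requested above (pick every LWP marked ">" on the stuck
 CPU in the "ps" listing, then run "bt/a" on its struct lwp address) could
 be scripted roughly as follows. This is only a sketch: it assumes the
 column layout seen in the "ps" output quoted in this thread, with the
 ">" marker as a separate token between PID and LID. The helper name
 running_lwps and the SAMPLE text are illustrative and not part of crash(8).

```python
def running_lwps(ps_output, cpu):
    """Return (name, lwp_address) for each LWP marked '>' on the given CPU.

    Assumes the layout quoted in this thread:
    PID > LID S CPU FLAGS STRUCT-LWP-ADDRESS NAME [WAIT]
    """
    found = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Only lines with a standalone '>' after the PID are on-CPU LWPs;
        # this also skips the header line and blank lines.
        if len(fields) < 8 or fields[1] != ">":
            continue
        _pid, _marker, _lid, _state, lwp_cpu, _flags, addr, name = fields[:8]
        if int(lwp_cpu) == cpu:
            found.append((name, addr))
    return found


# Sample copied from the "ps" output quoted earlier in this message.
SAMPLE = """\
PID     LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
2917 > 2917 7   0   8060000   ffff8052e4a14000                tar
0    >    5 7   0       200   ffff8055abee1c00          softclk/0
"""

# Emit the commands to paste into crash(8) for the CPU named in the panic.
for name, addr in running_lwps(SAMPLE, cpu=0):
    print(f"{name}: bt/a {addr}")
```

 The printed lines are meant to be pasted at the crash> prompt; the sketch
 does not invoke crash(8) itself.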
 

