NetBSD-Bugs archive
kern/58745: nouveau triggered assert in linux_dma_fence.c
>Number: 58745
>Category: kern
>Synopsis: nouveau triggered assert in linux_dma_fence.c
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Oct 12 01:25:00 +0000 2024
>Originator: matthew green
>Release: NetBSD 10.99.12
>Organization:
people's front against (bozotic) www (softwar foundation)
>Environment:
System: NetBSD aches.eterna23.net 10.99.12 NetBSD 10.99.12 (_aches_) #52: Sat Sep 14 15:08:46 CDT 2024 mrg%aches.eterna23.net@localhost:/var/obj/amd64-x86_64/usr/src/sys/arch/amd64/compile/_aches_ amd64
Architecture: amd64
>Description:
on a ryzen 5600G system that was recently moved from a radeon
hd 6450 to an nvidia GT 730 (the 6450 does not work with two
larger monitors), i saw this while playing a video:
[ 312382.0409707] nouveau0: autoconfiguration error: error: fifo: fault 01 [WRITE] at 00000000082e0000 engine 1b [CE2] client 18 [HUB/GR_CE] reason 02 [PTE] on channel 2 [007f952000 user]
[ 312382.0409707] nouveau0: notice: fifo: channel 2: killed
[ 312382.0409707] nouveau0: notice: fifo: runlist 0: scheduled for recovery
[ 312382.0409707] nouveau0: warn: user: channel 2 killed!
[ 312382.0409707] nouveau0: notice: fifo: engine 0: scheduled for recovery
[ 312382.0409707] nouveau0: notice: fifo: engine 7: scheduled for recovery
at this point, X crashed. i've seen this before on 710, 730,
and 1030 cards and been able to restart, so i restarted X, but
it then died immediately:
[ 313138.7008055] nouveau0: autoconfiguration error: error: gr: TRAP ch 7 [007f7bc000 user]
[ 313138.7008055] nouveau0: autoconfiguration error: error: gr: GPC0/TPC0/TEX: 80000041
[ 313138.7008055] nouveau0: autoconfiguration error: error: gr: GPC0/TPC1/TEX: 80000041
[ 313138.7008055] nouveau0: autoconfiguration error: error: fifo: fault 00 [READ] at 0000000000ed2000 engine 00 [GR] client 04 [GPC0/T1_1] reason 02 [PTE] on channel 7 [007f7bc000 user]
[ 313138.7008055] nouveau0: notice: fifo: channel 7: killed
[ 313138.7008055] nouveau0: notice: fifo: runlist 0: scheduled for recovery
[ 313138.7008055] nouveau0: notice: fifo: engine 0: scheduled for recovery
[ 313138.7008055] nouveau0: warn: user: channel 7 killed!
and then:
panic: kernel diagnostic assertion "(atomic_load_relaxed(&fence->flags) & (1u << DMA_FENCE_FLAG_SIGNALED_BIT)) == 0" failed: file "/usr/src/sys/external/bsd/drm2/linux/linux_dma_fence.c", line 696
it may be that the assert is checking something that linux does
not enforce, and that this is an uncommon error code path we
haven't seen before. i have previously had this error fail to
restart and need a reboot to work properly, but it has never
crashed before. eg, perhaps the fence was signaled _twice_ in the
error case, and the second one is triggering the assert.
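to make the suspected sequence concrete, here is a minimal userland sketch of it. it is not the drm2 code: every sketch_* name is hypothetical, the struct is a stand-in for the real fence, and it assumes the kill path sets an error on each pending fence and then signals it. the check in sketch_fence_can_set_error() mirrors the condition in the failed assertion, but returns a result instead of panicking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the NetBSD drm2 definitions. */
#define SKETCH_FENCE_FLAG_SIGNALED_BIT 0

struct sketch_fence {
	unsigned long flags;	/* signaled bit is set once the fence fires */
	int error;		/* 0, or a negative errno-style value */
};

/* Same shape as the asserted condition: the fence is not yet signaled. */
bool
sketch_fence_can_set_error(const struct sketch_fence *f)
{
	return (f->flags & (1ul << SKETCH_FENCE_FLAG_SIGNALED_BIT)) == 0;
}

/* Mark the fence signaled; returns false if it already was. */
bool
sketch_fence_signal(struct sketch_fence *f)
{
	if (!sketch_fence_can_set_error(f))
		return false;
	f->flags |= 1ul << SKETCH_FENCE_FLAG_SIGNALED_BIT;
	return true;
}

/*
 * Assumed kill path: set an error on every fence, then signal it.
 * Returns how many fences were already signaled when the error was
 * about to be set, i.e. how many would have tripped the assertion.
 */
int
sketch_context_kill(struct sketch_fence *fences, size_t n, int error)
{
	int tripped = 0;

	assert(fences != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (sketch_fence_can_set_error(&fences[i]))
			fences[i].error = error;
		else
			tripped++;
		(void)sketch_fence_signal(&fences[i]);
	}
	return tripped;
}
```

in this model, the first kill of a set of unsignaled fences trips nothing, while a second kill over the same fences finds every one already signaled. whether the real driver can reach that state depends on whether a signaled fence can still be on the pending list when the channel is killed, which this sketch does not establish.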
i have a core file and a netbsd.gdb for this one. the backtrace
is not especially interesting i think:
vpanic() at vpanic+0x17b
kern_assert() at __x86_indirect_thunk_rax
linux_dma_fence_set_error() at linux_dma_fence_set_error+0x160
nouveau_fence_context_kill() at nouveau_fence_context_kill+0x44
nouveau_channel_killed() at nouveau_channel_killed+0x5c
nvif_notify_work() at nvif_notify_work+0x2b
linux_workqueue_thread() at linux_workqueue_thread+0x154
>How-To-Repeat:
>Fix: