NetBSD-Bugs archive
Re: kern/60029: panic: cpu0: softints stuck for 16 seconds
The following reply was made to PR kern/60029; it has been noted by GNATS.
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Thomas Klausner <wiz%NetBSD.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost, chs%NetBSD.org@localhost
Subject: Re: kern/60029: panic: cpu0: softints stuck for 16 seconds
Date: Sun, 22 Feb 2026 19:24:59 +0000
> Date: Sun, 22 Feb 2026 14:02:33 +0100
> From: Thomas Klausner <wiz%netbsd.org@localhost>
>
> 0 > 935 7 16 40200 ffff8fb6be107400 pgdaemon
> [...]
> 0 > 4 7 0 200 ffff8fc60b25f800 softbio/0
These two look suspicious.
Now, I think the stack traces won't be reliable, because the threads in
question are running on a CPU. What we really want is `mach cpu N'
followed by `bt', but crash(8) doesn't support `mach cpu N' to examine
the registers/stack of other CPUs (PR bin/58010). Just in case, here's
the stack trace for softbio/0:
> crash> bt/a ffff8fc60b25f800
> trace: pid 0 lid 4 at 0xffffa72469e83fa0
> uvm_aio_aiodone() at uvm_aio_aiodone+0xbe
> dkiodone() at dkiodone+0xa8
> lddone() at lddone+0xf
> nvme_q_complete() at nvme_q_complete+0xf2
> softint_dispatch() at softint_dispatch+0x112
If you have netbsd.gdb, can you get the output of `info line
*(uvm_aio_aiodone+0xbe)' in gdb from it?
And can you get the stack trace for pgdaemon too?
crash> bt/a ffff8fb6be107400
My guess is the softbio thread might be waiting for this lock:
   528 	if (write && (bp->b_cflags & BC_AGE) != 0) {
-> 529 		mutex_enter(bp->b_objlock);
   530 		vwakeup(bp);
   531 		mutex_exit(bp->b_objlock);
   532 	}
https://nxr.netbsd.org/xref/src/sys/uvm/uvm_pager.c?r=1.131#529
This is probably either buffer_lock or vp->v_interlock for some vnode.
In case it's buffer_lock, can you show this?
crash> x/Lx buffer_lock
(We should really record in struct lwp or struct cpu_info what lock a
thread on the CPU is spinning for, so that we can just find who holds
it right now.)
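
To make that idea concrete, here is a minimal userland sketch of what
such bookkeeping might look like. It is not NetBSD code: struct
dbg_lock, struct dbg_cpu and every dbg_* function are hypothetical
names invented for illustration, and a real implementation would have
to live in the mutex(9) slow path and in struct cpu_info or struct lwp.
The only point is that if each CPU publishes the lock it is spinning
for, and each lock records its holder, a debugger can go from "stuck
CPU" to "holding CPU" in two loads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPU     4
#define NO_OWNER (-1)

/* Hypothetical spin lock that remembers which CPU holds it. */
struct dbg_lock {
	atomic_int owner;		/* CPU index of holder, or NO_OWNER */
};

/* Hypothetical per-CPU record (stand-in for a field in struct cpu_info). */
struct dbg_cpu {
	struct dbg_lock *_Atomic spinning_on;	/* lock being spun for, or NULL */
};

static struct dbg_cpu cpus[NCPU];

static void
dbg_lock_init(struct dbg_lock *l)
{
	atomic_init(&l->owner, NO_OWNER);
}

/* One acquisition attempt: returns 1 on success, 0 if the lock is held. */
static int
dbg_lock_try(struct dbg_lock *l, int cpu)
{
	int expected = NO_OWNER;

	return atomic_compare_exchange_strong(&l->owner, &expected, cpu);
}

/* Publish / clear the lock that `cpu' is about to spin for. */
static void
dbg_spin_begin(int cpu, struct dbg_lock *l)
{
	assert(cpu >= 0 && cpu < NCPU);
	atomic_store(&cpus[cpu].spinning_on, l);
}

static void
dbg_spin_end(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	atomic_store(&cpus[cpu].spinning_on, NULL);
}

/* Acquire the lock, recording what we spin for only on the slow path. */
static void
dbg_lock_enter(struct dbg_lock *l, int cpu)
{
	if (dbg_lock_try(l, cpu))
		return;
	dbg_spin_begin(cpu, l);
	while (!dbg_lock_try(l, cpu))
		continue;
	dbg_spin_end(cpu);
}

static void
dbg_lock_exit(struct dbg_lock *l)
{
	atomic_store(&l->owner, NO_OWNER);
}

/* Debugger's view: which CPU holds the lock that `cpu' is spinning for? */
static int
dbg_blocker_of(int cpu)
{
	struct dbg_lock *l = atomic_load(&cpus[cpu].spinning_on);

	return l == NULL ? NO_OWNER : atomic_load(&l->owner);
}
```

With something like this, the question in this PR would reduce to
calling the equivalent of dbg_blocker_of(0) for cpu0 from the debugger,
instead of guessing at the holder from the list of LWPs on CPUs.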
Since they're not _sleeping_ on a wait-channel, whoever holds the lock
(whether it's buffer_lock or vp->v_interlock) is probably running on
another CPU. That narrows it down to:
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
20857>20857 7 17 8060000 ffff8fa8dbeac000 moc
2482 > 2482 7 23 8060000 ffff8fb50065c800 moc
26437>26437 7 9 8060000 ffff8fb50065c000 moc
12057>12057 7 31 8020000 ffff8fbf931ab000 moc
18565>18565 7 21 8060000 ffff8fc129ed9c00 moc
12544>12544 7 22 8060000 ffff8fc129ed9000 moc
13508>13508 7 2 8060000 ffff8fb908ee8800 moc
3522 > 3522 7 20 8020000 ffff8fc1d885ac00 moc
23657>23657 7 3 8020000 ffff8fb90950c800 moc
3031 > 3031 7 28 8060000 ffff8fb10d2b3400 moc
13759>13759 7 5 8020000 ffff8faac5fec000 moc
25301>25301 7 15 8060000 ffff8fc2d6fb1400 moc
13934>13934 7 14 8060000 ffff8fb8f1b68800 moc
9619 > 9619 7 11 8060000 ffff8fb8f1b68000 moc
11474>11474 7 13 8020000 ffff8fb7c17d2c00 moc
23339>23339 7 29 8060000 ffff8fb7b9e4cc00 moc
5510 > 5510 7 10 8060000 ffff8fc36cf3ec00 moc
2567 > 2567 7 27 8020000 ffff8fc331b27c00 moc
29924>29924 7 4 8020000 ffff8fc561094c00 moc
20557>20557 7 26 8020000 ffff8fc561094800 moc
23674>23674 7 6 8060000 ffff8fc38a557000 moc
27935>27935 7 25 8060000 ffff8faaa08c1000 moc
19811>19811 7 7 8060000 ffff8fbdc9009800 moc
14383>14383 7 18 8060000 ffff8fbac25c2000 moc
19788>19788 7 19 8060000 ffff8fc08f854c00 moc
11290>11290 7 24 8060000 ffff8fb0ea2d1800 moc
28134>28134 7 30 8020000 ffff8fc2010df800 moc
27882>27882 7 8 8060000 ffff8fbd4892ac00 moc
21709>21709 7 12 8060000 ffff8fb7cd8e5400 moc
29702>29702 7 1 8020000 ffff8fb90a69dc00 moc
0 > 935 7 16 40200 ffff8fb6be107400 pgdaemon
0 > 4 7 0 200 ffff8fc60b25f800 softbio/0
Could try getting the stack traces for the moc processes but they're
probably in userland without any locks held...
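
The narrowing step above (keep only the LWPs that are on a CPU) can be
sketched as a small filter over the text of the LWP listing. This is
an illustrative helper, not part of crash(8): struct lwp_row,
parse_lwp_row and filter_on_cpu are hypothetical names, the column
layout is assumed to be "PID>LID S CPU FLAGS LWP NAME" as in the
listing, and S == 7 is assumed to be the on-CPU state because that is
the value every row in the listing carries.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STATE_ONPROC 7	/* assumption: S == 7 means "on a CPU" */

/* One row of the LWP listing (hypothetical struct, illustration only). */
struct lwp_row {
	int pid, lid, state, cpu;
	unsigned long long flags, addr;	/* both printed in hex */
	char name[32];
};

/* Parse "PID>LID S CPU FLAGS LWP NAME": returns 1 on success, else 0. */
static int
parse_lwp_row(const char *line, struct lwp_row *r)
{
	/* Blanks in the format match zero or more blanks around '>'. */
	return sscanf(line, "%d > %d %d %d %llx %llx %31s",
	    &r->pid, &r->lid, &r->state, &r->cpu,
	    &r->flags, &r->addr, r->name) == 7;
}

/* Copy the rows that are on a CPU into out[]; returns how many were kept. */
static size_t
filter_on_cpu(const char *const *lines, size_t n,
    struct lwp_row *out, size_t max)
{
	size_t kept = 0;

	assert(out != NULL);
	for (size_t i = 0; i < n && kept < max; i++) {
		struct lwp_row r;

		/* Header lines and malformed rows fail to parse and are skipped. */
		if (parse_lwp_row(lines[i], &r) && r.state == STATE_ONPROC)
			out[kept++] = r;
	}
	return kept;
}
```

Feeding it the full listing and then dropping rows whose pid is
nonzero would leave only the kernel threads, which here are pgdaemon
and softbio/0.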