NetBSD-Bugs archive


Re: kern/58043: kernel crash in assert_sleepable() in -current, dk(4) driver?



On Sat, 16 Mar 2024, Taylor R Campbell wrote:

Annoyingly, the part of the stack trace we really want here -- the
part which would tell us where something called pool_cache_get(_paddr)
-- has been obscured:

	assert_sleepable() at assert_sleepable+0x99
	pool_cache_get_paddr() at pool_cache_get_paddr+0x13c
	end() at ffffffff813ad275
	bdev_strategy() at bdev_strategy+0x81

My best guess from the rest of the stack trace:

	spec_strategy() at spec_strategy+0x6e
	VOP_STRATEGY() at VOP_STRATEGY+0x3c
	dkstart() at dkstart+0x13e
	dkiodone() at dkiodone+0xa6
	lddone() at lddone+0x10
	nvme_q_complete() at nvme_q_complete+0xff

is that the missing part looks something like this:

nvme_ns_dobio
ld_nvme_start
ld_diskstart
dk_start (note: not dkstart)
dk_strategy
ldstrategy

There's a call to bus_dmamap_load here which looks like, in this stack
trace, it will pass BUS_DMA_WAITOK because ld_nvme_start doesn't pass
NVME_NS_CTX_F_POLL.  I wonder whether this should unconditionally pass
BUS_DMA_NOWAIT instead?  After all, the dmamap is created with
BUS_DMA_ALLOCNOW so maybe there should be no need for allocation here.
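To make the two options concrete, here is a minimal user-space sketch. It is not the nvme(4) driver code: the flag values are placeholders chosen for illustration, and the helper names load_flags_current and load_flags_proposed are hypothetical. It only contrasts choosing the wait flag from the POLL bit with passing BUS_DMA_NOWAIT unconditionally.

```c
#include <assert.h>

/*
 * Placeholder values for illustration only; the real definitions live in
 * the NetBSD kernel headers and may differ.
 */
#define BUS_DMA_WAITOK		0x0
#define BUS_DMA_NOWAIT		0x1
#define NVME_NS_CTX_F_POLL	0x2

/* Hypothetical model of the behaviour described above: wait unless polling. */
static int
load_flags_current(int ctx_flags)
{
	return (ctx_flags & NVME_NS_CTX_F_POLL) ?
	    BUS_DMA_NOWAIT : BUS_DMA_WAITOK;
}

/* Hypothetical model of the suggested change: never ask to wait. */
static int
load_flags_proposed(int ctx_flags)
{
	(void)ctx_flags;	/* the caller's flags no longer matter */
	return BUS_DMA_NOWAIT;
}
```

In this model a caller that does not set the POLL bit, as ld_nvme_start is described above, gets BUS_DMA_WAITOK from the first helper and BUS_DMA_NOWAIT from the second.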

(And I wonder whether maybe bus_dmamap_load should assert_sleepable if
you pass BUS_DMA_WAITOK, to shake out more of these paths early.)
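
The idea in the parenthetical can be sketched in user space as well. Everything below is a hypothetical model: model_can_sleep stands in for whatever the kernel uses to decide that the current context may sleep, model_assert_sleepable counts violations where the real assert_sleepable panics, and the flag values are placeholders.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for illustration; not taken from the kernel headers. */
#define BUS_DMA_WAITOK	0x0
#define BUS_DMA_NOWAIT	0x1

/* Stand-in for "is the caller in a context that is allowed to sleep?" */
static bool model_can_sleep = true;
/* Number of times a sleepable-context check has failed in this model. */
static int model_violations = 0;

/* Model of the check: record a violation instead of panicking. */
static void
model_assert_sleepable(void)
{
	if (!model_can_sleep)
		model_violations++;
}

/*
 * Model of a load routine that checks up front whenever the caller did
 * not pass BUS_DMA_NOWAIT, even if this particular call would not have
 * needed to allocate or sleep.
 */
static int
model_dmamap_load(int flags)
{
	if ((flags & BUS_DMA_NOWAIT) == 0)
		model_assert_sleepable();
	return 0;
}
```

The point of checking on entry is that a WAITOK call from a non-sleepable context is flagged every time, not only on the rare occasion when an allocation actually has to wait.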

I thought the stack trace looked like it wasn't complete!  Here is the
backtrace using gdb - perhaps more useful?

#0  0xffffffff80239b95 in cpu_reboot (howto=howto@entry=256,
    bootstr=bootstr@entry=0x0)
    at /build/netbsd-local/src_ro/sys/arch/amd64/amd64/machdep.c:708
#1  0xffffffff806a84f5 in kern_reboot (howto=howto@entry=256,
    bootstr=bootstr@entry=0x0)
    at /build/netbsd-local/src_ro/sys/kern/kern_reboot.c:91
#2  0xffffffff80588d23 in db_sync_cmd (addr=<optimized out>,
    have_addr=<optimized out>, count=<optimized out>, modif=<optimized out>)
    at /build/netbsd-local/src_ro/sys/ddb/db_command.c:1651
#3  0xffffffff805894ca in db_command (
    last_cmdp=last_cmdp@entry=0xffffd220dfd9c958)
    at /build/netbsd-local/src_ro/sys/ddb/db_command.c:970
#4  0xffffffff80589abf in db_execute_commandlist (
    cmdlist=0xffffffff80e353e0 <db_cmd_on_enter> "bt; show reg; sync")
    at /build/netbsd-local/src_ro/sys/ddb/db_command.c:466
#5  db_command_loop () at /build/netbsd-local/src_ro/sys/ddb/db_command.c:618
#6  0xffffffff8058dc98 in db_trap (type=type@entry=1, code=code@entry=0)
    at /build/netbsd-local/src_ro/sys/ddb/db_trap.c:91
#7  0xffffffff80236a54 in kdb_trap (type=type@entry=1, code=code@entry=0,
    regs=regs@entry=0xffffd220dfd9cc10)
    at /build/netbsd-local/src_ro/sys/arch/amd64/amd64/db_interface.c:251
#8  0xffffffff8023c066 in trap (frame=0xffffd220dfd9cc10)
    at /build/netbsd-local/src_ro/sys/arch/amd64/amd64/trap.c:314
#9  0xffffffff80234a24 in alltraps ()
#10 0xffffffff80235365 in breakpoint ()
#11 0xffffffff806ef1be in vpanic (
    fmt=fmt@entry=0xffffffff80b34a1b "%s: %s caller=%p",
    ap=ap@entry=0xffffd220dfd9cd48)
    at /build/netbsd-local/src_ro/sys/kern/subr_prf.c:286
#12 0xffffffff806ef29d in panic (
    fmt=fmt@entry=0xffffffff80b34a1b "%s: %s caller=%p")
    at /build/netbsd-local/src_ro/sys/kern/subr_prf.c:209
#13 0xffffffff8069349d in assert_sleepable ()
    at /build/netbsd-local/src_ro/sys/kern/kern_lock.c:109
#14 0xffffffff806ec0e7 in pool_cache_get_paddr (pc=0xfffff7cf1a829540,
    flags=1, pap=0x0) at /build/netbsd-local/src_ro/sys/kern/subr_pool.c:2721
#15 0xffffffff813ad275 in ?? ()
#16 0x000000000000003a in ?? ()
#17 0x000000009662dc80 in ?? ()
#18 0xfffff7cf1a6644e8 in ?? ()
#19 0x000000009662dcba in ?? ()
#20 0xffffd220bf420000 in ?? ()
#21 0x0000000000001000 in ?? ()
#22 0xffffd220dfd9ce70 in ?? ()
#23 0x0000000000000100 in ?? ()
#24 0xfffff7cf1b4a85c0 in ?? ()
#25 0xffffffff813a84e0 in ?? ()
#26 0xfffff7cf1a35b478 in ?? ()
#27 0xfffff7cf1a35b360 in ?? ()
#28 0xffffd220dfd9ced0 in ?? ()
#29 0xffffffff806d8331 in bdev_strategy (bp=0xfffff7cf1b04be80)
    at /build/netbsd-local/src_ro/sys/kern/subr_devsw.c:1267
Backtrace stopped: frame did not save the PC

Can you try the attached patch?

I will test this out later today and report back.

+---------------------+--------------------------+--------------------------------+
| Paul Goyette (.sig) | PGP Key fingerprint:     | E-mail addresses:              |
| (Retired)           | 1B11 1849 721C 56C8 F63A | paul%whooppee.com@localhost    |
| Software Developer  | 6E2E 05FD 15CE 9F2D 5102 | pgoyette%netbsd.org@localhost  |
| & Network Engineer  |                          | pgoyette99%gmail.com@localhost |
+---------------------+--------------------------+--------------------------------+

