NetBSD-Bugs archive


Re: kern/58043: kernel crash in assert_sleepable() in -current, dk(4) driver?



The following reply was made to PR kern/58043; it has been noted by GNATS.

From: Paul Goyette <paul%whooppee.com@localhost>
To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Cc: gnats-bugs%NetBSD.org@localhost, netbsd-bugs%NetBSD.org@localhost
Subject: Re: kern/58043: kernel crash in assert_sleepable() in -current,
 dk(4) driver?
Date: Sat, 16 Mar 2024 10:18:36 -0700 (PDT)

 On Sat, 16 Mar 2024, Taylor R Campbell wrote:
 
 > Annoyingly, the part of the stack trace we really want here -- the
 > part which would tell us where something called pool_cache_get(_paddr)
 > -- has been obscured:
 >
 >> 	assert_sleepable() at assert_sleepable+0x99
 >> 	pool_cache_get_paddr() at pool_cache_get_paddr+0x13c
 >> 	end() at ffffffff813ad275
 >> 	bdev_strategy() at bdev_strategy+0x81
 >
 > My best guess from the rest of the stack trace:
 >
 >> 	spec_strategy() at spec_strategy+0x6e
 >> 	VOP_STRATEGY() at VOP_STRATEGY+0x3c
 >> 	dkstart() at dkstart+0x13e
 >> 	dkiodone() at dkiodone+0xa6
 >> 	lddone() at lddone+0x10
 >> 	nvme_q_complete() at nvme_q_complete+0xff
 >
 > is that the missing part looks something like this:
 >
 > nvme_ns_dobio
 > ld_nvme_start
 > ld_diskstart
 > dk_start (note: not dkstart)
 > dk_strategy
 > ldstrategy
 >
 > There's a call to bus_dmamap_load here which looks like, in this stack
 > trace, it will pass BUS_DMA_WAITOK because ld_nvme_start doesn't pass
 > NVME_NS_CTX_F_POLL.  I wonder whether this should unconditionally pass
 > BUS_DMA_NOWAIT instead?  After all, the dmamap is created with
 > BUS_DMA_ALLOCNOW so maybe there should be no need for allocation here.
 >
 > (And I wonder whether maybe bus_dmamap_load should assert_sleepable if
 > you pass BUS_DMA_WAITOK, to shake out more of these paths early.)
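
The flag choice discussed in the quoted text can be modelled in a few lines of userland C. This is only a sketch under assumptions, not the driver code: the MODEL_* constants and all three function names are hypothetical stand-ins for BUS_DMA_WAITOK, BUS_DMA_NOWAIT and NVME_NS_CTX_F_POLL, and "can_sleep" stands for whatever assert_sleepable() checks. It shows why a load requested without the poll flag is unsafe from a completion path such as nvme_q_complete(), and why unconditionally asking for no-wait behaviour would avoid that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel constants named in the thread. */
enum { MODEL_DMA_WAITOK = 0, MODEL_DMA_NOWAIT = 1 };
enum { MODEL_CTX_F_POLL = 1 };

/* Behaviour described in the quoted text: wait unless the caller polls. */
static int
dma_flags_current(int ctx_flags)
{
	return (ctx_flags & MODEL_CTX_F_POLL) ?
	    MODEL_DMA_NOWAIT : MODEL_DMA_WAITOK;
}

/* Behaviour suggested in the quoted text: never wait. */
static int
dma_flags_proposed(int ctx_flags)
{
	(void)ctx_flags;	/* the context flags no longer matter */
	return MODEL_DMA_NOWAIT;
}

/*
 * A load that is allowed to wait is only safe where sleeping is
 * permitted; a no-wait load is safe in either kind of context.
 */
static bool
load_is_safe(int dma_flags, bool can_sleep)
{
	return dma_flags == MODEL_DMA_NOWAIT || can_sleep;
}
```

In this model, the call chain from the trace corresponds to load_is_safe(dma_flags_current(0), false), which is false; the proposed variant returns true for the same inputs.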
 
 I thought the stack trace looked incomplete!  Here is the backtrace
 from gdb - perhaps it is more useful?
 
 #0  0xffffffff80239b95 in cpu_reboot (howto=howto@entry=256,
      bootstr=bootstr@entry=0x0)
      at /build/netbsd-local/src_ro/sys/arch/amd64/amd64/machdep.c:708
 #1  0xffffffff806a84f5 in kern_reboot (howto=howto@entry=256,
      bootstr=bootstr@entry=0x0)
      at /build/netbsd-local/src_ro/sys/kern/kern_reboot.c:91
 #2  0xffffffff80588d23 in db_sync_cmd (addr=<optimized out>,
      have_addr=<optimized out>, count=<optimized out>, modif=<optimized out>)
      at /build/netbsd-local/src_ro/sys/ddb/db_command.c:1651
 #3  0xffffffff805894ca in db_command (
      last_cmdp=last_cmdp@entry=0xffffd220dfd9c958)
      at /build/netbsd-local/src_ro/sys/ddb/db_command.c:970
 #4  0xffffffff80589abf in db_execute_commandlist (
      cmdlist=0xffffffff80e353e0 <db_cmd_on_enter> "bt; show reg; sync")
      at /build/netbsd-local/src_ro/sys/ddb/db_command.c:466
 #5  db_command_loop () at /build/netbsd-local/src_ro/sys/ddb/db_command.c:618
 #6  0xffffffff8058dc98 in db_trap (type=type@entry=1, code=code@entry=0)
      at /build/netbsd-local/src_ro/sys/ddb/db_trap.c:91
 #7  0xffffffff80236a54 in kdb_trap (type=type@entry=1, code=code@entry=0,
      regs=regs@entry=0xffffd220dfd9cc10)
      at /build/netbsd-local/src_ro/sys/arch/amd64/amd64/db_interface.c:251
 #8  0xffffffff8023c066 in trap (frame=0xffffd220dfd9cc10)
      at /build/netbsd-local/src_ro/sys/arch/amd64/amd64/trap.c:314
 #9  0xffffffff80234a24 in alltraps ()
 #10 0xffffffff80235365 in breakpoint ()
 #11 0xffffffff806ef1be in vpanic (
      fmt=fmt@entry=0xffffffff80b34a1b "%s: %s caller=%p",
      ap=ap@entry=0xffffd220dfd9cd48)
      at /build/netbsd-local/src_ro/sys/kern/subr_prf.c:286
 #12 0xffffffff806ef29d in panic (
      fmt=fmt@entry=0xffffffff80b34a1b "%s: %s caller=%p")
      at /build/netbsd-local/src_ro/sys/kern/subr_prf.c:209
 #13 0xffffffff8069349d in assert_sleepable ()
      at /build/netbsd-local/src_ro/sys/kern/kern_lock.c:109
 #14 0xffffffff806ec0e7 in pool_cache_get_paddr (pc=0xfffff7cf1a829540,
      flags=1, pap=0x0)
      at /build/netbsd-local/src_ro/sys/kern/subr_pool.c:2721
 #15 0xffffffff813ad275 in ?? ()
 #16 0x000000000000003a in ?? ()
 #17 0x000000009662dc80 in ?? ()
 #18 0xfffff7cf1a6644e8 in ?? ()
 #19 0x000000009662dcba in ?? ()
 #20 0xffffd220bf420000 in ?? ()
 #21 0x0000000000001000 in ?? ()
 #22 0xffffd220dfd9ce70 in ?? ()
 #23 0x0000000000000100 in ?? ()
 #24 0xfffff7cf1b4a85c0 in ?? ()
 #25 0xffffffff813a84e0 in ?? ()
 #26 0xfffff7cf1a35b478 in ?? ()
 #27 0xfffff7cf1a35b360 in ?? ()
 #28 0xffffd220dfd9ced0 in ?? ()
 #29 0xffffffff806d8331 in bdev_strategy (bp=0xfffff7cf1b04be80)
      at /build/netbsd-local/src_ro/sys/kern/subr_devsw.c:1267
 Backtrace stopped: frame did not save the PC
 
 > Can you try the attached patch?
 
 I will test this out later today and report back.
 
 +---------------------+--------------------------+----------------------+
 | Paul Goyette (.sig) | PGP Key fingerprint:     | E-mail addresses:    |
 | (Retired)           | 1B11 1849 721C 56C8 F63A | paul%whooppee.com@localhost    |
 | Software Developer  | 6E2E 05FD 15CE 9F2D 5102 | pgoyette%netbsd.org@localhost  |
 | & Network Engineer  |                          | pgoyette99%gmail.com@localhost |
 +---------------------+--------------------------+----------------------+
 

