NetBSD-Bugs archive

kern/55839: nvme(4) panic on amd64 9.99.76 when loaded as a module

>Number:         55839
>Category:       kern
>Synopsis:       nvme(4) panic on amd64 9.99.76 when loaded as a module
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Dec 03 22:40:00 +0000 2020
>Originator:     Paul Goyette
>Release:        NetBSD 9.99.76
>Organization:
| Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:     |
| (Retired)          | FA29 0E3B 35AF E8AE 6651 |     |
| Software Developer | 0786 F758 55DE 53BA 7731 |   |
>Environment:
System: NetBSD 9.99.76 NetBSD 9.99.76 (SPEEDY 2020-12-03 16:50:31 UTC) #0: Thu Dec  3 19:39:16 UTC 2020 amd64
Architecture: x86_64
Machine: amd64
>Description:
	This machine contains an nvme(4) device:

	 nvme0 at pci3 dev 0 function 0: Samsung Electronics (3rd vendor ID) product a804 (rev. 0x00)
	 nvme0: NVMe 1.2
	 nvme0: for admin queue interrupting at msix6 vec 0
	 nvme0: Samsung SSD 960 PRO 512GB, firmware 2B6QCXP7, serial S3EWNX0K108171P

	When loading the nvme(4) module using modload(8), I get the
	following crash:

	panic: kernel diagnostic assertion "ns->ident == NULL" failed: file "/build/netbsd-local/src_ro/sys/dev/ic/nvme.c", line 670
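
For readers unfamiliar with this kind of check, the following is a minimal user-space sketch of the precondition that the assertion text "ns->ident == NULL" expresses. It is not the code from sys/dev/ic/nvme.c: the names ns_model, model_ns_identify and model_ns_release are hypothetical, and identifying the same namespace twice is only one assumed way such a precondition could be violated. The actual cause of the panic is not established in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a namespace record; not the driver's struct. */
struct ns_model {
	void *ident;	/* NULL until identify data has been stored */
};

enum {
	MODEL_OK = 0,
	MODEL_ALREADY_IDENTIFIED = 1,	/* where a KASSERT would fire */
	MODEL_NOMEM = 2
};

/*
 * Store identify data for a namespace.  The caller is expected to pass
 * a namespace that has not been identified yet; instead of asserting,
 * this model reports the violated precondition as a return value.
 */
static int
model_ns_identify(struct ns_model *ns)
{
	assert(ns != NULL);

	if (ns->ident != NULL)
		return MODEL_ALREADY_IDENTIFIED;

	ns->ident = calloc(1, 64);	/* 64 is an arbitrary placeholder size */
	if (ns->ident == NULL)
		return MODEL_NOMEM;

	return MODEL_OK;
}

/* Drop the stored identify data so the namespace can be identified again. */
static void
model_ns_release(struct ns_model *ns)
{
	assert(ns != NULL);

	free(ns->ident);
	ns->ident = NULL;
}
```

In this model a second call to model_ns_identify() on the same record, without an intervening model_ns_release(), returns MODEL_ALREADY_IDENTIFIED. That is the situation a diagnostic assertion of this form would turn into a panic.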

	Backtrace shows:

	vpanic() at vpanic+0x156
	__x86_indirect_thunk_rax() at __x86_indirect_thunk_rax
	nvme_ns_identify() at nvme_ns_identify+0x24f
	nvme_rescan() at nvme_rescan+0xc0
	config_cfdata_attach() at config_cfdata_attach+0xc3
	config_init_component() at config_init_component+0x7a
	module_do_load() at module_do_load+0x5c9
	module_load() at module_load+0x85
	handle_modctl_load() at handle_modctl_load+0x157
	sys_modctl() at sys_modctl+0x324
	syscall() at syscall+0x23e
	--- syscall (number 246) ---

	gdb shows

	0x1d0f is in nvme_ns_identify (/build/netbsd-local/src_ro/sys/dev/ic/nvme.c:637).
	632             KASSERT(ccb != NULL); /* it's a bug if we don't have spare ccb here */
	634             mem = nvme_dmamem_alloc(sc, sizeof(*identify));
	635             if (mem == NULL) {
	636                     nvme_ccb_put(sc->sc_admin_q, ccb);
	637                     return ENOMEM;
	638             }
	640             memset(&sqe, 0, sizeof(sqe));
	641             sqe.opcode = NVM_ADMIN_IDENTIFY;

	gdb seems unable to disassemble things, but objdump shows

	0000000000001ac0 <nvme_ns_identify>:
	    1ac0:       55                      push   %rbp
	    1ac1:       48 89 e5                mov    %rsp,%rbp
	    1ac4:       41 57                   push   %r15
	    1ac6:       41 56                   push   %r14
	    1d08:       31 c0                   xor    %eax,%eax
	    1d0a:       e8 00 00 00 00          callq  1d0f <nvme_ns_identify+0x24f>
	                        1d0b: R_X86_64_PLT32    kern_assert-0x4
	    1d0f:       4c 8b 5d 80             mov    -0x80(%rbp),%r11
	    1d13:       e9 2e ff ff ff          jmpq   1c46 <nvme_ns_identify+0x186>

	0000000000001d18 <nvme_rescan>:
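
The listing above can be tied back to the backtrace with a little arithmetic. The callq at 0x1d0a is five bytes long (e8 plus a 32-bit displacement that the R_X86_64_PLT32 relocation fills in for kern_assert), so the return address pushed by the call is 0x1d0f. Relative to the start of nvme_ns_identify at 0x1ac0, that is offset 0x24f, which matches the frame nvme_ns_identify+0x24f. A short C sketch of that calculation follows; the helper and constant names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Values read from the objdump listing in this report. */
#define NS_IDENTIFY_START	UINT64_C(0x1ac0)	/* <nvme_ns_identify> */
#define CALL_INSN_ADDR		UINT64_C(0x1d0a)	/* callq ... kern_assert */
#define CALL_INSN_LEN		UINT64_C(5)		/* e8 + 4-byte displacement */

/* Address of the instruction following a call, i.e. its return address. */
static uint64_t
return_address(uint64_t call_addr, uint64_t call_len)
{
	return call_addr + call_len;
}

/* Offset of an address from the start of its enclosing symbol. */
static uint64_t
symbol_offset(uint64_t addr, uint64_t sym_start)
{
	assert(addr >= sym_start);
	return addr - sym_start;
}
```

With the constants above, return_address(CALL_INSN_ADDR, CALL_INSN_LEN) is 0x1d0f and symbol_offset(0x1d0f, NS_IDENTIFY_START) is 0x24f.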

	(A crash-dump file and kernel-with-symbol-table is available for
	further investigation, if needed.)
>How-To-Repeat:
	Boot a 9.99.76 amd64 kernel built _without_ a built-in nvme module,
	then load the module with modload(8).  (It is unknown whether the
	problem also occurs when nvme is built into the kernel.)
>Fix:
	No fix currently known.

