NetBSD-Bugs archive


kern/56978: nvme hangs under very heavy loads



>Number:         56978
>Category:       kern
>Synopsis:       nvme hangs under very heavy loads
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 24 14:25:00 +0000 2022
>Originator:     Paul Goyette
>Release:        NetBSD 9.99.99
>Organization:
+--------------------+--------------------------+----------------------+
| Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:    |
| (Retired)          | FA29 0E3B 35AF E8AE 6651 | paul%whooppee.com@localhost    |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoyette%netbsd.org@localhost  |
| & Network Engineer |                          | pgoyette99%gmail.com@localhost |
+--------------------+--------------------------+----------------------+
>Environment:
System: NetBSD speedy.whooppee.com 9.99.99 NetBSD 9.99.99 (SPEEDY 2022-08-22 19:31:52 UTC) #0: Tue Aug 23 07:05:11 UTC 2022 paul%speedy.whooppee.com@localhost:/build/netbsd-local/obj/amd64/sys/arch/amd64/compile/SPEEDY amd64
Architecture: x86_64
Machine: amd64
>Description:
	Under very high loads, the nvme driver seems to hang waiting
	for an i/o completion that never happens (or is somehow never
	seen).  Symptoms are zero or one process waiting for i/o
	completion (wchan = biolock), several more processes waiting
	on wchan = biowait, and a generally large number of procs
	hanging in tstile.
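
	For reference, here is a minimal sketch of one way to enumerate
	the wait channels programmatically (the same information ps(1)
	shows).  It assumes libkvm's kvm_getproc2(3) and the
	p_wmesg/p_comm fields of struct kinfo_proc2, so the names
	should be checked against your tree:

		/*
		 * Sketch: list each process's wait channel so the
		 * biolock/biowait/tstile pile-up is easy to spot.
		 * Build with: cc -o wchans wchans.c -lkvm
		 */
		#include <sys/param.h>
		#include <sys/sysctl.h>
		#include <kvm.h>
		#include <limits.h>
		#include <stdio.h>
		#include <stdlib.h>

		int
		main(void)
		{
			char errbuf[_POSIX2_LINE_MAX];
			kvm_t *kd;
			struct kinfo_proc2 *kp;
			int i, cnt;

			/* KVM_NO_FILES: use sysctl, no /dev/mem needed */
			kd = kvm_openfiles(NULL, NULL, NULL, KVM_NO_FILES,
			    errbuf);
			if (kd == NULL) {
				fprintf(stderr, "kvm_openfiles: %s\n", errbuf);
				return EXIT_FAILURE;
			}
			kp = kvm_getproc2(kd, KERN_PROC_ALL, 0,
			    sizeof(struct kinfo_proc2), &cnt);
			if (kp == NULL) {
				fprintf(stderr, "kvm_getproc2: %s\n",
				    kvm_geterr(kd));
				return EXIT_FAILURE;
			}
			for (i = 0; i < cnt; i++) {
				/* p_wmesg is empty unless sleeping */
				if (kp[i].p_wmesg[0] != '\0')
					printf("%6d %-10s %s\n",
					    (int)kp[i].p_pid,
					    kp[i].p_wmesg, kp[i].p_comm);
			}
			kvm_close(kd);
			return EXIT_SUCCESS;
		}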

	Debugging has shown that some number of nvme queues exhibit
	large gaps between the queue-head and queue-tail indexes.
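
	As an illustration of what is meant by the gap, here is a
	sketch of the circular-queue arithmetic; the field names
	(q_sq_tail, q_cq_head, q_entries) are assumed from
	sys/dev/ic/nvmevar.h and may not match every tree exactly:

		#include <stdint.h>

		/*
		 * Commands submitted but not yet seen completed, for a
		 * circular queue of q_entries slots.  A persistently
		 * large value means completions are not being consumed.
		 */
		static inline uint32_t
		nvme_queue_gap(uint32_t q_sq_tail, uint32_t q_cq_head,
		    uint32_t q_entries)
		{
			return (q_sq_tail + q_entries - q_cq_head) % q_entries;
		}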

	For me, this is easily reproducible by running three copies
	of ``build.sh release'', each in its own tree, with all output
	files (in obj, destdir, tools, and release) directed to the
	same nvme.  All source directories are on the same nvme and
	are null-mounted read-only on top of the read-write
	directories.

	Once the hang occurs, the system is still usable, as long
	as you don't touch the busy nvme device.  I've been able to
	reproduce this on both GENERIC and custom kernel configs,
	and have successfully obtained crash dumps and run gdb(1)
	against the running kernel.

	Here are the portions of dmesg related to the nvmes (manual
	line-breaks inserted for readability).  The troublesome nvme
	is nvme1.
		...
		[     1.020867] nvme0 at pci2 dev 0 function 0:
			 vendor 144d product a804 (rev. 0x00)
		[     1.020867] nvme0: NVMe 1.2
		...
		[     1.020867] ld0 at nvme0 nsid 1
		[     1.020867] ld0: 476 GB, 62260 cyl, 255 head,
			63 sec, 512 bytes/sect x 1000215216 sectors
		...
		[     1.020867] nvme1 at pci5 dev 0 function 0:
			vendor 144d product a808 (rev. 0x00)
		[     1.020867] nvme1: NVMe 1.3
		...
		[     1.020867] ld1 at nvme1 nsid 1
		[     1.020867] ld1: 1863 GB, 243201 cyl, 255 head,
			63 sec, 512 bytes/sect x 3907029168 sectors
		...
		[     1.019791] nvme2 at pci6 dev 0 function 0:
			vendor 144d product a80a (rev. 0x00)
		[     1.019791] nvme2: NVMe 1.3
		...
		[     1.019791] ld2 at nvme2 nsid 1
		[     1.019791] ld2: 1863 GB, 243201 cyl, 255 head,
			63 sec, 512 bytes/sect x 3907029168 sectors
		...

		nvme0 is 512GB Samsung 960 PRO 
		nvme1 is 2TB   Samsung 970 EVO
		nvme2 is 2TB   Samsung 980 PRO

	In order to eliminate possible hardware problems, I moved
	everything from nvme1 (970 EVO) to nvme2 (980 PRO).  The
	problem still occurs, with the same symptoms as above.
>How-To-Repeat:
	See above
>Fix:
	Don't know, but maybe should be a blocker for -10 release?


