NetBSD-Bugs archive


kern/54592: uhub error when nvme is using msi(x) interrupts



>Number:         54592
>Category:       kern
>Synopsis:       uhub error when nvme is using msi(x) interrupts
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct 02 10:55:00 +0000 2019
>Originator:     Thomas Klausner
>Release:        NetBSD 9.99.15
>Organization:
Curiosity is the very basis of education and if you tell me that
curiosity killed the cat, I say only that the cat died nobly.
- Arnold Edinborough
>Environment:
	
	
Architecture: x86_64
Machine: amd64
>Description:
On this system:
[     1.000746] xhci2 at pci13 dev 0 function 3: vendor 1022 product 145f (rev. 0x00)
[     1.000746] allocated pic msix8 type edge pin 0 level 6 to cpu1 slot 2 idt entry 140
[     1.000746] xhci2: interrupting at msix8 vec 0
[     1.000746] xhci2: xHCI version 1.0
[     1.000746] usb4 at xhci2: USB revision 3.0
[     1.000746] usb5 at xhci2: USB revision 2.0
[     2.549742] uhub5 at usb5: NetBSD (0000) xHCI root hub (0000), class 9/0, rev 2.00/1.00, addr 0
[     2.549742] uhub5: 4 ports with 4 removable, self powered

I see
[     9.153939] uhub5: autoconfiguration error: device problem, disabling port 2
[    15.047687] uhub5: autoconfiguration error: device problem, disabling port 3

which disables the USB (console) keyboard and mouse.

This diff to the nvme driver:

Index: nvme_pci.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/nvme_pci.c,v
retrieving revision 1.26
diff -u -r1.26 nvme_pci.c
--- nvme_pci.c  23 Jan 2019 06:56:19 -0000      1.26
+++ nvme_pci.c  10 Jun 2019 08:18:33 -0000
@@ -64,7 +64,7 @@
 #include <dev/ic/nvmereg.h>
 #include <dev/ic/nvmevar.h>

-int nvme_pci_force_intx = 0;
+int nvme_pci_force_intx = 1;
 int nvme_pci_mpsafe = 1;
 int nvme_pci_mq = 1;           /* INTx: ioq=1, MSI/MSI-X: ioq=ncpu */

which I needed for some months to make nvme work, also makes uhub5 work(!?).
The diff is no longer necessary for nvme since this commit:

Author: nonaka <nonaka%NetBSD.org@localhost>
Date:   Fri Sep 20 05:32:42 2019 +0000

    Don't set Phase Tag bit of Completion Queue entry at nvme_poll_done().

    A new completion queue entry check incorrectly determined that there was
    a Completion Queue entry for a command that was not submitted.

    Fix PR kern/54275, PR kern/54503, PR kern/54532.

which made the uhub issue visible.

I've backported the diff to a July 19 (8.99.51) kernel, and uhub already failed there.
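
For readers unfamiliar with the tunable in the diff above, here is a rough
userland sketch of what forcing INTx amounts to. It is not the driver code:
the names intr_type, dev_caps, pick_intr_type and pick_ioq_count are
hypothetical and exist only for illustration. The preference order
MSI-X, then MSI, then INTx is an assumption. The queue-count rule is taken
from the comment in the diff ("INTx: ioq=1, MSI/MSI-X: ioq=ncpu") and assumes
multi-queue mode is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland model; these names are NOT the NetBSD kernel API. */
enum intr_type { INTR_INTX = 0, INTR_MSI = 1, INTR_MSIX = 2 };

struct dev_caps {
	bool has_msi;	/* device advertises MSI */
	bool has_msix;	/* device advertises MSI-X */
};

/*
 * Pick the interrupt type a driver would use.
 * force_intx plays the role of nvme_pci_force_intx in the diff:
 * when nonzero, MSI and MSI-X are never considered.
 * The MSI-X > MSI > INTx preference is an assumption of this sketch.
 */
static enum intr_type
pick_intr_type(int force_intx, const struct dev_caps *caps)
{
	assert(caps != NULL);
	if (force_intx)
		return INTR_INTX;
	if (caps->has_msix)
		return INTR_MSIX;
	if (caps->has_msi)
		return INTR_MSI;
	return INTR_INTX;
}

/*
 * Number of I/O queues, following the comment in the diff:
 * "INTx: ioq=1, MSI/MSI-X: ioq=ncpu".
 */
static int
pick_ioq_count(enum intr_type type, int ncpu)
{
	assert(ncpu > 0);
	return (type == INTR_INTX) ? 1 : ncpu;
}
```

Under these assumptions, a 32-CPU machine with an MSI-X capable device gets
32 I/O queues, which is consistent with the nvme dmesg output in this report,
while force_intx = 1 collapses that to a single queue on a legacy interrupt.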

Here's the dmesg output for nvme:

[     1.000746] nvme0 at pci8 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
[     1.000746] nvme0: NVMe 1.3
[     1.000746] nvme0: for admin queue interrupting at msix3 vec 0
[     1.000746] nvme0: Samsung SSD 970 EVO Plus 1TB, firmware 1B2QEXM7, serial S4EWNF0M404219L
[     1.000746] nvme0: for io queue 1 interrupting at msix3 vec 1 affinity to cpu0
[     1.000746] nvme0: for io queue 2 interrupting at msix3 vec 2 affinity to cpu1
[     1.000746] nvme0: for io queue 3 interrupting at msix3 vec 3 affinity to cpu2
[     1.000746] nvme0: for io queue 4 interrupting at msix3 vec 4 affinity to cpu3
[     1.000746] nvme0: for io queue 5 interrupting at msix3 vec 5 affinity to cpu4
[     1.000746] nvme0: for io queue 6 interrupting at msix3 vec 6 affinity to cpu5
[     1.000746] nvme0: for io queue 7 interrupting at msix3 vec 7 affinity to cpu6
[     1.000746] nvme0: for io queue 8 interrupting at msix3 vec 8 affinity to cpu7
[     1.000746] nvme0: for io queue 9 interrupting at msix3 vec 9 affinity to cpu8
[     1.000746] nvme0: for io queue 10 interrupting at msix3 vec 10 affinity to cpu9
[     1.000746] nvme0: for io queue 11 interrupting at msix3 vec 11 affinity to cpu10
[     1.000746] nvme0: for io queue 12 interrupting at msix3 vec 12 affinity to cpu11
[     1.000746] nvme0: for io queue 13 interrupting at msix3 vec 13 affinity to cpu12
[     1.000746] nvme0: for io queue 14 interrupting at msix3 vec 14 affinity to cpu13
[     1.000746] nvme0: for io queue 15 interrupting at msix3 vec 15 affinity to cpu14
[     1.000746] nvme0: for io queue 16 interrupting at msix3 vec 16 affinity to cpu15
[     1.000746] nvme0: for io queue 17 interrupting at msix3 vec 17 affinity to cpu16
[     1.000746] nvme0: for io queue 18 interrupting at msix3 vec 18 affinity to cpu17
[     1.000746] nvme0: for io queue 19 interrupting at msix3 vec 19 affinity to cpu18
[     1.000746] nvme0: for io queue 20 interrupting at msix3 vec 20 affinity to cpu19
[     1.000746] nvme0: for io queue 21 interrupting at msix3 vec 21 affinity to cpu20
[     1.000746] nvme0: for io queue 22 interrupting at msix3 vec 22 affinity to cpu21
[     1.000746] nvme0: for io queue 23 interrupting at msix3 vec 23 affinity to cpu22
[     1.000746] nvme0: for io queue 24 interrupting at msix3 vec 24 affinity to cpu23
[     1.000746] nvme0: for io queue 25 interrupting at msix3 vec 25 affinity to cpu24
[     1.000746] nvme0: for io queue 26 interrupting at msix3 vec 26 affinity to cpu25
[     1.000746] nvme0: for io queue 27 interrupting at msix3 vec 27 affinity to cpu26
[     1.000746] nvme0: for io queue 28 interrupting at msix3 vec 28 affinity to cpu27
[     1.000746] nvme0: for io queue 29 interrupting at msix3 vec 29 affinity to cpu28
[     1.000746] nvme0: for io queue 30 interrupting at msix3 vec 30 affinity to cpu29
[     1.000746] nvme0: for io queue 31 interrupting at msix3 vec 31 affinity to cpu30
[     1.000746] nvme0: for io queue 32 interrupting at msix3 vec 32 affinity to cpu31
[     1.000746] ld0 at nvme0 nsid 1

>How-To-Repeat:
Boot the system, try pressing a key, look at the logs.
>Fix:
It has been suggested that this is a bug in the MSI(X) code.

>Unformatted:
 	
 	

