NetBSD-Bugs archive


kern/54532: EPYC 7401P panics with page fault trap, code=0



>Number:         54532
>Category:       kern
>Synopsis:       EPYC 7401P panics with page fault trap, code=0
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Sep 08 01:10:01 +0000 2019
>Originator:     Alexander Nasonov
>Release:        NetBSD 8.1
>Organization:
	XMM SWAP LTD
>Environment:
NetBSD 8.1 (GENERIC) #0: Fri May 31 08:43:59 UTC 2019
	mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/amd64/compile/GENERIC
Architecture: x86_64
Machine: amd64
>Description:
I tried installing amd64 8.1, 9.0_BETA and -current on Scaleway's bare metal
dedibox (AMD EPYC 7401P, 24 cores, 256G RAM, 3 NVMe disks) but they all
panicked in a similar way.

The output below was typed by hand and may contain typos. I recorded a video
of the boot and can send missing pieces or check specific lines against it.

NetBSD/x86 EFI Boot (x64), Revision 1.0 (Fri May 31 08:43:59 UTC 2019) (from
NetBSD 8.1)
>> Memory: 640/1946304 k

...

NetBSD 8.1 (GENERIC) #0: Fri May 31 08:43:59 UTC 2019
	mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/amd64/compile/GENERIC
total memory = 255 GB
avail memory = 246 GB
cpu_rnd: RDSEED
running cgd selftest aes-xts-256 aes-xts-512 done
efi: systbl at pa dab5d018
mainbus0 (root)
...
... ACPI table is available on request
...
ioapic0 at mainbus0 apid 128
ioapic0: can't remap to apid 128
ioapic1 at mainbus0 apid 129
ioapic1: can't remap to apid 129
ioapic2 at mainbus0 apid 130
ioapic2: can't remap to apid 130
ioapic3 at mainbus0 apid 131
ioapic3: can't remap to apid 131
ioapic4 at mainbus0 apid 132
ioapic4: can't remap to apid 132
cpu0 at mainbus0 apid 0
cpu0: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu0: package 0, core 0, smt 0
cpu1 at mainbus0 apid 2
cpu1: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu1: package 0, core 1, smt 0
cpu2 at mainbus0 apid 4
cpu2: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu2: package 0, core 2, smt 0
cpu3 at mainbus0 apid 8
cpu3: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu3: package 0, core 4, smt 0
cpu4 at mainbus0 apid 10
cpu4: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu4: package 0, core 5, smt 0
cpu5 at mainbus0 apid 12
cpu5: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu5: package 0, core 6, smt 0
cpu6 at mainbus0 apid 16
cpu6: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu6: package 0, core 8, smt 0
...
... more cpu info is available on request
...
cpu47 at mainbus0 apid 61
cpu47: AMD EPYC 7401P 24-Core Processor		, id 0x800f12
cpu47: package 0, core 30, smt 1
acpi0 at mainbus0: Intel ACPICA 20170303
hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
AMDN (PNP0C01) at acpi0 not configured
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
spkr0 at pcppi1: PC Speaker
midi0 at pcppi1: PC Speaker
sysbeep0 at pcppi1
SI01 (PNP0C02) at acpi0 not configured
UAR2 (PNP0501) at acpi0 not configured
SPM1 (IPI0001) at acpi0 not configured
GPIO (AMDI0030) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
ACPI: Enabled 1 GPEs in block 00 to 1F
attimer1: attached to pcppi1
ipmi0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1
amdsmn0 at pci0 dev 0 function 0: AMD Family 17h System Management Network
amdzentemp0 at amdsmn0: AMD Cpu Temperature Sensor (Family17h)
vendor 1022 product 1451 (IOMMU system) at pci0 dev 0 function 2 not configured
pchb0 at pci0 dev 1 function 0: vendor 1022 product 1452 (rev. 0x00)
ppb0 at pci0 dev 1 function 1: vendor 1022 product 1453 (rev. 0x00)
ppb0: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x8 @ 8.0GT/s
ppb0: link is x8 @ 5.0GT/s
pci1 at ppb0 bus 1
ixg0 at pci1 dev 0 function 0: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 4.0.1-k
ixg0: clearing prefetchable bit
ixg0: device X550
ixg0: NVM Image Version 1.55, PHY FW Revision 2.0b ID 0x9, NVM Map version 1.55, ETrackID 800007a4
ixg0: for TX/RX, interrupting at msix0 vec 0, bound queue 0 to cpu 0
ixg0: for TX/RX, interrupting at msix0 vec 1, bound queue 1 to cpu 1
ixg0: for TX/RX, interrupting at msix0 vec 2, bound queue 2 to cpu 2
ixg0: for TX/RX, interrupting at msix0 vec 3, bound queue 3 to cpu 3
ixg0: for TX/RX, interrupting at msix0 vec 4, bound queue 4 to cpu 4
ixg0: for TX/RX, interrupting at msix0 vec 5, bound queue 5 to cpu 5
ixg0: for TX/RX, interrupting at msix0 vec 6, bound queue 6 to cpu 6
ixg0: for TX/RX, interrupting at msix0 vec 7, bound queue 7 to cpu 7
ixg0: for TX/RX, interrupting at msix0 vec 8, bound queue 8 to cpu 8
ixg0: for TX/RX, interrupting at msix0 vec 9, bound queue 9 to cpu 9
ixg0: for TX/RX, interrupting at msix0 vec 10, bound queue 10 to cpu 10
ixg0: for TX/RX, interrupting at msix0 vec 11, bound queue 11 to cpu 11
ixg0: for TX/RX, interrupting at msix0 vec 12, bound queue 12 to cpu 12
ixg0: for TX/RX, interrupting at msix0 vec 13, bound queue 13 to cpu 13
ixg0: for TX/RX, interrupting at msix0 vec 14, bound queue 14 to cpu 14
ixg0: for TX/RX, interrupting at msix0 vec 15, bound queue 15 to cpu 15
ixg0: for TX/RX, interrupting at msix0 vec 16, bound queue 16 to cpu 16
ixg0: for TX/RX, interrupting at msix0 vec 17, bound queue 17 to cpu 17
ixg0: for TX/RX, interrupting at msix0 vec 18, bound queue 18 to cpu 18
ixg0: for TX/RX, interrupting at msix0 vec 19, bound queue 19 to cpu 19
ixg0: for TX/RX, interrupting at msix0 vec 20, bound queue 20 to cpu 20
ixg0: for TX/RX, interrupting at msix0 vec 21, bound queue 21 to cpu 21
ixg0: for TX/RX, interrupting at msix0 vec 22, bound queue 22 to cpu 22
ixg0: for TX/RX, interrupting at msix0 vec 23, bound queue 23 to cpu 23
ixg0: for TX/RX, interrupting at msix0 vec 24, bound queue 24 to cpu 24
ixg0: for TX/RX, interrupting at msix0 vec 25, bound queue 25 to cpu 25
ixg0: for TX/RX, interrupting at msix0 vec 26, bound queue 26 to cpu 26
ixg0: for TX/RX, interrupting at msix0 vec 27, bound queue 27 to cpu 27
ixg0: for TX/RX, interrupting at msix0 vec 28, bound queue 28 to cpu 28
ixg0: for TX/RX, interrupting at msix0 vec 29, bound queue 29 to cpu 29
ixg0: for TX/RX, interrupting at msix0 vec 30, bound queue 30 to cpu 30
ixg0: for TX/RX, interrupting at msix0 vec 31, bound queue 31 to cpu 31
ixg0: for TX/RX, interrupting at msix0 vec 32, bound queue 32 to cpu 32
ixg0: for TX/RX, interrupting at msix0 vec 33, bound queue 33 to cpu 33
ixg0: for TX/RX, interrupting at msix0 vec 34, bound queue 34 to cpu 34
ixg0: for TX/RX, interrupting at msix0 vec 35, bound queue 35 to cpu 35
ixg0: for TX/RX, interrupting at msix0 vec 36, bound queue 36 to cpu 36
ixg0: for TX/RX, interrupting at msix0 vec 37, bound queue 37 to cpu 37
ixg0: for TX/RX, interrupting at msix0 vec 38, bound queue 38 to cpu 38
ixg0: for TX/RX, interrupting at msix0 vec 39, bound queue 39 to cpu 39
ixg0: for TX/RX, interrupting at msix0 vec 40, bound queue 40 to cpu 40
ixg0: for TX/RX, interrupting at msix0 vec 41, bound queue 41 to cpu 41
ixg0: for TX/RX, interrupting at msix0 vec 42, bound queue 42 to cpu 42
ixg0: for TX/RX, interrupting at msix0 vec 43, bound queue 43 to cpu 43
ixg0: for TX/RX, interrupting at msix0 vec 44, bound queue 44 to cpu 44
ixg0: for TX/RX, interrupting at msix0 vec 45, bound queue 45 to cpu 45
ixg0: for TX/RX, interrupting at msix0 vec 46, bound queue 46 to cpu 46
ixg0: for TX/RX, interrupting at msix0 vec 47, bound queue 47 to cpu 47
ixg0: for link, interrupting at msix0 vec 48, affinity to cpu0
ixg0: Using MSI-X interrupts with 49 vectors
ixg0: PHY OUI 0x00aa00, model 0x0022, rev. 0
ixg0: PCI Express Bus: Speed 5.0GT/s Width x8
ixg1 at pci1 dev 0 function 1: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 4.0.1-k
...
... More available upon request
...
pchb1 at pci0 dev 2 function 0: vendor 1022 product 1452 (rev. 0x00)
pchb2 at pci0 dev 3 function 0: vendor 1022 product 1452 (rev. 0x00)
pchb3 at pci0 dev 4 function 0: vendor 1022 product 1452 (rev. 0x00)
pchb4 at pci0 dev 7 function 0: vendor 1022 product 1452 (rev. 0x00)
ppb1 at pci0 dev 7 function 1: vendor 1022 product 1454 (rev. 0x00)
ppb1: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x16 @ 8.0GT/s
pci2 at ppb1 bus 3
vendor 1022 product 145a (non-essential instrumentation, subclass 0x00) at pci2 dev 0 function 0 not configured
vendor 1022 product 145a (non-essential instrumentation, subclass 0x00) at pci5 dev 0 function 0 not configured
vendor 1022 product 1456 (miscellaneous crypto) at pci5 dev 0 function 2 not configured
xhci1 at pci5 dev 0 function 3: vendor 1022 product 145f (rev. 0x00)
xhci1: interrupting at msi3 vec 0
usb2 at xhci1: USB revision 3.0
usb3 at xhci1: USB revision 2.0
pchb43 at pci4 dev 8 function 0: vendor 1022 product 1452 (rev. 0x00)
ppb4 at pci4 dev 8 function 1: vendor 1022 product 1454 (rev. 0x00)
ppb4: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x16 @ 8.0GT/s
pci6 at ppb4 bus 34
vendor 1022 product 1454 (non-essential instrumentation, subclass 0x00) at pci6 dev 0 function 0 not configured
vendor 1022 product 1458 (miscellaneous crypto) at pci6 dev 0 function 1 not configured
pci7 at mainbus0 bus 64
amdsmn2 at pci7 dev 0 function 0: AMD Family 17h System Management Network
amdzentemp2 at amdsmn2: AMD CPU Temperature Sensors (Family17h)
vendor 1022 product 1451 (IOMMU system) at pci7 dev 0 function 2 not configured
pchb44 at pci7 dev 1 function 0: vendor 1022 product 1452 (rev. 0x00)
pchb45 at pci7 dev 2 function 0: vendor 1022 product 1452 (rev. 0x00)
pchb46 at pci7 dev 3 function 0: vendor 1022 product 1452 (rev. 0x00)
ppb5 at pci7 dev 3 function 1: vendor 1022 product 1453 (rev. 0x00)
ppb5: PCI Express capability version 2 <Root Port of PCI-E Root Complex> x4 @ 8.0GT/s
pci8 at ppb5 bus 65
nvme0 at pci8 dev 0 function 0: vendor 144d product a808 (rev. 0x00)
nvme0: NVMe 1.2
nvme0: for admin queue interrupting at msix4 vec 0
nvme0: SAMSUNG MZVLB1T0HALR-00000, firmware EXA7201Q, serial .....
nvme0: for io queue 1 interrupting at msix4 vec 4 affinity to cpu0
nvme0: for io queue 2 interrupting at msix4 vec 5 affinity to cpu1
nvme0: for io queue 3 interrupting at msix4 vec 6 affinity to cpu2
nvme0: for io queue 4 interrupting at msix4 vec 7 affinity to cpu3
nvme0: for io queue 5 interrupting at msix4 vec 8 affinity to cpu4
nvme0: for io queue 6 interrupting at msix4 vec 9 affinity to cpu5
nvme0: for io queue 7 interrupting at msix4 vec 10 affinity to cpu6
nvme0: for io queue 8 interrupting at msix4 vec 11 affinity to cpu7
nvme0: for io queue 9 interrupting at msix4 vec 12 affinity to cpu8
nvme0: for io queue 10 interrupting at msix4 vec 13 affinity to cpu9
nvme0: for io queue 11 interrupting at msix4 vec 14 affinity to cpu10
nvme0: for io queue 12 interrupting at msix4 vec 15 affinity to cpu11
nvme0: for io queue 13 interrupting at msix4 vec 16 affinity to cpu12
nvme0: for io queue 14 interrupting at msix4 vec 17 affinity to cpu13
nvme0: for io queue 15 interrupting at msix4 vec 18 affinity to cpu14
nvme0: for io queue 16 interrupting at msix4 vec 19 affinity to cpu15
nvme0: for io queue 17 interrupting at msix4 vec 20 affinity to cpu16
nvme0: for io queue 18 interrupting at msix4 vec 21 affinity to cpu17
nvme0: for io queue 19 interrupting at msix4 vec 22 affinity to cpu18
nvme0: for io queue 20 interrupting at msix4 vec 23 affinity to cpu19
nvme0: for io queue 21 interrupting at msix4 vec 24 affinity to cpu20
nvme0: for io queue 22 interrupting at msix4 vec 25 affinity to cpu21
nvme0: for io queue 23 interrupting at msix4 vec 26 affinity to cpu22
nvme0: for io queue 24 interrupting at msix4 vec 27 affinity to cpu23
nvme0: for io queue 25 interrupting at msix4 vec 28 affinity to cpu24
nvme0: for io queue 26 interrupting at msix4 vec 29 affinity to cpu25
nvme0: for io queue 27 interrupting at msix4 vec 30 affinity to cpu26
nvme0: for io queue 28 interrupting at msix4 vec 31 affinity to cpu27
nvme0: for io queue 29 interrupting at msix4 vec 32 affinity to cpu28
nvme0: for io queue 30 interrupting at msix4 vec 33 affinity to cpu29
nvme0: for io queue 31 interrupting at msix4 vec 34 affinity to cpu30
uvm_fault(0xffffffff815b1a00, 0x0, 4) -> e
fatal page fault in supervisor mode
trap type 6 code 0x10 rip 0 cs 0x8 rflags 0x10202 cr2 0 ilevel 0x8 rsp 0xffffffff817fb3a8
curlwp 0xffffffff81482060 pid 0.1 lowest kstack 0xffffffff817f82c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at 0:uvm_fault(0xffffffff815b1a00, 0x7fbfc0000000, 1) -> e
page fault in supervisor mode
trap type 6 code 0 rip 0xffffffff802207c7 cs 0x8 rflags 0x10216 cr2 0x7fbfc0000000 ilevel 0x8 rsp 0xffffffff817fafc0
curlwp 0xffffffff81482060 pid 0.1 lowest kstack 0xffffffff817f82c0
	kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at netbsd:db_disasm+0x65:	testb	$0x1,0(%rcx,%rcx,0)
db{0}>
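
Note for readers of the two trap lines above: on x86, the "code" value
printed with a page fault trap is the hardware page-fault error code, a
small bit field. The following is a minimal sketch for decoding it, not part
of the original report. The bit layout follows the architecture manuals; the
helper name decode_pf_code is made up for illustration. Read this way, code
0x10 with rip 0 and cr2 0 would be a supervisor-mode instruction fetch from
address 0, which is what a call through a null function pointer looks like.
Code 0 on the second fault would be a supervisor-mode read of an unmapped
page at the cr2 address. This is an interpretation of the typed log, not a
confirmed diagnosis.

```python
# Minimal sketch: decode an x86 page-fault error code into readable flags.
# decode_pf_code is a hypothetical helper, not a NetBSD or ddb function.

# (mask, meaning when the bit is set, meaning when it is clear or None)
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write", "not a write"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_code(code):
    """Return a list of human-readable flags for a page-fault error code."""
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


if __name__ == "__main__":
    # The two codes that appear in the trap lines of this report.
    for code in (0x10, 0x0):
        print(hex(code), "->", ", ".join(decode_pf_code(code)))
```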
>How-To-Repeat:
Buy an HM-BM1-M server from Scaleway, download boot.iso from cdn.NetBSD.org,
mount it from the KVM console and boot it.
>Fix:
Not known.


