NetBSD-Bugs archive

kern/55507: sometimes hdaudio panics on attach, possible memory corruption

>Number:         55507
>Category:       kern
>Synopsis:       sometimes hdaudio panics on attach, possible memory corruption
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jul 21 12:45:00 +0000 2020
>Originator:     Reinoud Zandijk
>Release:        NetBSD/amd64 9.99.68
NetBSD 9.99.68 NetBSD 9.99.68 (GENERIC) #0: Mon Jun 22 20:15:25 UTC 2020 amd64

While investigating PR#51734, which triggers at times on this machine, I rebooted the machine several times. Most of the time it boots fine; sometimes it produces the RIRB storm described in PR#51734, and occasionally I get this panic instead:

(transcribed from screen)

Stopped in pid 0.0 (system) at netbsd:breakpoint+0x5 (leave)
breakpoint() at netbsd:breakpoint+0x05
vpanic() at netbsd:vpanic+0x152
__x86_indirect_thunk_rax() at netbsd:__x86_indirect_thunk_rax
kmem_intr_alloc() at netbsd:kmem_intr_alloc+0xef
kmem_intr_zalloc() at netbsd:kmem_intr_zalloc+0x11
kmem_zalloc() at netbsd:kmem_zalloc+0x4a
hdaudio_attach() at netbsd:hdaudio_attach+0x65f
hdaudio_pci_attach() at netbsd:hdaudio_pci_attach+0x1b3
config_attach_loc() at netbsd:config_attach_loc+0x182
pci_probe_device() at netbsd:pci_probe_device+0x574
pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x1b7
pcirescan() at netbsd:pcirescan+0x4e
pciattach() at netbsd:pciattach+0x186
config_attach_loc() at netbsd:config_attach_loc+0x182
mp_pci_scan() at netbsd:mp_pci_scan+0xa4
amd64_mainbus_attach() at netbsd:amd64_mainbus_attach+0x237
mainbus_attach() at netbsd:mainbus_attach+0x83
config_attach_loc() at netbsd:config_attach_loc+0x182
cpu_configure() at netbsd:cpu_configure+0x38
main() at netbsd:main+0x2ec
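
For anyone triaging a hand-transcribed trace like the one above, a small script can pull out each frame's symbol and offset and point at the first driver frame. This is only a sketch: the helper names (`parse_frames`, `first_driver_frame`) and the `hdaudio` prefix filter are illustrative choices, and the sample holds just three frames copied from the trace.

```python
import re

# Matches ddb frames of the form "func() at netbsd:symbol+0xOFFSET".
# The "+0x..." part is optional because some frames have no offset.
FRAME_RE = re.compile(
    r"^(?P<func>\w+)\(\) at netbsd:(?P<sym>\w+)(?:\+(?P<off>0x[0-9a-fA-F]+))?"
)

def parse_frames(text):
    """Return a list of (function, symbol, offset) tuples, innermost first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            off = int(m.group("off"), 16) if m.group("off") else 0
            frames.append((m.group("func"), m.group("sym"), off))
    return frames

def first_driver_frame(frames, prefix="hdaudio"):
    """Return (symbol, offset) of the innermost frame whose symbol starts with prefix."""
    for _func, sym, off in frames:
        if sym.startswith(prefix):
            return sym, off
    return None

# Three frames copied from the trace in this report.
SAMPLE_TRACE = """\
kmem_zalloc() at netbsd:kmem_zalloc+0x4a
hdaudio_attach() at netbsd:hdaudio_attach+0x65f
hdaudio_pci_attach() at netbsd:hdaudio_pci_attach+0x1b3
"""

frames = parse_frames(SAMPLE_TRACE)
driver_frame = first_driver_frame(frames)
```

If a kernel image with debug symbols for exactly this build is available, the resulting symbol and offset (here `hdaudio_attach+0x65f`) could then be looked up in gdb, for example with `list *(hdaudio_attach+0x65f)`, to find the allocation that panicked.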

As far as I could see, the panic happened while attaching hdaudio0, which is also where the RIRB timeouts occur.

On a regular boot, the relevant parts of the dmesg show:

[     1.001441] pci0 at mainbus0 bus 0: configuration mode 1
[     1.001441] pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
[     1.001441] pchb0 at pci0 dev 0 function 0: Intel Core 4G (mobile) Host Bridge, DRAM (rev. 0x0b)
[     1.001441] i915drmkms0 at pci0 dev 2 function 0: Intel HD Graphics (GT1) (rev. 0x0b)
[     1.001441] hdaudio0 at pci0 dev 3 function 0: HD Audio Controller
[     1.001441] hdaudio0: interrupting at msi0 vec 0
[     1.001441] hdaudio0: HDA ver. 1.0, OSS 2, ISS 0, BSS 0, SDO 1, 64-bit
[     1.001441] hdaudio0: autoconfiguration error: RIRB timeout
[     1.001441] hdaudio0: autoconfiguration error: RIRB timeout
[     1.001441] xhci0 at pci0 dev 20 function 0: Intel Core 4G (mobile) USB xHCI (rev. 0x04)
[     1.001441] hdaudio1 at pci0 dev 27 function 0: HD Audio Controller
[     1.001441] hdaudio1: interrupting at msi2 vec 0
[     1.001441] hdaudio1: HDA ver. 1.0, OSS 4, ISS 4, BSS 0, SDO 1, 64-bit
[     1.001441] hdafg0 at hdaudio1: vendor 10ec product 0270
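
When comparing dmesg output across many reboots, it may help to count the autoconfiguration errors per device automatically. The following is a minimal sketch, assuming each boot's dmesg has been saved as text; the function name `count_autoconf_errors` is hypothetical, and the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches lines such as
# "[     1.001441] hdaudio0: autoconfiguration error: RIRB timeout"
ERROR_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<dev>\w+): autoconfiguration error: (?P<msg>.*)$"
)

def count_autoconf_errors(dmesg):
    """Count autoconfiguration errors, keyed by (device, message)."""
    counts = Counter()
    for line in dmesg.splitlines():
        m = ERROR_RE.match(line.strip())
        if m:
            counts[(m.group("dev"), m.group("msg"))] += 1
    return counts

# Lines copied from the dmesg excerpt in this report.
SAMPLE_DMESG = """\
[     1.001441] hdaudio0 at pci0 dev 3 function 0: HD Audio Controller
[     1.001441] hdaudio0: interrupting at msi0 vec 0
[     1.001441] hdaudio0: autoconfiguration error: RIRB timeout
[     1.001441] hdaudio0: autoconfiguration error: RIRB timeout
[     1.001441] hdaudio1 at pci0 dev 27 function 0: HD Audio Controller
[     1.001441] hdaudio1: interrupting at msi2 vec 0
"""

error_counts = count_autoconf_errors(SAMPLE_DMESG)
```

Running this over the saved dmesg of each boot would show whether the RIRB timeouts are always confined to hdaudio0, as they are in the excerpt.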

It was suggested to me that earlier memory corruption might be what triggers the panic. Is some memory not being zeroed that ought to be?

Note that my monitor is connected through an HDMI to DVI-D converter cable and thus has no audio output. All audio through audio0 @ hdafg0 works well.

To reproduce, reboot the machine several times in a row. It will boot fine most of the time, but on rare occasions it hits this panic. I am not sure whether it is specific to this machine.

The only way to recover is to power cycle the machine. It prompts about dumping the panic, but the keyboard is not working at that point, so with a default GENERIC kernel there is no way to reboot it remotely.
