Hi,

The iwm driver has a single watchdog timer shared by all Tx queues. When that watchdog fires, the driver resets the interface, which results in loss of connectivity. The workaround for this issue is to run:

    ifconfig iwm0 down
    ifconfig iwm0 up

Here is the log dump from when this reset happens:

[ 185.741855] iwm0: autoconfiguration error: device timeout
[ 185.741855] iwm0: autoconfiguration error: dumping device error log
[ 185.741855] iwm0: autoconfiguration error: Start Error Log Dump:
[ 185.741855] iwm0: autoconfiguration error: Status: 0x63, count: 6
[ 185.741855] iwm0: autoconfiguration error: 000022CE | ADVANCED_SYSASSERT
[ 185.741855] iwm0: autoconfiguration error: 000002B3 | trm_hw_status0
[ 185.741855] iwm0: autoconfiguration error: 00000000 | trm_hw_status1
[ 185.741855] iwm0: autoconfiguration error: 0000E258 | branchlink2
[ 185.741855] iwm0: autoconfiguration error: 0002730C | interruptlink1
[ 185.741855] iwm0: autoconfiguration error: 00000000 | interruptlink2
[ 185.741855] iwm0: autoconfiguration error: 0000001C | data1
[ 185.741855] iwm0: autoconfiguration error: 03230000 | data2
[ 185.741855] iwm0: autoconfiguration error: DEADBEEF | data3
[ 185.741855] iwm0: autoconfiguration error: 9D401E06 | beacon time
[ 185.741855] iwm0: autoconfiguration error: 272AA1FD | tsf low
[ 185.741855] iwm0: autoconfiguration error: 0000002F | tsf hi
[ 185.741855] iwm0: autoconfiguration error: 00000000 | time gp1
[ 185.741855] iwm0: autoconfiguration error: 041C1813 | time gp2
[ 185.741855] iwm0: autoconfiguration error: 00000000 | uCode revision type
[ 185.741855] iwm0: autoconfiguration error: 00000016 | uCode version major
[ 185.741855] iwm0: autoconfiguration error: 00058404 | uCode version minor
[ 185.741855] iwm0: autoconfiguration error: 00000230 | hw version
[ 185.741855] iwm0: autoconfiguration error: 00009000 | board version
[ 185.741855] iwm0: autoconfiguration error: 0000001C | hcmd
[ 185.741855] iwm0: autoconfiguration error: 22F12000 | isr0
[ 185.741855] iwm0: autoconfiguration error: 00004000 | isr1
[ 185.741855] iwm0: autoconfiguration error: 00001802 | isr2
[ 185.741855] iwm0: autoconfiguration error: 40417DC1 | isr3
[ 185.741855] iwm0: autoconfiguration error: 00000000 | isr4
[ 185.741855] iwm0: autoconfiguration error: 10900112 | last cmd Id
[ 185.741855] iwm0: autoconfiguration error: 00000000 | wait_event
[ 185.741855] iwm0: autoconfiguration error: 00000080 | l2p_control
[ 185.741855] iwm0: autoconfiguration error: 00011C22 | l2p_duration
[ 185.741855] iwm0: autoconfiguration error: 0000003F | l2p_mhvalid
[ 185.741855] iwm0: autoconfiguration error: 000000CE | l2p_addr_match
[ 185.741855] iwm0: autoconfiguration error: 0000000D | lmpm_pmg_sel
[ 185.741855] iwm0: autoconfiguration error: 03071928 | timestamp
[ 185.741855] iwm0: autoconfiguration error: 15E4A0A0 | flow_handler
[ 185.741855] iwm0: autoconfiguration error: Start UMAC Error Log Dump:
[ 185.741855] iwm0: autoconfiguration error: Status: 0x63, count: 7
[ 185.741855] iwm0: autoconfiguration error: 0x00000070 | ADVANCED_SYSASSERT
[ 185.741855] iwm0: autoconfiguration error: 0x00000000 | umac branchlink1
[ 185.741855] iwm0: autoconfiguration error: 0xC0082F64 | umac branchlink2
[ 185.741855] iwm0: autoconfiguration error: 0xC0081000 | umac interruptlink1
[ 185.741855] iwm0: autoconfiguration error: 0xC0081000 | umac interruptlink2
[ 185.741855] iwm0: autoconfiguration error: 0x00000800 | umac data1
[ 185.741855] iwm0: autoconfiguration error: 0xC0081000 | umac data2
[ 185.741855] iwm0: autoconfiguration error: 0xDEADBEEF | umac data3
[ 185.741855] iwm0: autoconfiguration error: 0x00000016 | umac major
[ 185.741855] iwm0: autoconfiguration error: 0x00058404 | umac minor
[ 185.741855] iwm0: autoconfiguration error: 0xC0886280 | frame pointer
[ 185.741855] iwm0: autoconfiguration error: 0xC0886280 | stack pointer
[ 185.741855] iwm0: autoconfiguration error: 0x092B002C | last host cmd
[ 185.741855] iwm0: autoconfiguration error: 0x00000000 | isr status reg

The status register value 0x63 does not seem to indicate that anything is wrong with the device.

Linux and OpenBSD handle this issue by keeping one timer per Tx queue. The reasoning is that just because some Tx queues occasionally get stuck does not mean we should reset the interface. I don't know whether those stuck queues ever get unblocked, but with this patch I am no longer experiencing any loss of WiFi.

I modeled this patch on Stefan Sperling's patch in OpenBSD for the same issue:
https://www.mail-archive.com/tech%openbsd.org@localhost/msg66949.html

Please review.

Best,
Salil
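
For readers who have not looked at the attached diff, the per-Tx-queue timer idea mentioned above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code and not the contents of the patch: the names wd_softc, wd_tx_start, wd_tx_done and wd_tick, the queue count, and the timeout value are all hypothetical, and the model deliberately leaves open what the caller does when a queue's timer expires.

```c
#include <assert.h>

#define WD_NQUEUES    4   /* hypothetical number of Tx queues */
#define WD_TX_TIMEOUT 5   /* hypothetical timeout, in watchdog ticks */

/* Hypothetical stand-in for the driver softc: one countdown per Tx queue. */
struct wd_softc {
	int tx_timer[WD_NQUEUES];   /* 0 = disarmed, >0 = ticks remaining */
};

/* Arm the countdown for one queue when a frame is handed to it. */
static void
wd_tx_start(struct wd_softc *sc, int qid)
{
	assert(qid >= 0 && qid < WD_NQUEUES);
	sc->tx_timer[qid] = WD_TX_TIMEOUT;
}

/* Disarm only that queue's countdown when its frame completes. */
static void
wd_tx_done(struct wd_softc *sc, int qid)
{
	assert(qid >= 0 && qid < WD_NQUEUES);
	sc->tx_timer[qid] = 0;
}

/*
 * One watchdog tick. Each armed countdown is decremented on its own,
 * and the return value is a bitmask of the queues that expired on
 * this tick. The caller decides what to do about an expired queue.
 */
static unsigned
wd_tick(struct wd_softc *sc)
{
	unsigned expired = 0;

	for (int q = 0; q < WD_NQUEUES; q++) {
		if (sc->tx_timer[q] > 0 && --sc->tx_timer[q] == 0)
			expired |= 1u << q;
	}
	return expired;
}
```

In this model a completion on one queue disarms only that queue's countdown, so activity on the other queues neither hides nor triggers a timeout for it, and an expiry can be attributed to a specific queue.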
Attachment: iwm-patch.diff