NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/57965: urtwn(4) command queue can overflow and deadlock



The following reply was made to PR kern/57965; it has been noted by GNATS.

From: Ramiro Aceves <ea1abz%gmail.com@localhost>
To: gnats-bugs%netbsd.org@localhost, Taylor R Campbell <riastradh%netbsd.org@localhost>
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/57965: urtwn(4) command queue can overflow and deadlock
Date: Thu, 29 Feb 2024 13:05:09 +0100

 El lun, 26 feb 2024 a las 20:50, <campbell+netbsd%mumble.net@localhost> escribi=C3=B3=
 :
 >
 > >Number:         57965
 > >Category:       kern
 > >Synopsis:       urtwn(4) command queue can overflow and deadlock
 > >Confidential:   no
 > >Severity:       serious
 > >Priority:       medium
 > >Responsible:    kern-bug-people
 > >State:          open
 > >Class:          sw-bug
 > >Submitter-Id:   net
 > >Arrival-Date:   Mon Feb 26 19:50:00 +0000 2024
 > >Originator:     Taylor R Campbell
 > >Release:        current, 10, 9, 8
 > >Organization:
 > The NetURTWN Foundation
 > >Environment:
 > >Description:
 > urtwn_do_async does nothing to prevent the command queue from overflowing=
 , which leads to ring->queued exceeding the number that the task will ever =
 decrement back to zero:
 >
 >         mutex_spin_enter(&sc->sc_task_mtx);
 >         cmd =3D &ring->cmd[ring->cur];
 >         cmd->cb =3D cb;
 >         KASSERT(len <=3D sizeof(cmd->data));
 >         memcpy(cmd->data, arg, len);
 >         ring->cur =3D (ring->cur + 1) % URTWN_HOST_CMD_RING_COUNT;
 >
 >         /* If there is no pending command already, schedule a task. */
 >         if (!sc->sc_dying && ++ring->queued =3D=3D 1) {
 >                 mutex_spin_exit(&sc->sc_task_mtx);
 >                 usb_add_task(sc->sc_udev, &sc->sc_task, USB_TASKQ_DRIVER)=
 ;
 >         } else
 >                 mutex_spin_exit(&sc->sc_task_mtx);
 >         splx(s);
 >
 > When this happens, urtwn_wait_async hangs forever, so you can't bring the=
  interface down or do anything else with the interface once you make the mi=
 stake of trying.
 > >How-To-Repeat:
 > exist in a wifi environment that provokes a bunch of state changes (and w=
 hatever else) faster than the kernel can process them
 > >Fix:
 > (a) stop-gap measure: if ring->queued is already URTWN_HOST_CMD_RING_COUN=
 T, drop the oldest queued command and don't increment ring->queued (but do =
 print a message about overflow)
 >
 > (b) real fix: stop using a command ring and just use three separate tasks=
 , since two commands of the same type are redundant and can be compressed i=
 nto one; that way there's no overflow condition
 >
 
 Hello Taylor,
 
 I do not know if this could be related to the problem you describe
 with urtwn driver but just in case it is useful I tell you my
 findings.
 My RaspberryPi Zero W is using now the urtwn driver with a USB WIFI
 adaptor because of bwfm driver does not function well. I made a
 service that checks www.google.com and reboots the system if there is
 no ping,  checking 3 times during  5 min intervals.
 
 With urtwn, network worked fine but router automatic changing channels
 leaded to complete network loss, needing reboot.
 
 I then changed the router settings to fixed 11th channel and the
 system was running very fine for 12 days, but I let the rebooting
 script for recovering the network if everything failed.
 
 Tonight and after 12 days and 17:42 hours, the script restarted the system:
 
 
 Thu Feb 29 01:32:11 CET 2024
  1:32AM  up 12 days, 17:42, 0 users, load averages: 0.05, 0.01, 0.00
 
 After running 46 minutes, restarted again:
 
 reboot at:
 Thu Feb 29 02:18:00 CET 2024
  2:18AM  up 46 mins, 0 users, load averages: 0.00, 0.00, 0.00
 
 After running 2:01 hours, restarted again:
 
 reboot at:
 Thu Feb 29 04:19:08 CET 2024
  4:19AM  up  2:01, 0 users, load averages: 0.99, 0.56, 0.25
 
 
 After running 1:01 hours, restarted again:
 
 Thu Feb 29 05:20:08 CET 2024
  5:20AM  up  1:01, 0 users, load averages: 0.08, 0.05, 0.02
 
 Now it is running fine for 7 hours:
 
 netbsd-raspa# uptime
 12:55PM  up  7:35, 2 users, load averages: 0.00, 0.00, 0.00
 
 
 
 netbsd-raspa# uname -a
 NetBSD netbsd-raspa 10.0_RC3 NetBSD 10.0_RC3 (RPI) #0: Tue Jan 16
 08:28:51 UTC 2024
 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/evbarm/compile/RPI evbarm
 netbsd-raspa#
 
 There is nothing useful in NetBSD  /var/log/messages and /var
 directory, but I have a Debian GNU/Linux Raspberry Zero W also running
 and recorded this interesting wpa_supplicant events and the times of
 the automatic script reboots:
 
 
 feb 24 03:10:03 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCORE type=3DWORLD
 feb 24 03:10:03 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DUSER type=3DCOUNTRY alpha2=3DGB
 feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 Associated with 60:8d:26:32:34:24
 feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=3D0
 id_str=3D]
 feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-SUBNET-STATUS-UPDATE status=3D0
 feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCOUNTRY_IE type=3DCOUNTRY alpha2=3DES
 feb 29 01:16:06 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-DISCONNECTED bssid=3D60:8d:26:32:34:24 reason=3D0
 locally_generated=3D1
 feb 29 01:16:06 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCORE type=3DWORLD
 feb 29 01:16:06 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DUSER type=3DCOUNTRY alpha2=3DGB
 feb 29 01:16:18 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:23 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:29 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:35 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:40 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:46 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:52 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:16:57 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:03 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:09 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:14 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:20 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:25 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:31 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:37 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:42 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:48 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:54 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:17:59 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:05 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:10 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:16 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:22 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:27 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:33 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:39 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:44 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:50 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:18:55 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:01 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:07 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:12 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:18 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:24 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:29 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:35 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
 to initiate sched scan
 feb 29 01:19:41 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
 to associate with SSID 'MiFibra-3422'
 feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 Associated with 60:8d:26:32:34:24
 feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=3D0
 id_str=3D]
 feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-SUBNET-STATUS-UPDATE status=3D0
 feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCOUNTRY_IE type=3DCOUNTRY alpha2=3DES
 feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-DISCONNECTED bssid=3D60:8d:26:32:34:24 reason=3D7
 feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
 to associate with SSID 'MiFibra-3422'
 feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCORE type=3DWORLD
 feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DUSER type=3DCOUNTRY alpha2=3DGB
 feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 Associated with 60:8d:26:32:34:24
 feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=3D0
 id_str=3D]
 feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-SUBNET-STATUS-UPDATE status=3D0
 feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCOUNTRY_IE type=3DCOUNTRY alpha2=3DES
 feb 29 02:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 Associated with 60:8d:26:32:34:24
 feb 29 02:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=3D0
 id_str=3D]
 feb 29 02:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-SUBNET-STATUS-UPDATE status=3D0
 feb 29 04:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 Associated with 60:8d:26:32:34:24
 feb 29 04:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=3D0
 id_str=3D]
 feb 29 04:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-SUBNET-STATUS-UPDATE status=3D0
 feb 29 05:00:14 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-DISCONNECTED bssid=3D60:8d:26:32:34:24 reason=3D0
 locally_generated=3D1
 feb 29 05:00:14 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCORE type=3DWORLD
 feb 29 05:00:14 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DUSER type=3DCOUNTRY alpha2=3DGB
 feb 29 05:00:15 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
 to associate with SSID 'MiFibra-3422'
 feb 29 05:00:15 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-ASSOC-REJECT bssid=3D00:00:00:00:00:00 status_code=3D16
 feb 29 05:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
 to associate with SSID 'MiFibra-3422'
 feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 Associated with 60:8d:26:32:34:24
 feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=3D0
 id_str=3D]
 feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-SUBNET-STATUS-UPDATE status=3D0
 feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
 CTRL-EVENT-REGDOM-CHANGE init=3DCOUNTRY_IE type=3DCOUNTRY alpha2=3DES
 
 Hope this could be useful.
 
 Regards.
 Ramiro.
 


Home | Main Index | Thread Index | Old Index