NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/57965: urtwn(4) command queue can overflow and deadlock



El lun, 26 feb 2024 a las 20:50, <campbell+netbsd%mumble.net@localhost> escribió:
>
> >Number:         57965
> >Category:       kern
> >Synopsis:       urtwn(4) command queue can overflow and deadlock
> >Confidential:   no
> >Severity:       serious
> >Priority:       medium
> >Responsible:    kern-bug-people
> >State:          open
> >Class:          sw-bug
> >Submitter-Id:   net
> >Arrival-Date:   Mon Feb 26 19:50:00 +0000 2024
> >Originator:     Taylor R Campbell
> >Release:        current, 10, 9, 8
> >Organization:
> The NetURTWN Foundation
> >Environment:
> >Description:
> urtwn_do_async does nothing to prevent the command queue from overflowing, which leads to ring->queued exceeding the number that the task will ever decrement back to zero:
>
>         mutex_spin_enter(&sc->sc_task_mtx);
>         cmd = &ring->cmd[ring->cur];
>         cmd->cb = cb;
>         KASSERT(len <= sizeof(cmd->data));
>         memcpy(cmd->data, arg, len);
>         ring->cur = (ring->cur + 1) % URTWN_HOST_CMD_RING_COUNT;
>
>         /* If there is no pending command already, schedule a task. */
>         if (!sc->sc_dying && ++ring->queued == 1) {
>                 mutex_spin_exit(&sc->sc_task_mtx);
>                 usb_add_task(sc->sc_udev, &sc->sc_task, USB_TASKQ_DRIVER);
>         } else
>                 mutex_spin_exit(&sc->sc_task_mtx);
>         splx(s);
>
> When this happens, urtwn_wait_async hangs forever, so you can't bring the interface down or do anything else with the interface once you make the mistake of trying.
> >How-To-Repeat:
> exist in a wifi environment that provokes a bunch of state changes (and whatever else) faster than the kernel can process them
> >Fix:
> (a) stop-gap measure: if ring->queued is already URTWN_HOST_CMD_RING_COUNT, drop the oldest queued command and don't increment ring->queued (but do print a message about overflow)
>
> (b) real fix: stop using a command ring and just use three separate tasks, since two commands of the same type are redundant and can be compressed into one; that way there's no overflow condition
>

Hello Taylor,

I do not know if this could be related to the problem you describe
with urtwn driver but just in case it is useful I tell you my
findings.
My RaspberryPi Zero W is using now the urtwn driver with a USB WIFI
adaptor because of bwfm driver does not function well. I made a
service that checks www.google.com and reboots the system if there is
no ping,  checking 3 times during  5 min intervals.

With urtwn, network worked fine but router automatic changing channels
leaded to complete network loss, needing reboot.

I then changed the router settings to fixed 11th channel and the
system was running very fine for 12 days, but I let the rebooting
script for recovering the network if everything failed.

Tonight and after 12 days and 17:42 hours, the script restarted the system:


Thu Feb 29 01:32:11 CET 2024
 1:32AM  up 12 days, 17:42, 0 users, load averages: 0.05, 0.01, 0.00

After running 46 minutes, restarted again:

reboot at:
Thu Feb 29 02:18:00 CET 2024
 2:18AM  up 46 mins, 0 users, load averages: 0.00, 0.00, 0.00

After running 2:01 hours, restarted again:

reboot at:
Thu Feb 29 04:19:08 CET 2024
 4:19AM  up  2:01, 0 users, load averages: 0.99, 0.56, 0.25


After running 1:01 hours, restarted again:

Thu Feb 29 05:20:08 CET 2024
 5:20AM  up  1:01, 0 users, load averages: 0.08, 0.05, 0.02

Now it is running fine for 7 hours:

netbsd-raspa# uptime
12:55PM  up  7:35, 2 users, load averages: 0.00, 0.00, 0.00



netbsd-raspa# uname -a
NetBSD netbsd-raspa 10.0_RC3 NetBSD 10.0_RC3 (RPI) #0: Tue Jan 16
08:28:51 UTC 2024
mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/evbarm/compile/RPI evbarm
netbsd-raspa#

There is nothing useful in NetBSD  /var/log/messages and /var
directory, but I have a Debian GNU/Linux Raspberry Zero W also running
and recorded this interesting wpa_supplicant events and the times of
the automatic script reboots:


feb 24 03:10:03 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
feb 24 03:10:03 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=USER type=COUNTRY alpha2=GB
feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
Associated with 60:8d:26:32:34:24
feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=0
id_str=]
feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
feb 24 03:10:04 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=ES
feb 29 01:16:06 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-DISCONNECTED bssid=60:8d:26:32:34:24 reason=0
locally_generated=1
feb 29 01:16:06 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
feb 29 01:16:06 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=USER type=COUNTRY alpha2=GB
feb 29 01:16:18 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:23 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:29 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:35 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:40 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:46 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:52 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:16:57 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:03 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:09 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:14 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:20 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:25 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:31 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:37 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:42 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:48 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:54 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:17:59 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:05 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:10 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:16 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:22 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:27 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:33 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:39 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:44 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:50 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:18:55 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:01 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:07 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:12 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:18 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:24 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:29 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:35 raspberrypi-meteo wpa_supplicant[597]: wlan0: Failed
to initiate sched scan
feb 29 01:19:41 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
to associate with SSID 'MiFibra-3422'
feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
Associated with 60:8d:26:32:34:24
feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=0
id_str=]
feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
feb 29 01:19:42 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=ES
feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-DISCONNECTED bssid=60:8d:26:32:34:24 reason=7
feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
to associate with SSID 'MiFibra-3422'
feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
feb 29 01:20:25 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=USER type=COUNTRY alpha2=GB
feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
Associated with 60:8d:26:32:34:24
feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=0
id_str=]
feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
feb 29 01:20:26 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=ES
feb 29 02:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
Associated with 60:8d:26:32:34:24
feb 29 02:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=0
id_str=]
feb 29 02:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
feb 29 04:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
Associated with 60:8d:26:32:34:24
feb 29 04:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=0
id_str=]
feb 29 04:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
feb 29 05:00:14 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-DISCONNECTED bssid=60:8d:26:32:34:24 reason=0
locally_generated=1
feb 29 05:00:14 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=CORE type=WORLD
feb 29 05:00:14 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=USER type=COUNTRY alpha2=GB
feb 29 05:00:15 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
to associate with SSID 'MiFibra-3422'
feb 29 05:00:15 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-ASSOC-REJECT bssid=00:00:00:00:00:00 status_code=16
feb 29 05:00:16 raspberrypi-meteo wpa_supplicant[597]: wlan0: Trying
to associate with SSID 'MiFibra-3422'
feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
Associated with 60:8d:26:32:34:24
feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-CONNECTED - Connection to 60:8d:26:32:34:24 completed [id=0
id_str=]
feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
feb 29 05:00:17 raspberrypi-meteo wpa_supplicant[597]: wlan0:
CTRL-EVENT-REGDOM-CHANGE init=COUNTRY_IE type=COUNTRY alpha2=ES

Hope this could be useful.

Regards.
Ramiro.


Home | Main Index | Thread Index | Old Index