[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
kern/54189: IFNET_LOCK for if_mcast_op can cause a deadlock
>Synopsis: IFNET_LOCK for if_mcast_op can cause a deadlock
>Arrival-Date: Fri May 10 09:25:00 +0000 2019
>Originator: Ryota Ozaki
>Release: current and 8
System: NetBSD rangeley 8.99.38 NetBSD 8.99.38 (RANGELEY) #2: Fri May 10 13:37:40 JST 2019 ozaki-r@rangeley:(hidden) amd64
To avoid data race on if_flags it should be protected by a lock (currently it's
IFF_ALLMULTI can be set/unset via if_mcast_op. To protect updates of the flag,
we had added IFNET_LOCK around if_mcast_op. However that was not a good
approach because if_mcast_op is typically called in the middle of a call path
and holding IFNET_LOCK such places is problematic. Actually a deadlock is
observed. Probably we should remove IFNET_LOCK and manage IFF_ALLMULTI
somewhere other than if_flags, for example ethercom or driver itself (or a
common driver framework once it appears). Such a change is feasible because
IFF_ALLMULTI is only set/unset by a driver and not accessed from any common
components such as network protocols.
The resource dependency graph of a deadlock I encountered is like this:
Thread A: IFNET_LOCK => psref_target_destroy => softint
Thread B: softint => softnet_lock
Thread C: softnet_lock => IFNET_LOCK (for if_mcast_op)
Note that the deadlock doesn't occur if NET_MPSAFE is NOT enabled because
psref_target_destroy is called only if NET_MPSAFE is enabled.
Run userspace processes below concurrently while receive some packets destinating to the host, then wait several hours:
- a process adding/removing an IP address on an interface, and
- a process configuring/unconfiguring multicast groups on the same interface.
One solution is to remove IFNET_LOCK for if_mcat_op as described above and
protect operations of IFF_ALLMULTI in another way.
The below patch introduces a new field to struct ethercom (ec_flags) and store
the IFF_ALLMULTI flag to it (as ETHER_F_ALLMULTI). So we can use ETHER_LOCK
to protect operations of the flag instead of IFNET_LOCK.
One known issue of the approach is that it can't be applied to netbsd-8 because
the change breaks ABI IIUC.
Main Index |
Thread Index |