tech-net archive
vioif(4) deadlock with softnet_lock
Every now and then an aarch64 VM I run in qemu hangs at boot. It
appears to be deadlocked, but I'm not sure exactly what the deadlock
is.
The symptom is that mdnsd holds softnet_lock and waits for
ctrlq->ctrlq_inuse == FREE with cv_wait in vioif_ctrl_acquire -- all
this is to update the `hardware' multicast filter via:
	vioif_set_rx_filter
	vioif_rx_filter
	vioif_ioctl
	if_mcast_op
	in_delmulti
	ip_freemoptions
	in_pcbdetach
	udp_detach_wrapper
	soclose
	soo_close
	closef
	fd_close
	sys_close
At this point, various softint threads are stuck waiting for
softnet_lock, so, e.g., timers no longer fire.
This already seems bad -- cv_wait while holding softnet_lock is
generally forbidden, because cv_wait is forbidden in softint context,
and acquiring any lock from softint context that another thread might
hold across cv_wait is tantamount to doing cv_wait in softint context.
(Changing it to cv_timedwait wouldn't help much, because all callouts
in softclock may get blocked waiting for softnet_lock at which point
the timeout would never fire.)
But as far as I can tell, this only leads to actual deadlock if the
hardware isn't delivering the PCI interrupt that leads vioif_ctrl_intr
to set ctrlq->ctrlq_inuse := DONE and wake vioif_ctrl_acquire.
1. What could be going wrong here to trigger this deadlock?  Could
   something be missing a virtio interrupt?

2. Can either (a) vioif, or (b) in_pcbdetach / ip_freemoptions /
   in_delmulti, be made to avoid cv_wait under softnet_lock?  Could
   something in that stack safely release softnet_lock, for instance?
   Or is it necessary to take softnet_lock in this path at all?  The
   same pattern is likely to cause deadlocks in other network drivers,
   like usbnet(9).
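
To illustrate what question 2 is getting at, here is a small userspace
sketch with pthreads. Everything in it is hypothetical (struct model,
model_closer, model_outer_trylock); `outer' merely stands in for
softnet_lock, and the sketch says nothing about whether dropping the
lock is actually safe in the real stack, which is exactly the open
question. It only shows the difference an observer sees: if the closing
thread keeps the outer lock across its wait, nobody else can take it;
if it drops the lock first, they can.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical model; `outer' stands in for softnet_lock. */
struct model {
	pthread_mutex_t outer;
	pthread_mutex_t lock;	/* protects waiting and done */
	pthread_cond_t cv;
	int waiting;
	int done;
	int release_outer;
};

/* Stand-in for the closing thread that ends up waiting on the driver. */
static void *
model_closer(void *arg)
{
	struct model *m = arg;

	pthread_mutex_lock(&m->outer);
	if (m->release_outer)
		pthread_mutex_unlock(&m->outer);  /* drop before sleeping */

	pthread_mutex_lock(&m->lock);
	m->waiting = 1;
	pthread_cond_broadcast(&m->cv);
	while (!m->done)
		pthread_cond_wait(&m->cv, &m->lock);
	pthread_mutex_unlock(&m->lock);

	if (m->release_outer)
		pthread_mutex_lock(&m->outer);    /* retake after waking */
	pthread_mutex_unlock(&m->outer);
	return NULL;
}

/*
 * Returns what another thread sees when it tries the outer lock while
 * the closer is asleep: EBUSY if the closer still holds it, 0 if not.
 */
int
model_outer_trylock(int release_outer)
{
	struct model m;
	pthread_t t;
	int error;

	pthread_mutex_init(&m.outer, NULL);
	pthread_mutex_init(&m.lock, NULL);
	pthread_cond_init(&m.cv, NULL);
	m.waiting = 0;
	m.done = 0;
	m.release_outer = release_outer;

	pthread_create(&t, NULL, model_closer, &m);

	/* Wait until the closer has announced that it is about to sleep. */
	pthread_mutex_lock(&m.lock);
	while (!m.waiting)
		pthread_cond_wait(&m.cv, &m.lock);
	pthread_mutex_unlock(&m.lock);

	/* This is the position the softint threads are in. */
	error = pthread_mutex_trylock(&m.outer);
	if (error == 0)
		pthread_mutex_unlock(&m.outer);

	/* Play the part of the interrupt and let the closer finish. */
	pthread_mutex_lock(&m.lock);
	m.done = 1;
	pthread_cond_broadcast(&m.cv);
	pthread_mutex_unlock(&m.lock);

	pthread_join(t, NULL);
	pthread_cond_destroy(&m.cv);
	pthread_mutex_destroy(&m.lock);
	pthread_mutex_destroy(&m.outer);
	return error;
}
```

In this model, model_outer_trylock(0) reports EBUSY and
model_outer_trylock(1) reports 0. The real difficulty, which the model
does not address, is whether any state protected by softnet_lock can
change underneath the caller while the lock is dropped.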