Current-Users archive


Re: kernel deadlock on fstchg with vnd



> On 29. May 2022, at 23:57, Manuel Bouyer <bouyer%antioche.eu.org@localhost> wrote:
> 
> On Sun, May 29, 2022 at 01:18:16PM +0200, J. Hannken-Illjes wrote:
>>> On 29. May 2022, at 08:30, Michael van Elst <mlelstv%serpens.de@localhost> wrote:
>>> 
>>> bouyer%antioche.eu.org@localhost (Manuel Bouyer) writes:
>>> 
>>>> Hello,
>>>> do you have an idea on the problem in this thread:
>>>> http://mail-index.netbsd.org/port-xen/2022/05/27/msg010213.html
>>> [...]
>>>> I can't reproduce this when using vnd from userland.
>>> 
>>> You can replicate it by addressing the block device with vnconfig.
>>> 
>>> A workaround would be to modify the Xen block script to select the
>>> raw device:
>>> 
>>> vnconfig /dev/r${disk}d $xparams >/dev/null; then
>>> 
>>> or just the disk name:
>>> 
>>> vnconfig ${disk} $xparams >/dev/null; then
>> 
>> Good catch, sys/dev/vnd.c has this:
>> 
>>  1751  static void
>>  1752  vndclear(struct vnd_softc *vnd, int myminor)
>>  1753  {
>>  1754          struct vnode *vp = vnd->sc_vp;
>>  1755          int fflags = FREAD;
>>  1756          int bmaj, cmaj, i, mn;
>>  1757          int s;
>>  1758
>>  1759  #ifdef DEBUG
>>  1760          if (vnddebug & VDB_FOLLOW)
>>  1761                  printf("vndclear(%p): vp %p\n", vnd, vp);
>>  1762  #endif
>>  1763          /* locate the major number */
>>  1764          bmaj = bdevsw_lookup_major(&vnd_bdevsw);
>>  1765          cmaj = cdevsw_lookup_major(&vnd_cdevsw);
>>  1766
>>  1767          /* Nuke the vnodes for any open instances */
>>  1768          for (i = 0; i < MAXPARTITIONS; i++) {
>>  1769                  mn = DISKMINOR(device_unit(vnd->sc_dev), i);
>>  1770                  vdevgone(bmaj, mn, mn, VBLK);
>>  1771                  if (mn != myminor) /* XXX avoid to kill own vnode */
>>  1772                          vdevgone(cmaj, mn, mn, VCHR);
>>  1773          }
>> 
>> The "skip myself" on lines 1771/1772 is responsible for this behaviour.
> 
> Yes and doing the same for block devices avoids the issue.
> But Taylor is reluctant to commit this hack.

And he is right.  It smells fishy to detach a (pseudo) device from
an open instance of itself, either with ioctl or close.

Why do we detach on last close -- isn't it sufficient to detach
either explicitly with drvctl(8) or on module unload?

The attached diff moves vdevgone() to vnd_detach() and no longer
detaches on last close -- comments?
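
To make the asymmetry in the quoted vndclear() loop concrete, here is a
small userland model of it. It is only a sketch, not kernel code: the
model_* names, the value of MODEL_MAXPARTITIONS and the simplified minor
computation are assumptions made for illustration (the real DISKMINOR()
and MAXPARTITIONS are port-specific). It answers one question: does the
loop call vdevgone() on the very node the caller has open?

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: illustrative value; the real MAXPARTITIONS is per-port. */
#define MODEL_MAXPARTITIONS 8

enum model_vtype { MODEL_VBLK, MODEL_VCHR };

/* Hypothetical, simplified stand-in for DISKMINOR(unit, part). */
static int
model_diskminor(int unit, int part)
{
	assert(unit >= 0 && part >= 0 && part < MODEL_MAXPARTITIONS);
	return unit * MODEL_MAXPARTITIONS + part;
}

/*
 * Mirrors the loop body: block nodes are always revoked, character
 * nodes only when the minor differs from the caller's own minor.
 */
static bool
model_revoked(enum model_vtype type, int mn, int myminor)
{
	if (type == MODEL_VBLK)
		return true;
	return mn != myminor;
}

/*
 * Walk all partitions of the unit as the loop does and report whether
 * the node the caller opened (type opened_as, partition mypart) is
 * among the revoked ones.
 */
static bool
model_revokes_own_node(int unit, enum model_vtype opened_as, int mypart)
{
	int myminor = model_diskminor(unit, mypart);

	for (int i = 0; i < MODEL_MAXPARTITIONS; i++) {
		int mn = model_diskminor(unit, i);

		if (mn == myminor && model_revoked(opened_as, mn, myminor))
			return true;
	}
	return false;
}
```

In this model, model_revokes_own_node(0, MODEL_VCHR, 3) is false while
model_revokes_own_node(0, MODEL_VBLK, 3) is true, which matches the
observation that the problem only shows up when vnconfig is pointed at
the block device.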

--
J. Hannken-Illjes - hannken%mailbox.org@localhost

Attachment: vnd.c.diff
Description: Binary data

Attachment: signature.asc
Description: Message signed with OpenPGP


