NetBSD-Bugs archive


kern/60182: ld@virtio sometimes hangs up



>Number:         60182
>Category:       kern
>Synopsis:       ld@virtio sometimes hangs up
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 08 03:10:00 +0000 2026
>Originator:     Tetsuya Isaki
>Release:        NetBSD-current
>Organization:
>Environment:
NetBSD virt68k
>Description:
On -current (20260406), ld@virtio on virt68k (running on the nono emulator)
sometimes hangs during device attach.

According to a comment in virtio_dequeue(), this function assumes
that the caller performs the necessary dmamap_sync operation.

sys/dev/pci/virtio.c:
/*
 * dequeue: dequeue a request from uring; dmamap_sync for uring is
 *          already done in the interrupt handler.
 */
int
virtio_dequeue(struct virtio_softc *sc, struct virtqueue *vq,
    int *slotp, int *lenp)

But ld@virtio attachment doesn't seem to satisfy this assumption.
 
 ld_virtio_attach()
 -> ld_virtio_info( poll = true )
 -> ld_virtio_vq_done()
 -> virtio_dequeue()

ld_virtio_info() calls virtio_dequeue() (via ld_virtio_vq_done())
without any preceding dmamap_sync.
If virtio_dequeue() reads an "empty" state from memory (and that value
becomes cached), subsequent calls to virtio_dequeue() will repeatedly
read the stale "empty" state from the cache.  As a result, the following
while loop never terminates.

sys/dev/pci/ld_virtio.c:
static int __used
ld_virtio_info(struct ld_softc *ld, bool poll)
  :
done:
    mutex_enter(&sc->sc_sync_wait_lock);
    while (sc->sc_sync_use != SYNC_DONE) {
        if (poll) {
            mutex_exit(&sc->sc_sync_wait_lock);
            ld_virtio_vq_done(vq);
            mutex_enter(&sc->sc_sync_wait_lock);
            continue;
        }
        cv_wait(&sc->sc_sync_wait, &sc->sc_sync_wait_lock);
    }

Here, virtio_dequeue() returns a nonzero errno value if the queue is
empty, so ld_virtio_vq_done() returns without completing any request.

sys/dev/pci/ld_virtio.c:
static int
ld_virtio_vq_done(struct virtqueue *vq)
{
    struct virtio_softc *vsc = vq->vq_owner;
    struct ld_virtio_softc *sc = device_private(virtio_child(vsc));
    int r = 0;
    int slot;

again:
    if (virtio_dequeue(vsc, vq, &slot, NULL))
        return r;
    r = 1;

    ld_virtio_vq_done1(sc, vsc, vq, slot);
    goto again;
}
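
The suspected failure mode can be illustrated with a small user-space toy
model.  This is only a sketch and not NetBSD code: every toy_* name is
hypothetical, the CPU cache is reduced to a single valid flag, and
toy_sync_postread() merely stands in for the dmamap_sync that the
virtio_dequeue() comment expects the caller to have done.  ENOENT is an
arbitrary choice for the "empty" errno in this model.  The polling loop is
bounded so that the hang shows up as a -1 return value rather than as a
real hang.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical, not NetBSD code): one used-ring index. */
struct toy_vq {
	uint16_t mem_used_idx;   /* value in RAM, written by the device */
	uint16_t cache_used_idx; /* CPU's cached copy of the same word */
	bool     cache_valid;    /* false until the CPU loads the word */
	uint16_t vq_used_idx;    /* entries the driver has consumed */
};

/* Device completes one request: only RAM changes, the cache is not updated. */
static void
toy_device_complete(struct toy_vq *vq)
{
	vq->mem_used_idx++;
}

/* Stand-in for the missing dmamap_sync: discard the cached copy. */
static void
toy_sync_postread(struct toy_vq *vq)
{
	vq->cache_valid = false;
}

/* CPU load through the cache: RAM is consulted only on a miss. */
static uint16_t
toy_load_used_idx(struct toy_vq *vq)
{
	if (!vq->cache_valid) {
		vq->cache_used_idx = vq->mem_used_idx;
		vq->cache_valid = true;
	}
	return vq->cache_used_idx;
}

/* Shaped like virtio_dequeue(): nonzero errno when the ring looks empty. */
static int
toy_dequeue(struct toy_vq *vq)
{
	if (vq->vq_used_idx == toy_load_used_idx(vq))
		return ENOENT;
	vq->vq_used_idx++;
	return 0;
}

/*
 * Shaped like the poll branch of ld_virtio_info(), but bounded.
 * The device completes its request right after the first poll attempt.
 * Returns the number of polls needed to see the completion, or -1 if
 * it was never observed within max_polls attempts.
 */
static int
toy_poll(bool sync_before_dequeue, int max_polls)
{
	struct toy_vq vq = { 0 };

	for (int i = 1; i <= max_polls; i++) {
		if (sync_before_dequeue)
			toy_sync_postread(&vq);
		if (toy_dequeue(&vq) == 0)
			return i;
		if (i == 1)
			toy_device_complete(&vq);
	}
	return -1;
}
```

Under these assumptions, toy_poll(false, 1000) returns -1 because the first
"empty" read is cached and never refreshed, while toy_poll(true, 1000)
returns 2: the sync before the second dequeue makes the device's update
visible.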


Here is the dmesg:

[   1.0000000] Initialized Goldfish TTY console @ 0xff030000
[   1.0000000] entropy: ready
[   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
[   1.0000000]     2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
[   1.0000000]     2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023,
[   1.0000000]     2024, 2025, 2026
[   1.0000000]     The NetBSD Foundation, Inc.  All rights reserved.
[   1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[   1.0000000]     The Regents of the University of California.  All rights reserved.

[   1.0000000] NetBSD 11.99.5 (GENERIC) #9: Mon Apr  6 14:34:28 JST 2026
[   1.0000000]  isaki@XXX:XXX/current/virt68k/obj/sys/arch/virt68k/compile/GENERIC
[   1.0000000] Qemu 0.0.0: MC68030+MMU, MC68881 FPU
[   1.0000000] total memory = 32768 KB
[   1.0000000] avail memory = 27864 KB
[   1.0000000] mainbus0 (root)
[   1.0000000] gfpic0 at mainbus0 addr 0xff011000: Google Goldfish PIC
[   1.0000000] gfpic0: interrupting at IPL 1
[   1.0000000] gfpic1 at mainbus0 addr 0xff012000: Google Goldfish PIC
[   1.0000000] gfpic1: interrupting at IPL 2
[   1.0000000] gfpic2 at mainbus0 addr 0xff013000: Google Goldfish PIC
[   1.0000000] gfpic2: interrupting at IPL 3
[   1.0000000] gfpic3 at mainbus0 addr 0xff014000: Google Goldfish PIC
[   1.0000000] gfpic3: interrupting at IPL 4
[   1.0000000] gfpic4 at mainbus0 addr 0xff015000: Google Goldfish PIC
[   1.0000000] gfpic4: interrupting at IPL 5
[   1.0000000] gfpic5 at mainbus0 addr 0xff016000: Google Goldfish PIC
[   1.0000000] gfpic5: interrupting at IPL 6
[   1.0000000] gfrtc0 at mainbus0 addr 0xff020000: Google Goldfish RTC + timer
[   1.0000000] gfrtc0: hardclock interrupting at gfpic5 irq 1 (IPL 6)
[   1.0000000] gfrtc0: Using as delay() timer.
[   1.0000000] gfrtc1 at mainbus0 addr 0xff021000: Google Goldfish RTC + timer
[   1.0000000] gfrtc1: using as Time of Day Register.
[   1.0000000] gftty0 at mainbus0 addr 0xff030000: Google Goldfish TTY
[   1.0000000] gftty0: console
[   1.0000000] gftty0: interrupting at gfpic0 irq 32 (IPL 1)
[   1.0000000] virtctrl0 at mainbus0 addr 0xff040000: Qemu Virtual System Controller
[   1.0000000] virtctrl0: features=0x00000001
[   1.0000000] virtio0 at mainbus0 addr 0xff07c000
[   1.0000000] virtio0: VirtIO-MMIO-v2
[   1.0000000] virtio0: network device (id 1, rev. 0x01)
[   1.0000000] vioif0 at virtio0: features: 0x110000020<V1,INDIRECT_DESC,MAC>
[   1.0000000] vioif0: Ethernet address 02:00:00:00:00:44
[   1.0000000] virtio0: interrupting at gfpic4 irq 1 (IPL 5)
[   1.0000000] virtio1 at mainbus0 addr 0xff07c200
[   1.0000000] virtio1: VirtIO-MMIO-v2
[   1.0000000] virtio1: entropy device (id 4, rev. 0x01)
[   1.0000000] viornd0 at virtio1: features: 0x110000000<V1,INDIRECT_DESC>
[   1.0000000] virtio1: interrupting at gfpic4 irq 2 (IPL 5)
[   1.0000000] virtio2 at mainbus0 addr 0xff07c400
[   1.0000000] virtio2: VirtIO-MMIO-v2
[   1.0000000] virtio2: block device (id 2, rev. 0x01)
[   1.0000000] ld0 at virtio2: features: 0x110000044<V1,INDIRECT_DESC,BLK_SIZE,SEG_MAX>
[   1.0000000] ld0: max 30 segs of max 65536 bytes
[   1.0000000] virtio2: interrupting at gfpic4 irq 3 (IPL 5)
[   1.0000000] ld0: 2048 MB, 1040 cyl, 64 head, 63 sec, 512 bytes/sect x 4194304 sectors
[   1.0000000] virtio3 at mainbus0 addr 0xff07c600
[   1.0000000] virtio3: VirtIO-MMIO-v2
[   1.0000000] virtio3: block device (id 2, rev. 0x01)
[   1.0000000] ld1 at virtio3: features: 0x110000044<V1,INDIRECT_DESC,BLK_SIZE,SEG_MAX>
[   1.0000000] ld1: max 30 segs of max 65536 bytes
[   1.0000000] virtio3: interrupting at gfpic4 irq 4 (IPL 5)
(hangs here...)

>How-To-Repeat:
Boot virt68k -current on the nono emulator.

>Fix:
N/A



