NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/60074: xhci(4): xhci_new_device() may remove roothub entry on early failure



>Number:         60074
>Category:       kern
>Synopsis:       xhci(4): xhci_new_device() may remove roothub entry on early failure
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 12 19:15:00 +0000 2026
>Originator:     Izumi Tsutsui
>Release:        NetBSD 10.1
>Organization:
>Environment:
System: NetBSD mirage 10.1 NetBSD 10.1 (GENERIC) #3: Mon Jan 5 01:48:52 JST 2026 tsutsui@mirage:/s/netbsd-10/src/sys/arch/i386/compile/GENERIC i386
Architecture: all (tested on i386 and amd64)
Machine: all (S/A)
>Description:

On xHCI, xhci_new_device() initializes a new child device
with dev->ud_addr = 0 (see PR/60073) and sets
`up->up_dev = dev`
before the device address is actually assigned.

---
static usbd_status
xhci_new_device(device_t parent, struct usbd_bus *bus, int depth,
    int speed, int port, struct usbd_port *up)
{
 :

	dev->ud_addr = 0;

 :

	up->up_dev = dev;

 :

	if (depth == 0 && port == 0) {
		KASSERT(bus->ub_devices[USB_ROOTHUB_INDEX] == NULL);
		bus->ub_devices[USB_ROOTHUB_INDEX] = dev;

		/* Establish the default pipe. */
		err = usbd_setup_pipe(dev, 0, &dev->ud_ep0,
		    USBD_DEFAULT_INTERVAL, &dev->ud_pipe0);
		if (err) {
			DPRINTFN(1, "setup default pipe failed %jd", err,0,0,0);
			goto bad;
		}
		err = usbd_get_initial_ddesc(dev, dd);
		if (err) {
			DPRINTFN(1, "get_initial_ddesc %ju", err, 0, 0, 0);
			goto bad;
		}

---

For non-root devices, there are multiple failure paths before
xhci_set_address() succeeds and before dev->ud_addr is updated from the
slot context.

---
	} else {
		uint8_t slot = 0;

		/* 4.3.2 */
		err = xhci_enable_slot(sc, &slot);
		if (err) {
			DPRINTFN(1, "enable slot %ju", err, 0, 0, 0);
			goto bad;
                        ^^^^^^^^^ (1)
		}

		xs = &sc->sc_slots[slot];
		dev->ud_hcpriv = xs;

		/* 4.3.3 initialize slot structure */
		err = xhci_init_slot(dev, slot);
		if (err) {
			DPRINTFN(1, "init slot %ju", err, 0, 0, 0);
			dev->ud_hcpriv = NULL;
			/*
			 * We have to disable_slot here because
			 * xs->xs_idx == 0 when xhci_init_slot fails,
			 * in that case usbd_remove_dev won't work.
			 */
			mutex_enter(&sc->sc_lock);
			xhci_disable_slot(sc, slot);
			mutex_exit(&sc->sc_lock);
			goto bad;
                        ^^^^^^^^^ (2)
		}

		/*
		 * We have to establish the default pipe _after_ slot
		 * structure has been prepared.
		 */
		err = usbd_setup_pipe(dev, 0, &dev->ud_ep0,
		    USBD_DEFAULT_INTERVAL, &dev->ud_pipe0);
		if (err) {
			DPRINTFN(1, "setup default pipe failed %jd", err, 0, 0,
			    0);
			goto bad;
                        ^^^^^^^^^ (3)
		}

		/* 4.3.4 Address Assignment */
		err = xhci_set_address(dev, slot, false);
		if (err) {
			DPRINTFN(1, "failed! to set address: %ju", err, 0, 0, 0);
			goto bad;
                        ^^^^^^^^^ (4)
		}

		/* Allow device time to set new address */
		usbd_delay_ms(dev, USB_SET_ADDRESS_SETTLE);

		usb_syncmem(&xs->xs_dc_dma, 0, sc->sc_pgsz, BUS_DMASYNC_POSTREAD);
		cp = xhci_slot_get_dcv(sc, xs, XHCI_DCI_SLOT);
		HEXDUMP("slot context", cp, sc->sc_ctxsz);
		uint8_t addr = XHCI_SCTX_3_DEV_ADDR_GET(le32toh(cp[3]));
		DPRINTFN(4, "device address %ju", addr, 0, 0, 0);
		/*
		 * XXX ensure we know when the hardware does something
		 * we can't yet cope with
		 */
		KASSERTMSG(addr >= 1 && addr <= 127, "addr %d", addr);
		dev->ud_addr = addr;
                ^^^^^^^^^^^^^^^^^^^  dev->ud_addr isn't set until here

---

In those paths, the function jumps to "bad:" and calls
usbd_remove_device(dev, up).

---
 bad:
	if (err != USBD_NORMAL_COMPLETION) {
		if (depth == 0 && port == 0 && dev->ud_pipe0)
			usbd_kill_pipe(dev->ud_pipe0);
		usbd_remove_device(dev, up);
	}
---

However, usbd_remove_device() unconditionally clears:
---
	dev->ud_bus->ub_devices[usb_addr2dindex(dev->ud_addr)] = NULL;
---

So if dev->ud_addr is still '0' at failure time, this clears the entry
corresponding to usb_addr2dindex(0), which on xHCI currently aliases
the root-hub slot (ub_devices[1]), because the xHCI roothub is stored
at ub_devices[USB_ROOTHUB_INDEX] while keeping ud_addr == 0.

As a result, a failed child enumeration can accidentally(?) remove
the roothub from ub_devices[]. After that, userland enumeration
through USB_DEVICEINFO can no longer see the roothub on that bus,
and tools that scan devices by address may fail to find subsequently
attached devices.

This was observed with a CH32V003/UIAPduino board on an xHCI controller:
repeated failed enumerations on one port eventually left usb1 with both
ub_devices[0] and ub_devices[1] as NULL, and later a successfully
attached HID device appeared at a higher address but could not be found by
a userland scanner expecting the roothub entry to remain present.

The issue appears to be that xhci_new_device() uses usbd_remove_device()
for cleanup even before the device has been registered in ub_devices[]
with a valid nonzero USB address.

>How-To-Repeat:

1) Boot NetBSD/amd64 on a machine using xHCI for the target USB port.

2) Add temporary debug prints in xhci_new_device() and usbd_remove_device()
   to show:

   - dev->ud_addr
   - up->up_dev
   - ub_devices[0]
   - ub_devices[1]

    the entry being cleared by usbd_remove_device()

3) Attach a device that frequently fails during early enumeration before
   xhci_set_address() completes. One reproducible case is a
   CH32V003/UIAPduino board whose software USB implementation may
   disconnect mid-enumeration.

4) Observe repeated logs like:

xhci_new_device: bad bus=usb1 ... err=13 ud_addr=0 ...
usbd_remove_device before-clear-bus: ... ud_addr=0 ... ub_devices[1]=<roothub>
usbd_remove_device after-clear-bus: ... ub_devices[1]=0x0

5) After this, USB_DEVICEINFO on that xHCI bus no longer reports
   the roothub, and userland enumeration tools may fail to find
   devices attached later on that bus.

>Fix:
Don't call usbd_remove_device if the failed device is not roothub
and device address is not assigned?

---
 bad:
	if (err != USBD_NORMAL_COMPLETION) {
		if (depth == 0 && port == 0 && dev->ud_pipe0)
			usbd_kill_pipe(dev->ud_pipe0);
		if ((depth == 0 && port == 0) ||
		    dev->ud_addr != 0) {
			usbd_remove_device(dev, up);
		} else {
			up->up_dev = NULL;
			usb_free_device(dev);
		}
	}
---

Or "don't use addr=0 for roothub" (to make usbd_remove_device() safe
as usbd_new_device() in usb_subr.c does), as mentioned in PR/60073?

---
Izumi Tsutsui



Home | Main Index | Thread Index | Old Index