tech-net archive


Re: network-related deadlock



On Fri, Jan 26, 2018 at 11:20 PM, Maxime Villard <max%m00nbsd.net@localhost> wrote:
> There appears to be a network deadlock somewhere in the kernel. I've
> narrowed the issue down to the following situation: if you have a NetBSD
> vm on VirtualBox, with one CPU, and one enabled network card that is not
> attached (Settings->Network->Adapter_1->Attached_To = Not attached), the
> kernel freezes ~ten seconds after booting.
>
> I can log in, type a few commands, and then the keyboard does not answer
> anymore and system deadlocks (no pings either).
>
> I've disassembled %rip, it points to x86_pause(). So there must be a
> deadlock.
>
> If the card settings are switched to "Bridged Adapter" there is no deadlock.
>
> A kernel from December 27 works fine.

I could reproduce the issue (though I'm not sure it is the same one). I used
VirtualBox 5.2.6 on macOS Sierra. The issue also happened on a two-core
system. It happened with a network adapter in the "Not attached"
configuration and didn't happen with the "NAT" configuration.

I tried four network adapters: Intel 82540, 82543, 82545, and virtio.
With 82540, the system hung, but I could enter DDB and got a stack trace:
  db{0}> bt/a ffffe4007fbf0860
  trace: pid 0 lid 5 at 0xffff800043b2bcc8
  breakpoint() at netbsd:breakpoint+0x5
  comintr() at netbsd:comintr+0x746
  handle_ioapic_edge8() at netbsd:handle_ioapic_edge8+0x66
  wm_watchdog() at netbsd:wm_watchdog+0x3c
  if_slowtimo() at netbsd:if_slowtimo+0x6d
  callout_softclock() at netbsd:callout_softclock+0x41c
  softint_dispatch() at netbsd:softint_dispatch+0xd3
  DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffff800043b2bff0
  Xsoftintr() at netbsd:Xsoftintr+0x4f
  --- interrupt ---
  0:

With 82543 and 82545, I couldn't enter DDB.

With virtio, the system didn't hang, but dhcpcd got stuck in the virtio driver:
  db{0}> bt/a ffffe4007effa660
  trace: pid 116 lid 1 at 0xffff800044a8bb30
  sleepq_block() at netbsd:sleepq_block+0x97
  cv_wait() at netbsd:cv_wait+0xfb
  vioif_ctrl_rx() at netbsd:vioif_ctrl_rx+0x1cb
  vioif_init() at netbsd:vioif_init+0xf7
  vioif_ioctl() at netbsd:vioif_ioctl+0x2f
  doifioctl() at netbsd:doifioctl+0x824
  sys_ioctl() at netbsd:sys_ioctl+0x101
  syscall() at netbsd:syscall+0x1d8
  --- syscall (number 54) ---
  73339f71a26a:

Then I broke my system's boot loader, so I made no further progress after that...

  ozaki-r

