NetBSD-Bugs archive

Re: kern/57049: large TCP transfers NetBSD-Xen-Guest -> NetBSD-Xen-DOM0 abort with EHOSTDOWN




Yes I think it is somehow related to the new ARP code in nd.c.

New datapoint:
Reversing the transfer so that DOM0 sends to the guest survives longer. DOM0 is fine. On the guest, the EHOSTDOWN return at nd.c:~390 is triggered often. The connection survives because a failure to send an ACK does not terminate a TCP connection. The timing pattern seems to be a mixture of ~200ms intervals (possibly ACK re-sends) and ~41 second intervals (possibly the nd.c effect).

The EHOSTDOWN pattern looks like this:

[  2174.429719] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2174.639722] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2217.636882] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2217.841108] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2218.051081] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2218.261120] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2259.258196] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2259.462455] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2259.672445] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2300.669503] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2300.873729] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2301.083775] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2301.293752] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2342.290807] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2342.495060] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2342.705039] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2342.915041] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2383.912120] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2384.116365] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2384.326380] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2384.536361] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2427.533491] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2427.737675] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2427.947745] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2428.157763] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2469.154813] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2469.358585] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2469.568578] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
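
The two time scales can be checked by computing the gaps between consecutive log timestamps. The following is only a sketch: the sample holds the first nine lines of the log above, and the function names and the 1-second threshold separating "short" from "long" gaps are assumptions made for illustration, not anything taken from nd.c.

```python
import re

# First nine lines of the EHOSTDOWN log above (sample only).
LOG = """\
[  2174.429719] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2174.639722] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2217.636882] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2217.841108] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2218.051081] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2218.261120] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2259.258196] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2259.462455] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
[  2259.672445] /src/NetBSD/999100/src/sys/net/nd.c:391: EHOSTDOWN
"""

# Kernel message timestamp at the start of a line, e.g. "[  2174.429719]".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def timestamps(text):
    """Return the timestamps (in seconds) of all lines that carry one."""
    stamps = []
    for line in text.splitlines():
        match = STAMP.match(line)
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def split_gaps(stamps, threshold=1.0):
    """Split gaps between consecutive timestamps into short and long ones.

    The threshold (seconds) is an arbitrary cut chosen for this sketch.
    """
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    short = [gap for gap in gaps if gap < threshold]
    long_ = [gap for gap in gaps if gap >= threshold]
    return short, long_


if __name__ == "__main__":
    short, long_ = split_gaps(timestamps(LOG))
    print("short gaps:", [round(gap, 3) for gap in short])
    print("long gaps: ", [round(gap, 3) for gap in long_])
```

On this sample the short gaps all fall between 0.20 and 0.21 seconds, and the long gaps are about 41 and 43 seconds.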

Best regards,
  Frank

On 10/07/22 17:10, Manuel Bouyer wrote:
The following reply was made to PR kern/57049; it has been noted by GNATS.

From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: Frank Kardel <kardel%netbsd.org@localhost>
Cc: gnats-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
         netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/57049: large TCP transfers NetBSD-Xen-Guest ->
  NetBSD-Xen-DOM0 abort with EHOSTDOWN
Date: Fri, 7 Oct 2022 17:07:37 +0200

  On Fri, Oct 07, 2022 at 04:43:52PM +0200, Frank Kardel wrote:
  > Hi Manuel,
  >
  > that is probably because the DOMU is 9.2 which still had the classic ARP
  > resolution code. In 9.99.x the ARP resolution was replaced with a
  > neighbour discovery derived code in nd.c. On Xen I tripped over this
  > issue with a 99.100 GENERIC guest quickly. It may be that it happens
  > with other true network peers also, but I was not able to trigger it
  > with a true network peer right away.

  OK, with a HEAD domU I can reproduce this.
  But I don't think this is Xen-specific. Maybe it's just some timing or
  resource issue that makes it more likely to happen on Xen.
--
  Manuel Bouyer <bouyer%antioche.eu.org@localhost>
       NetBSD: 26 ans d'experience feront toujours la difference
  --



