NetBSD-Bugs archive


Re: kern/57049: large TCP transfers NetBSD-Xen-Guest -> NetBSD-Xen-DOM0 abort with EHOSTDOWN



Hi Manuel,

that is probably because the DOMU is 9.2, which still has the classic ARP resolution code. In 9.99.x the ARP resolution
was replaced with code derived from neighbour discovery, in nd.c. On Xen I quickly tripped over this issue with a 9.99.100 GENERIC guest. It may
also happen with other real network peers, but I was not able to trigger it with a real network peer right away.

Frank


On 10/07/22 16:35, Manuel Bouyer wrote:
The following reply was made to PR kern/57049; it has been noted by GNATS.

From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/57049: large TCP transfers NetBSD-Xen-Guest ->
  NetBSD-Xen-DOM0 abort with EHOSTDOWN
Date: Fri, 7 Oct 2022 16:30:26 +0200

  On Fri, Oct 07, 2022 at 11:55:00AM +0000, kardel%netbsd.org@localhost wrote:
  > >Description:
  > 	When copying large files (e.g. 20GB) via scp from a Xen guest to a Xen DOM0 the transfers often fail with EHOSTDOWN.
  > 	The errorcode comes from sys/net/nd.c:nd_resolve() (nd.c:384)
  > 	The error can be replicated with a simple ttcp test - see below:
  > 	Also, during the transfer, following routing messages can be observed on the guest:
  > got message of size 152 on Fri Oct  7 11:41:36 2022
  > RTM_MISS: Lookup failed on this address: len 152, pid 0, seq 0, errno 0, flags: 0x40<DONE>
  > locks: 0 inits: 0
  > sockaddrs: 0x3<DST,GATEWAY>
  >  10.0.2.16 link#1
  > got message of size 152 on Fri Oct  7 11:41:39 2022
  > RTM_MISS: Lookup failed on this address: len 152, pid 0, seq 0, errno 0, flags: 0x40<DONE>
  > locks: 0 inits: 0
  > sockaddrs: 0x3<DST,GATEWAY>
  >  10.0.2.16 link#1
  > got message of size 152 on Fri Oct  7 11:41:39 2022
  > RTM_ADD: Add Route: len 152, pid 0, seq 0, errno 0, flags: 0x2445<UP,HOST,DONE,LLINFO,CLONED>
  > locks: 0 inits: 0
  > sockaddrs: 0x3<DST,GATEWAY>
  >  10.0.2.16 aa:bb:cc:dd:ee:ff
  >
  > >How-To-Repeat:
  > 	on DOM0:
  > 	Zugspitze# ttcp -s -r
  > 	ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
  > 	ttcp-r: socket
  >
  > 	on guest:
  > 	Guest# ttcp -s -t -n 1000000 zugspitze
  > 	ttcp-t: socket
  > 	ttcp-t: connect
  > 	ttcp-t: IO: Host is down
  > 	errno=64

  I can't reproduce this (-HEAD dom0, 9.3 domU):
  proto:/home/bouyer>./ttcp -s -t -n 1000000 borneo
  ttcp-t: buflen=8192, nbuf=1000000, align=16384/0, port=5001  tcp  -> borneo
  ttcp-t: socket
  ttcp-t: connect
  ttcp-t: 8192000000 bytes in 162.54 real seconds = 49218.69 KB/sec +++
  ttcp-t: 1000000 I/O calls, msec/call = 0.17, calls/sec = 6152.34
  ttcp-t: 3.0user 80.2sys 2:42real 51% 0i+0d 778maxrss 0+2pf 297382+270931csw
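
  As a sanity check on the figures in the ttcp summary above, the transfer
  size and rates can be recomputed from buflen, nbuf and the real time. This
  is only a sketch: the variable names are illustrative, and the real time is
  taken as the rounded 162.54 seconds printed by ttcp, so the results agree
  with the reported values only to within rounding.

```python
# Recompute the ttcp summary figures from the values printed in the run above.
buflen = 8192            # bytes per write, from "buflen=8192"
nbuf = 1_000_000         # number of writes, from "-n 1000000"
real_seconds = 162.54    # rounded wall-clock time as printed by ttcp

total_bytes = buflen * nbuf                      # bytes sent in total
kb_per_sec = total_bytes / 1024 / real_seconds   # ttcp reports KB as 1024 bytes
calls_per_sec = nbuf / real_seconds              # write calls per second

print(f"{total_bytes} bytes, {kb_per_sec:.2f} KB/sec, {calls_per_sec:.2f} calls/sec")
```

  The byte count matches the reported 8192000000 exactly; the two rates land
  within a fraction of a unit of the reported 49218.69 KB/sec and 6152.34
  calls/sec.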
--
  Manuel Bouyer <bouyer%antioche.eu.org@localhost>
       NetBSD: 26 ans d'experience feront toujours la difference
  --
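
  For readers unfamiliar with the routing messages quoted in the PR, the
  flags field is a bit mask. The sketch below decodes the two values that
  appear in the log (0x40 and 0x2445). The bit values are the ones I
  understand NetBSD's <net/route.h> to define for these flags, and they are
  consistent with the hex values and names printed in the messages; the
  table is deliberately limited to the flags seen here, and the function name
  is made up for illustration.

```python
# Route flag bits, limited to those that appear in the routing messages above.
# Values assumed from NetBSD's <net/route.h>; they sum to the logged 0x2445.
RTF_FLAGS = {
    0x1: "UP",
    0x4: "HOST",
    0x40: "DONE",
    0x400: "LLINFO",
    0x2000: "CLONED",
}


def decode_rtflags(flags):
    """Format a flags word the way the routing messages print it: 0xNN<A,B,...>."""
    names = []
    known = 0
    for bit, name in sorted(RTF_FLAGS.items()):
        if flags & bit:
            names.append(name)
            known |= bit
    leftover = flags & ~known
    if leftover:
        # Bits not in the table above are shown as raw hex rather than guessed.
        names.append(hex(leftover))
    return "0x%x<%s>" % (flags, ",".join(names))


print(decode_rtflags(0x40))     # flags of the RTM_MISS messages
print(decode_rtflags(0x2445))   # flags of the RTM_ADD message
```

  Read this way, the RTM_ADD entry is a cloned host route carrying link-layer
  information, that is, the neighbour cache entry for 10.0.2.16 being
  re-created right after the failed lookups.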


