NetBSD-Bugs archive


Re: kern/57049: large TCP transfers NetBSD-Xen-Guest -> NetBSD-Xen-DOM0 abort with EHOSTDOWN



The following reply was made to PR kern/57049; it has been noted by GNATS.

From: Frank Kardel <kardel%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost,
 gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Cc: 
Subject: Re: kern/57049: large TCP transfers NetBSD-Xen-Guest ->
 NetBSD-Xen-DOM0 abort with EHOSTDOWN
Date: Fri, 7 Oct 2022 16:43:52 +0200

 Hi Manuel,
 
 that is probably because the DOMU is 9.2, which still has the classic ARP
 resolution code. In 9.99.x the ARP resolution code was replaced with code
 derived from neighbour discovery, in nd.c. On Xen I quickly tripped over
 this issue with a 9.99.100 GENERIC guest. It may also happen with true
 network peers, but I was not able to trigger it with one right away.
 
 Frank
 
 
 On 10/07/22 16:35, Manuel Bouyer wrote:
 > The following reply was made to PR kern/57049; it has been noted by GNATS.
 >
 > From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
 > To: gnats-bugs%netbsd.org@localhost
 > Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
 > Subject: Re: kern/57049: large TCP transfers NetBSD-Xen-Guest ->
 >   NetBSD-Xen-DOM0 abort with EHOSTDOWN
 > Date: Fri, 7 Oct 2022 16:30:26 +0200
 >
 >   On Fri, Oct 07, 2022 at 11:55:00AM +0000, kardel%netbsd.org@localhost wrote:
 >   > >Description:
 >   > 	When copying large files (e.g. 20GB) via scp from a Xen guest to a Xen DOM0 the transfers often fail with EHOSTDOWN.
 >   > 	The errorcode comes from sys/net/nd.c:nd_resolve() (nd.c:384)
 >   > 	The error can be replicated with a simple ttcp test - see below:
 >   > 	Also, during the transfer, following routing messages can be observed on the guest:
 >   > got message of size 152 on Fri Oct  7 11:41:36 2022
 >   > RTM_MISS: Lookup failed on this address: len 152, pid 0, seq 0, errno 0, flags: 0x40<DONE>
 >   > locks: 0 inits: 0
 >   > sockaddrs: 0x3<DST,GATEWAY>
 >   >  10.0.2.16 link#1
 >   > got message of size 152 on Fri Oct  7 11:41:39 2022
 >   > RTM_MISS: Lookup failed on this address: len 152, pid 0, seq 0, errno 0, flags: 0x40<DONE>
 >   > locks: 0 inits: 0
 >   > sockaddrs: 0x3<DST,GATEWAY>
 >   >  10.0.2.16 link#1
 >   > got message of size 152 on Fri Oct  7 11:41:39 2022
 >   > RTM_ADD: Add Route: len 152, pid 0, seq 0, errno 0, flags: 0x2445<UP,HOST,DONE,LLINFO,CLONED>
 >   > locks: 0 inits: 0
 >   > sockaddrs: 0x3<DST,GATEWAY>
 >   >  10.0.2.16 aa:bb:cc:dd:ee:ff
 >   >
 >   > >How-To-Repeat:
 >   > 	on DOM0:
 >   > 	Zugspitze# ttcp -s -r
 >   > 	ttcp-r: buflen=8192, nbuf=2048, align=16384/0, port=5001  tcp
 >   > 	ttcp-r: socket
 >   >
 >   > 	on guest:
 >   > 	Guest# ttcp -s -t -n 1000000 zugspitze
 >   > 	ttcp-t: socket
 >   > 	ttcp-t: connect
 >   > 	ttcp-t: IO: Host is down
 >   > 	errno=64
 >   
 >   I can't reproduce this (-HEAD dom0, 9.3 domU):
 >   proto:/home/bouyer>./ttcp -s -t -n 1000000 borneo
 >   ttcp-t: buflen=8192, nbuf=1000000, align=16384/0, port=5001  tcp  -> borneo
 >   ttcp-t: socket
 >   ttcp-t: connect
 >   ttcp-t: 8192000000 bytes in 162.54 real seconds = 49218.69 KB/sec +++
 >   ttcp-t: 1000000 I/O calls, msec/call = 0.17, calls/sec = 6152.34
 >   ttcp-t: 3.0user 80.2sys 2:42real 51% 0i+0d 778maxrss 0+2pf 297382+270931csw
 >   
 >   --
 >   Manuel Bouyer <bouyer%antioche.eu.org@localhost>
 >        NetBSD: 26 ans d'experience feront toujours la difference
 >   --
 >   
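
For readers without ttcp at hand, the transmit side of the test above can be approximated with a short script. This is only a sketch under stated assumptions, not the tool used in the report: `send_bulk` and `describe` are hypothetical helper names, the payload is zero-filled rather than ttcp's `-s` pattern, and the defaults merely mirror the port, buflen and nbuf values shown in the ttcp output. It assumes a receiver (for example `ttcp -s -r`) is already listening on the target.

```python
import errno
import socket


def send_bulk(host, port=5001, buflen=8192, nbuf=2048):
    """Send nbuf buffers of buflen bytes over one TCP connection.

    Returns (bytes_sent, err). err is None on success, otherwise the
    errno of the socket call that failed. On failure bytes_sent only
    counts buffers that sendall() completed, so it is a lower bound.
    """
    payload = bytes(buflen)  # zero-filled buffer (assumption, not ttcp's pattern)
    sent = 0
    try:
        with socket.create_connection((host, port)) as sock:
            for _ in range(nbuf):
                sock.sendall(payload)
                sent += buflen
    except OSError as exc:
        return sent, exc.errno
    return sent, None


def describe(err):
    """Render an errno value symbolically, e.g. 'EHOSTDOWN (errno=64)' on NetBSD."""
    if err is None:
        return "ok"
    return f"{errno.errorcode.get(err, 'unknown')} (errno={err})"
```

Calling `send_bulk("zugspitze", nbuf=1000000)` on the guest and passing the second element of the result to `describe` would show whether the transfer stopped with EHOSTDOWN. The numeric value is platform specific; on NetBSD EHOSTDOWN is 64, which matches the `errno=64` in the report.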
 

