tech-net archive


Re: TCP connection and "Host is down"



On Tue, Sep 12, 2023 at 07:36:02PM -0400, Greg Troxel wrote:
> Manuel Bouyer <bouyer%antioche.eu.org@localhost> writes:
> 
> > I wonder if we need this patch to tcp_output.c (not tested yet)
> >
> > Index: netinet/tcp_output.c
> > ===================================================================
> > RCS file: /cvsroot/src/sys/netinet/tcp_output.c,v
> > retrieving revision 1.218
> > diff -u -p -u -r1.218 tcp_output.c
> > --- netinet/tcp_output.c	4 Nov 2022 09:01:53 -0000	1.218
> > +++ netinet/tcp_output.c	11 Sep 2023 14:20:54 -0000
> > @@ -1612,8 +1612,8 @@ out:
> >  			TCP_STATINC(TCP_STAT_SELFQUENCH);
> >  			tcp_quench(tp->t_inpcb);
> >  			error = 0;
> > -		} else if ((error == EHOSTUNREACH || error == ENETDOWN) &&
> > -		    TCPS_HAVERCVDSYN(tp->t_state)) {
> > +		} else if ((error == EHOSTUNREACH || error == ENETDOWN ||
> > +		    error == EHOSTDOWN) && TCPS_HAVERCVDSYN(tp->t_state)) {
> >  			tp->t_softerror = error;
> >  			error = 0;
> >  		}
> >
> 
> I have never understood why these sorts of errors lead to TCP closures.
> I do expect these errors to lead to backoff.    EHOSTDOWN to me is about
> "arp has failed after timing out", but that happens faster than TCP
> gives up.
> 
> So assuming a system with your patch is ok and fixes your issue, LGTM.

I can easily reproduce the issue in a Xen guest, by shutting down the
bridge for one minute in the dom0. The patch fixes the issue.
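
For readers following the thread, the effect of the tcp_output.c change can
be sketched as a standalone C function. This is only an illustration, not the
kernel code: struct sketch_tcpcb, its have_rcvd_syn field and
sketch_filter_output_error() are hypothetical names standing in for the
kernel's struct tcpcb, TCPS_HAVERCVDSYN(tp->t_state) and the error handling
at the end of tcp_output().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct tcpcb (illustration only). */
struct sketch_tcpcb {
	bool have_rcvd_syn;	/* models TCPS_HAVERCVDSYN(tp->t_state) */
	int t_softerror;	/* last soft error recorded on the connection */
};

/*
 * Models the patched branch: once a SYN has been received, the listed
 * errors are recorded as a soft error and 0 is returned instead of the
 * error. Any other case returns the error unchanged.
 */
static int
sketch_filter_output_error(struct sketch_tcpcb *tp, int error)
{
	if ((error == EHOSTUNREACH || error == ENETDOWN ||
	    error == EHOSTDOWN) && tp->have_rcvd_syn) {
		tp->t_softerror = error;
		return 0;
	}
	return error;
}
```

In this sketch EHOSTDOWN is recorded in t_softerror and masked once a SYN
has been received, the same treatment the unpatched code already gave
EHOSTUNREACH and ENETDOWN. Before a SYN has been received it is still
returned unchanged.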

> 
> > Also, while auditing the code for ENETDOWN or EHOSTUNREACH handling I wonder
> > if we also need this for sctp (but I'm not using sctp ...)
> >
> > Index: netinet/sctp_output.c
> > ===================================================================
> > RCS file: /cvsroot/src/sys/netinet/sctp_output.c,v
> > retrieving revision 1.33
> > diff -u -p -u -r1.33 sctp_output.c
> > --- netinet/sctp_output.c	4 Nov 2022 09:01:53 -0000	1.33
> > +++ netinet/sctp_output.c	11 Sep 2023 14:20:54 -0000
> > @@ -5643,7 +5643,8 @@ sctp_med_chunk_output(struct sctp_inpcb 
> >  							}
> >  							hbflag = 0;
> >  						}
> > -						if (error == EHOSTUNREACH) {
> > +						if (error == EHOSTUNREACH ||
> > +						    error == EHOSTDOWN) {
> >  							/*
> >  							 * Destination went
> >  							 * unreachable during
> > @@ -5921,7 +5922,8 @@ sctp_med_chunk_output(struct sctp_inpcb 
> >  					}
> >  					hbflag = 0;
> >  				}
> > -				if (error == EHOSTUNREACH) {
> > +				if (error == EHOSTUNREACH ||
> > +				    error == EHOSTDOWN) {
> >  					/*
> >  					 * Destination went unreachable during
> >  					 * this send
> 
> looks good.   I bet that it will help 73% of the 0 people using SCTP, so
> assuming it survives a full anita run, go for it.

I didn't notice anything suspect in anita's output (but I didn't see any
sctp-specific tests either).

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

