netbsd-help: Re: SSH and NAT and re-connections.

Subject: Re: SSH and NAT and re-connections.
To: Richard Rauch <rauch@rice.edu>
From: Keith Moore <moore@cs.utk.edu>
List: netbsd-help
Date: 11/15/2002 08:29:23
> math.rice.edu is pretty fixed, no matter what happens to my home machines.
> (^&  Unless they do a major operation on the server, it *will* be the same
> host, every time.

right, it works for lots of specific hosts.  that doesn't mean it works
in general, and that's a problem if you're talking about automatic 
recovery.
 
> Unless ssh stores the username/password pair, this may not be sufficient
> (who's to say that I'm the same *user*?).  

actually the two ssh processes could establish identities (each one generates
a random number or a public-private key pair) and use those to verify
each other after reconnecting.
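
a rough sketch of that idea, in Python rather than anything ssh actually
does: the server hands out a per-session id and random secret over the
original (already encrypted) connection, and a reconnecting client proves
it holds that secret by answering a fresh challenge with an HMAC.  the
names here (SessionTable, open_session, verify_resume, answer_challenge)
are made up for illustration and are not part of any ssh implementation.

```python
import hashlib
import hmac
import secrets


class SessionTable:
    """Server side: remembers the secret agreed on for each session."""

    def __init__(self):
        self._secrets = {}

    def open_session(self):
        # Assumption: both values travel to the client over the original,
        # already authenticated and encrypted connection.
        session_id = secrets.token_hex(8)
        secret = secrets.token_bytes(32)
        self._secrets[session_id] = secret
        return session_id, secret

    def new_challenge(self):
        # A fresh random challenge per reconnect attempt prevents replay
        # of an old response.
        return secrets.token_bytes(16)

    def verify_resume(self, session_id, challenge, response):
        secret = self._secrets.get(session_id)
        if secret is None:
            return False
        expected = hmac.new(secret, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def answer_challenge(secret, challenge):
    """Client side: prove possession of the session secret."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()
```

because each session has its own id and secret, a reconnecting process
can only reattach to the session it originally belonged to, whatever
address it now comes from.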

> I'm not sure that I care that much about validating that it's the same
> machine.  If the communications channel between the two hosts has gone
> dead for, say, 30 seconds, and a new connection (possibly from a different
> address) says, "Hey, I'm really user `rauch', trying to revive my old ssh
> connection; send all traffic to this channel now", why isn't that enough?

say you have multiple ssh processes talking to a machine.  on one end
(or both) the address changes, and the processes try to reconnect.  
which one connects to which?  keep in mind that those processes might 
be in use by shell scripts, rather than having a human on the other end.
even if you've just got two xterm windows open, wouldn't it be confusing
if they switched functions?  You might have been su-ed to root in one of
them and not in the other, for instance.  (I do this all the time)

> Still, for shell processes I could just use GNU screen.  So my attention
> is focused on X.  I'm not sure how the X protocol would complicate life
> (since I've never gone any lower than Xlib programming, and even then have
> only lightly brushed it).  Maybe making X function acceptably through a
> reconnect (without buffering X protocol commands and replies) wouldn't
> work.

I'd really like for X to have an ability like screen has - I'd like
to be able to take a running X app and move it to another display.
But this appears to be hard.  One problem is that the X app has things
like font ids and window ids that will change if you move the display
to another server.  Another problem is that the X app may be relying
on resources (like fonts) that don't even exist on the other server.
It looks like you really need explicit support in the GUI toolkits
to make it work.  Even then, some X apps invoke other X apps which
use other toolkits, so you might need support from all toolkits
to make it work in general.  And probably some support in the X protocol
as well.

> > broken connections not only requires explicit acks at the application
> > level, it also requires that the sending application buffer all data
> > that is sent to its peer until it gets explicit acks for that data.
> > it also has to implement duplicate message suppression at a layer
> > above TCP.
> 
> Why would it have to deal with duplicate messages?  If the TCP connection
> was broken and a new one was established, nothing more will come from the
> old connection, correct?  

There might have been data that was sent by the app on one end that was never
received by the app on the other end.  That data is no longer in the possession
of the sending app.  The sending host's kernel still has it, but it's trying
to send it to the old IP address, and there's no way for the app to recover
it (short of pawing through /dev/kmem).  Even if the app could recover it,
the fact that the sending host hasn't received acknowledgement doesn't
mean that the destination host and process didn't get all or some of the data 
that was queued (the ack could have been dropped).  
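
To make that concrete, here is a minimal Python sketch of what the
application layer would have to do itself: number every message, keep a
copy until the peer *application* acks it, and drop anything already
delivered.  The class and method names are hypothetical, and the sketch
assumes messages arrive in order within one connection (as TCP provides)
and ignores the actual socket I/O.

```python
from collections import deque


class ResumableSender:
    def __init__(self):
        self.next_seq = 0
        # (seq, payload) pairs not yet acked by the peer application.
        self.unacked = deque()

    def send(self, payload):
        msg = (self.next_seq, payload)
        self.unacked.append(msg)  # keep a copy; the kernel's copy is unreachable
        self.next_seq += 1
        return msg

    def ack(self, acked_through):
        # Application-level cumulative ack: forget everything up to this seq.
        while self.unacked and self.unacked[0][0] <= acked_through:
            self.unacked.popleft()

    def resume(self, peer_last_seq):
        # After reconnecting, the peer reports the last seq it delivered;
        # everything after that must be sent again on the new connection.
        self.ack(peer_last_seq)
        return list(self.unacked)


class ResumableReceiver:
    def __init__(self):
        self.last_seq = -1
        self.delivered = []

    def receive(self, msg):
        seq, payload = msg
        if seq <= self.last_seq:
            return False  # duplicate: already delivered before the break
        self.last_seq = seq
        self.delivered.append(payload)
        return True
```

If the sender simply resends everything it still holds after a reconnect,
the receiver's sequence check is what keeps data that did arrive (but
whose ack was lost) from being delivered twice.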

Bottom line: reconnecting and resynchronization of a broken TCP connection
in a general fashion is hard.  Something like Mobile IP that works *under* 
TCP is the simplest solution.

Keith