Subject: Re: VERY slow ssh logins to uVAX
To: NetBSD/vax Discussion List <port-vax@NetBSD.ORG>
From: Klaus Klein <kleink@mibh.de>
List: port-vax
Date: 05/08/2005 18:55:51
Greg A. Woods wrote:
> [ On Friday, May 6, 2005 at 23:10:39 (+0200), Klaus Klein wrote: ]
> > Subject: Re: VERY slow ssh logins to uVAX
> >
> > On Friday, 6. May 2005 02:04, Greg A. Woods wrote:
> > > [ On Thursday, May 5, 2005 at 15:01:40 (-0700), Aaron J. Grier wrote: ]
> > > > which crypto algorithms use floating point?
> > > 
> > > I don't really know for sure (esp. w.r.t. OpenSSL and OpenSSH), other
> > > than what "egrep 'float|double' */*" shows in src/dist/openssl/crypto).
> > > (which seems to be most of them :-)
> > 
> > What you're seeing there are the timing calculations of the
> > algorithms' _benchmark_ modules.
> 
> Yes, that could well be.

Then you should check again, rather than relying on vague,
indiscriminate statements (be they yours or mine).

> In any case the problem is endemic to OpenSSH and it is easily solved by
> using SSH-3.x from pkgsrc/security/ssh2 instead.  :-)
> 
> This thread more or less mirrors one a couple years ago:
> 
> 	http://mail-index.netbsd.org/netbsd-users/2002/03/05/0008.html

The problem is the Diffie-Hellman Group Exchange method as
implemented in OpenSSH.  In an ideal world you could work around
it at run time, either by selecting a modulus size that is both
appropriate and usable, or by simply disabling the method.  Alas,
neither is possible with that implementation; the closest you
can get is to edit /etc/moduli to that effect.  Nowadays, though,
OpenSSH has become a little less conservative than it used to be
in choosing a modulus size estimated to correspond to the
strength of the symmetric cipher that is going to be used for
transport: previously it chose 1024 bits for <128-bit ciphers,
nowadays it does so for <=128-bit ciphers.  This makes quite a
difference with the common aes128-cbc choice.
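
To make the threshold change and the /etc/moduli workaround
concrete, here is a rough Python sketch.  It is not OpenSSH's
code: the function names are made up, the 2048-bit fallback for
stronger ciphers is an assumption, and the filter assumes the
moduli(5) layout of seven whitespace-separated fields with the
size in the fifth.

```python
def estimate_modulus_bits(cipher_bits, inclusive=True):
    """Return the DH modulus size (bits) requested for a cipher strength.

    inclusive=False models the older rule from the text (1024 bits only
    for <128-bit ciphers); inclusive=True models the newer rule (<=128).
    The 2048-bit fallback is an assumption made for this sketch.
    """
    small = cipher_bits <= 128 if inclusive else cipher_bits < 128
    return 1024 if small else 2048


def filter_moduli(lines, max_bits):
    """Keep comments, blank lines and moduli entries up to max_bits.

    Assumes the moduli(5) layout: seven whitespace-separated fields,
    with the prime's size in the fifth field.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            kept.append(line)  # pass comments and blank lines through
            continue
        if len(fields) == 7 and int(fields[4]) <= max_bits:
            kept.append(line)
    return kept
```

With aes128-cbc (128 bits), the first function yields 2048 under
the old rule and 1024 under the new one, which is the difference
that matters on a slow machine.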

> There may indeed also be some way to improve the performance of OpenSSH
> by using better compiler optimization flags, especially on sparcs where
> there seems to be a wider diversity of performance-related CPU features
> over the years.  However I don't think anyone has yet had the patience
> to benchmark the differences, if any, between an ideally optimized build
> of OpenSSH against an identically compiled SSH-3.x.

What you're referring to are, for the most part, the SPARC v8 CPUs,
which implement mul/div/rem in hardware; in v7 these had to be
software-assisted, which obviously had an impact on BigNum
operations (and thus the DH exchange).  Anyway, since 2.0 NetBSD
has been shipping an ld.so.conf that (where applicable) preloads a
shared object which makes use of the hardware.
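
To illustrate why software-assisted multiplication hurts BigNum
code, here is a shift-and-add multiply in Python.  It is only a
sketch of the general technique, not the routine any SPARC
runtime actually uses, and soft_mul32 is a made-up name.

```python
def soft_mul32(a, b):
    """Multiply two unsigned 32-bit values using only shifts and adds.

    Returns (product, steps), where steps counts loop iterations, i.e.
    roughly the work a single hardware multiply instruction replaces.
    """
    mask = 0xFFFFFFFFFFFFFFFF  # the product of two 32-bit values fits in 64 bits
    product = 0
    steps = 0
    while b:
        if b & 1:
            product = (product + a) & mask  # add the shifted multiplicand
        a <<= 1   # next bit position of the multiplicand
        b >>= 1   # consume one bit of the multiplier
        steps += 1
    return product, steps
```

A BigNum multiplication performs many such word multiplies, so
the per-word loop cost adds up across a whole DH exchange.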

> All I know is that 
> I was surprised by the complaints in the above referenced thread until I
> realized that those folks were using OpenSSH and I wasn't seeing
> any similar problems because I was using SSH-3.x  ;-)

SSH<=3.2.9.1 doesn't implement the DH GEX, so that's hardly surprising.
(I haven't checked other SSHv2 implementations recently.)


But to move the discussion back to improving things for slow
systems: Ben Harris recently proposed a new, more CPU-friendly
key exchange method and a pepped-up arcfour transport cipher.
See for yourself at
http://www.ietf.org/internet-drafts/draft-harris-ssh-rsa-kex-01.txt and
http://www.ietf.org/internet-drafts/draft-harris-ssh-arcfour-fixes-02.txt .



- Klaus