Subject: Re: muhah
To: Trevor Johnson <trevor@jpj.net>
From: Alistair Crooks <agc@pkgsrc.org>
List: tech-pkg
Date: 03/26/2001 16:07:09
On Fri, Mar 23, 2001 at 12:17:07PM -0500, Trevor Johnson wrote:
> > > You're right, the copy that comes with NetBSD 1.5 for i386 (base.tgz) is
> > > 231292 bytes.  Still, pkgsrc itself is much larger than that.
> 
> > (and that includes CVS dirs, but excludes Makefile.openssl).  You're
> > seriously telling me I should add well over 10 Megabytes of source to
> > a single package in pkgsrc? OK, so on to executable sizes, which aren't
> > really relevant:
> 
> I understand that OpenSSL, whether as source or binary, is much bigger
> than your digest utility. So are other things--sh, make, tar and pax, awk,
> patch, cc, and so on--which are needed for the use of pkgsrc.  If, for
> example, the user hasn't installed a C compiler (not part of the base
> system on NetBSD, Solaris, or most Linux distributions), pkgsrc doesn't
> bootstrap one into place.  If the user installs the wrong C compiler--a
> buggy, old, or trojaned one (food for thought:
> http://www.acm.org/classics/sep95/)--he loses.

If the user hasn't installed a C compiler, then they're going to have
an interesting but unrewarding time building applications from source
using pkgsrc.
 
> > > > 2.  openssl produces output in a slightly different format from
> > > > md5(1).  I really don't want to have to pre-process everything with
> > > > sed or awk or expr.
> > >
> > > Can't the existing md5 utility still handle the MD5 hashes which were
> > > generated with it?  For the SHA-1 and RIPEMD-160 hashes, would you
> > > consider making the output of your digest utility have the same format as
> > > the output from OpenSSL?
> >
> > It's not a huge problem, it's a minor niggle. Yes, we can massage output
> > with sed, awk or expr. But I don't want to do that.
> 
> You misunderstand.  What I requested is output from "digest sha1 foo" in
> the format that "openssl dgst -sha1 foo" has, and likewise for "digest
> rmd160 foo" to have the same format as "openssl dgst -rmd160".  That way,
> if it ever becomes desirable to use OpenSSL for hashing--for instance, in
> a future world where pre-1999 versions of NetBSD needn't be fully
> supported--such massaging will not be necessary.  I've appended a trivial
> patch which does this.  For the SHA-1 and RIPEMD-160 hashes, the slightly
> different output is unnecessary.  MD5 hashes have already been calculated,
> so I don't propose changing them.

Sorry about the misunderstanding.  But when there are over 3000 files
which have been prepared using a standard NetBSD utility called
md5(1), it's a bit of an onerous job to have to reformat those 3000+
files, just because openssl outputs in a different format.  Yes, we
can massage the output by using sed, awk or expr, but, frankly, I
don't want to.  Have a look at the logic used to identify checksums
and checksum algorithms in bsd.pkg.mk, and you'll see why I don't want
yet another format to parse.  I'm sorry, but I thought I'd put that in
my original mail.
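
To make the difference concrete, here is a minimal sketch in C of
what accepting both layouts involves: "ALG (file) = hex" as printed
by md5(1) and digest, and "ALG(file)= hex" as in the patch quoted at
the end of this mail.  The function name parse_digest_line and its
buffer parameters are hypothetical, made up for illustration; this is
not the logic in bsd.pkg.mk, just a picture of the extra case a second
format brings with it.  Note that it does nothing about the differing
algorithm names (RMD160 vs. RIPEMD160), which would need mapping too.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from bsd.pkg.mk or digest.c.
 * Splits a checksum line into algorithm, file name and hex digest.
 * Accepts "ALG (file) = hex" and "ALG(file)= hex".
 * Returns 1 on success, 0 if the line matches neither layout or a
 * field does not fit in the buffer supplied for it.
 */
static int
parse_digest_line(const char *line, char *alg, size_t algsz,
    char *file, size_t filesz, char *hex, size_t hexsz)
{
	const char *open, *close, *p;
	size_t n;

	if ((open = strchr(line, '(')) == NULL)
		return 0;
	/* use the last ')' so file names containing ')' still work */
	if ((close = strrchr(line, ')')) == NULL || close < open)
		return 0;

	/* algorithm name: text before '(', minus trailing blanks */
	n = (size_t)(open - line);
	while (n > 0 && line[n - 1] == ' ')
		n--;
	if (n == 0 || n >= algsz)
		return 0;
	memcpy(alg, line, n);
	alg[n] = '\0';

	/* file name: text between the parentheses */
	n = (size_t)(close - open - 1);
	if (n == 0 || n >= filesz)
		return 0;
	memcpy(file, open + 1, n);
	file[n] = '\0';

	/* optional blanks, '=', optional blanks, then the digest */
	p = close + 1;
	while (*p == ' ')
		p++;
	if (*p++ != '=')
		return 0;
	while (*p == ' ')
		p++;
	n = strcspn(p, " \r\n");
	if (n == 0 || n >= hexsz)
		return 0;
	memcpy(hex, p, n);
	hex[n] = '\0';
	return 1;
}
```

A caller would hand it one line at a time, with buffers large enough
for the longest expected field (41 bytes is enough for a SHA-1 or
RIPEMD-160 digest in hex plus the terminating NUL).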

> > > > 3.  I want a message digest calculation utility that is small and
> > > > quick, and something that either is present on all Operating Systems
> > > > on which pkgsrc runs, or is buildable on those operating systems with
> > > > minimum fuss. openssl does not really fit the bill here.
> 
> > In the whole scheme of things, though, with all the other processing
> > that is taking place at that time, the speed of the code produced by
> > an optimising compiler vs. hand-tuned assembly code is fairly low on
> > my list of priorities.
> 
> Thanks for dropping the objection.

Au contraire - I have not dropped any objection.

And I do resent such an implication.

My apologies, however, if you're just being disingenuous.

What I DID write, however, was:

> Now I'm not a statistician, but the sample size isn't huge, I don't
> know what was taking place on your machine while you were taking your
> measurements, and, all in all, the results aren't all that conclusive,
> are they? I mean, to take your figures above, digest takes 65% of the
> time of openssl to calculate sha1 digests.
> 
> OTOH, I wouldn't be that surprised if openssl was actually quicker,
> since openssl uses assembly code on the more popular architectures. 
> In the whole scheme of things, though, with all the other processing
> that is taking place at that time, the speed of the code produced by
> an optimising compiler vs. hand-tuned assembly code is fairly low on
> my list of priorities.

Perhaps you'd like to go away and cut down openssl so that it's under
200K in source form, is quick to compile, is statically linked, needs
no patches, provides md5(1)-compatible output, and is distributed
under a BSD licence.  Then we'll consider switching over to using
openssl as part of pkgsrc.

Thanks,
Alistair

> -- 
> Trevor Johnson
> http://jpj.net/~trevor/gpgkey.txt
> 
> --- digest.c.orig	Fri Mar  9 13:24:49 2001
> +++ digest.c	Fri Mar 23 08:04:20 2001
> @@ -95,7 +95,7 @@
>  		if (SHA1File(fn, digest) == NULL) {
>  			return 0;
>  		}
> -		(void) printf("SHA1 (%s) = %s\n", fn, digest);
> +		(void) printf("SHA1(%s)= %s\n", fn, digest);
>  	}
>  	return 1;
>  }
> @@ -119,7 +119,7 @@
>  		if (RMD160File(fn, digest) == NULL) {
>  			return 0;
>  		}
> -		(void) printf("RMD160 (%s) = %s\n", fn, digest);
> +		(void) printf("RIPEMD160(%s)= %s\n", fn, digest);
>  	}
>  	return 1;
>  }
> 
>