tech-pkg archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Rewrite mk/checksum/checksum in awk



* On 2020-08-25 at 07:18 BST, Roland Illig wrote:

> On 21.08.2020 19:41, Jonathan Perkin wrote:
> > Hey all,
> >
> > While working on an upgrade to www/grafana, I needed to use the new
> > go-modules.mk support.  This resulted in over 600 distfiles to verify,
> > and the current shell script cannot keep up, with a quadratic runtime
> > of over 8 minutes for 'bmake checksum'.
> >
> > I took the opportunity to rewrite the script in awk:
> >
> >   https://github.com/joyent/pkgsrc/commit/a040b8962acbab10bdb8bd94eb80129b5dc6279f
> >
> > and now it takes less than 4 seconds.
> >
> > I'm still working through testing this, aiming to support all of the
> > perculiar features of the existing script (who knew "IGNORE" was a
> > supported checksum?) and retain 100% compatibility, but figured I'd
> > throw it out to others for extra testing and review in the meantime.

Thanks for the review.  I've pushed an updated version here:

  https://github.com/joyent/pkgsrc/blob/a3f96d12d7229e021fb174f19b3c1ca7b1f5917d/mk/checksum/checksum.awk

This one passes all the tests I've written (see below).

> I reviewed both programs, and I like it that you replaced "exit 128"
> with "exit 1", since the former could easily translate into an "exit 0".

I've actually fixed that, there were some 128's in my version as I was
trying to retain compatibility, but you're right regarding possible
issues, and so I've now changed them to be 3 to conform to the
expected behaviour of >2 being used for any errors not related to
checksums.

This is the only difference in behaviour that I'm now aware of.

> What you call "a_flag" is not a flag, but the algorithm, so it should be
> called that.  Oh, I see, that would conflict with the "for (algorithm
> in", but I'm sure that can be resolved.

I renamed it to only_alg, which I don't really like either, but I
can't think of a better variable name that is also short and
adequately describes what it is for.

> When you "sub(suffix", you treat the suffix as a regular expression.
> Before, it was treated as a shell pattern. This might make a difference
> for ".tar.gz" and similar suffixes.

This shouldn't be a problem, the only place this is ever used is for
".pkgsrc.resume" in mk/fetch/fetch, but I implemented strip_suffix()
which should do the right thing now anyway.

> In some of the "print self" lines the redirection to stderr is missing.

Fixed, and tidied.

> If you want to write some automatic tests for it, have a look at
> regress/infra-unittests/subst.sh or help.sh.  The test framework used
> there is documented in test.subr.  Using this testing framework, it is
> quite easy to ensure that both programs behave the same.

So I did look to do this, but quickly got frustrated with pkg_regress
(no manual page, doesn't even run with -h if the hardcoded PKGSRCDIR
does not exist, etc.) so gave up and wrote a stupid shell script
instead, which I've attached.

If anyone wants to convert that to whatever the test format du jour is
(I don't really understand regress/) then please feel free, just don't
judge me for the terrible quality of a script that was never meant for
public display ;-)

Thanks,

-- 
Jonathan Perkin  -  Joyent, Inc.  -  www.joyent.com

Attachment: checksum-test.sh
Description: Bourne shell script



Home | Main Index | Thread Index | Old Index