* On 2020-08-25 at 07:18 BST, Roland Illig wrote: > On 21.08.2020 19:41, Jonathan Perkin wrote: > > Hey all, > > > > While working on an upgrade to www/grafana, I needed to use the new > > go-modules.mk support. This resulted in over 600 distfiles to verify, > > and the current shell script cannot keep up, with a quadratic runtime > > of over 8 minutes for 'bmake checksum'. > > > > I took the opportunity to rewrite the script in awk: > > > > https://github.com/joyent/pkgsrc/commit/a040b8962acbab10bdb8bd94eb80129b5dc6279f > > > > and now it takes less than 4 seconds. > > > > I'm still working through testing this, aiming to support all of the > > perculiar features of the existing script (who knew "IGNORE" was a > > supported checksum?) and retain 100% compatibility, but figured I'd > > throw it out to others for extra testing and review in the meantime. Thanks for the review. I've pushed an updated version here: https://github.com/joyent/pkgsrc/blob/a3f96d12d7229e021fb174f19b3c1ca7b1f5917d/mk/checksum/checksum.awk This one passes all the tests I've written (see below). > I reviewed both programs, and I like it that you replaced "exit 128" > with "exit 1", since the former could easily translate into an "exit 0". I've actually fixed that, there were some 128's in my version as I was trying to retain compatibility, but you're right regarding possible issues, and so I've now changed them to be 3 to conform to the expected behaviour of >2 being used for any errors not related to checksums. This is the only difference in behaviour that I'm now aware of. > What you call "a_flag" is not a flag, but the algorithm, so it should be > called that. Oh, I see, that would conflict with the "for (algorithm > in", but I'm sure that can be resolved. I renamed it to only_alg, which I don't really like either, but I can't think of a better variable name that is also short and adequately describes what it is for. > When you "sub(suffix", you treat the suffix as a regular expression. > Before, it was treated as a shell pattern. This might make a difference > for ".tar.gz" and similar suffixes. This shouldn't be a problem, the only place this is ever used is for ".pkgsrc.resume" in mk/fetch/fetch, but I implemented strip_suffix() which should do the right thing now anyway. > In some of the "print self" lines the redirection to stderr is missing. Fixed, and tidied. > If you want to write some automatic tests for it, have a look at > regress/infra-unittests/subst.sh or help.sh. The test framework used > there is documented in test.subr. Using this testing framework, it is > quite easy to ensure that both programs behave the same. So I did look to do this, but quickly got frustrated with pkg_regress (no manual page, doesn't even run with -h if the hardcoded PKGSRCDIR does not exist, etc.) so gave up and wrote a stupid shell script instead, which I've attached. If anyone wants to convert that to whatever the test format du jour is (I don't really understand regress/) then please feel free, just don't judge me for the terrible quality of a script that was never meant for public display ;-) Thanks, -- Jonathan Perkin - Joyent, Inc. - www.joyent.com
Attachment:
checksum-test.sh
Description: Bourne shell script