tech-pkg archive


Re: Extracting versions from pkgsrc tree taking hours - how to address?

> Some questions/thoughts:

> summary information format:
> - If anything does generate summary information for pkgsrc tree is
> there any reason not to use the same pkg_summary format used for
> binary packages

This is not about speeding up fetching PKGNAME, but if you need full
summaries, not just PKGNAME, you can use yet another tool from the
{pkgtools,wip}/pkg_summary-utils collection -- pkg_src_summary. There is
also pkg_bin_summary, built on top of 'pkg_info -XBL'.  Everything in
this package is built on top of the pkg_summary(5) format.
For me it has been the standard for years.
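
To illustrate why one shared format is convenient for tools, here is a
minimal sketch of a pkg_summary(5) reader. It assumes only the basic
layout (VARIABLE=value lines, records separated by blank lines, a
variable may repeat within a record). The function name and the sample
packages are made up for illustration; this is not code from
pkg_summary-utils.

```python
def parse_pkg_summary(text):
    """Split pkg_summary(5)-style text into a list of records.

    Each record is a dict mapping a variable name to the list of its
    values, because variables such as DEPENDS may appear several times.
    """
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first "=" only; values may contain "=" (e.g. ">=").
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a VARIABLE=value line, skip it
        current.setdefault(key, []).append(value)
    if current:
        records.append(current)
    return records


# Hypothetical sample input, not taken from a real pkgsrc tree.
sample = """PKGNAME=foo-1.0
PKGPATH=misc/foo
DEPENDS=bar>=2.0
DEPENDS=baz-[0-9]*

PKGNAME=bar-2.1
PKGPATH=misc/bar
"""

records = parse_pkg_summary(sample)
names = [r["PKGNAME"][0] for r in records]
```

The same reader would work on output for source and binary packages
alike, as long as both follow the format.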

Both pkg_src_summary and pkg_micro_src_summary are able to use several
CPUs or machines in a network (with the help of parallel/paexec).

I wrote pkg_micro_src_summary a few years ago and have not touched it
since then.  So, it is time to revise it, probably rewrite it in C or
invent better heuristics. For now "nih status -as", built on top of
pkg_micro_src_summary, takes 76 secs real time (files were not cached)
for 1019 installed packages on my dual-core Atom-330 with HT enabled,
and 48 secs real time for a subsequent run. Four concurrent processes
were run (sysctl hw.ncpu). A bulk build was running during this test.

> generating summary information:
> - Should we have a specific in tree tool for generating pkgsrc summary
> information, which longerterm lintpkgsrc and other tools should be
> depending upon (even if they only use the output directly and there is
> no cache).

The pkg_summary-utils package was created for exactly this purpose.  nih and
distbb are based on it.  Extra dependencies: "runawk" which is marginal
but stable and way smaller than perl, tiny package "pipestatus" and

From time to time I find and fix some bugs in this package but in
general I'd say it is rather stable. As always it is full of regression

This is one option, though not an "in-tree" solution.

> speeding up parsing:
> - Would it make sense to adjust pkgsrc to help quick mechanical
> parsing, for example if we moved to a default of defining PKGNAME and
> deriving DISTNAME from PKGNAME,

Very good idea! We can start with pkglint, directing developers to
simplify PKGNAME. The bad news is that PKGNAME alone is not enough for
bulk builds. So, it is unlikely that such a trick would significantly
improve anything in pbulk or distbb. To make quarterly updates more
efficient, I'm improving distbb.
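
As a rough sketch of what "quick mechanical parsing" could look like if
PKGNAME were usually a plain literal: read the assignment straight from
the Makefile text and give up whenever make would be needed to expand
it. The function name and sample Makefile fragments are hypothetical,
and the sketch deliberately ignores PKGREVISION, included files and
conditionals, so it is an illustration of the idea, not a replacement
for running make.

```python
import re

# Match a single-line "PKGNAME= value" assignment (tabs or spaces allowed).
PKGNAME_RE = re.compile(r"^PKGNAME[ \t]*=[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def quick_pkgname(makefile_text):
    """Return a literal PKGNAME, or None if make would have to expand it."""
    match = PKGNAME_RE.search(makefile_text)
    if match is None:
        return None  # PKGNAME not set here (e.g. derived from DISTNAME)
    value = match.group(1)
    if "$" in value:
        return None  # contains a variable reference, needs real make
    return value


# Hypothetical Makefile fragments for illustration only.
literal = "DISTNAME=\tfoo-1.0\nPKGNAME=\tpy-foo-1.0\n"
derived = "DISTNAME=\tfoo-1.0\nPKGNAME=\t${DISTNAME:S/^/py-/}\n"
missing = "DISTNAME=\tfoo-1.0\n"

results = [quick_pkgname(literal), quick_pkgname(derived), quick_pkgname(missing)]
```

A tool could take the fast path when a name is returned and fall back
to the slow make-based extraction only for the None cases.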

Best regards, Aleksey Cheusov.
