Re: pkgsrc scanning performance benchmarks
On 12/2/2016 04:12, Joerg Sonnenberger wrote:
On Thu, Dec 01, 2016 at 07:54:27PM -0600, John Marino wrote:
I do not believe the pkgsrc framework is 28 times more complex than the
Ports Collection framework. It's just much more inefficient. I know such
statements rankle some pkgsrc devs, but numbers don't lie.
If you compare apples and oranges, numbers do lie. It might surprise
you, but it is a well-known fact that tree scanning, e.g. as part of
a bulk build, is a very time-consuming component.
This is a direct apples-to-apples comparison. Both trees are given the
same exact task to do. It's on pkgsrc if it uses a poor implementation
to do the same task.
There have been hacks proposed in the past to replace the make extraction,
but none of the proposals actually work properly, because they disable
important functional parts. This *is* a case where pkgsrc is actually
significantly more complex than ports.
Architecturally, there are three bigger parts that slow things down as
far as the scan phase is concerned:
(1) Finding the builtins and computing the resulting versions.
(2) Reducing patterns by merging ranges.
(3) Include recursion via buildlink3.mk.
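To make item (2) concrete, here is a minimal sketch of what "merging ranges" could look like. It is not pkgsrc's implementation: the pattern syntax is reduced to `name>=low<high`, versions are compared as plain dotted integers rather than with pkgsrc's real version rules, and the names `parse_version`, `merge_ranges` and `format_range` are hypothetical.

```python
import re

# Simplified dependency pattern: name, optional ">=low", optional "<high".
# Assumption: real pkgsrc patterns are richer than this.
_PATTERN = re.compile(
    r"^(?P<name>[^<>=]+)(?:>=(?P<low>[0-9.]+))?(?:<(?P<high>[0-9.]+))?$"
)


def parse_version(text):
    """Turn '1.6' into (1, 6) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def merge_ranges(patterns):
    """Collapse several patterns per package into one tightest range.

    Returns {name: (low, high)}, where either bound may be None.
    """
    merged = {}
    for pattern in patterns:
        match = _PATTERN.match(pattern)
        if match is None:
            raise ValueError("unsupported pattern: " + pattern)
        name = match["name"]
        low = parse_version(match["low"]) if match["low"] else None
        high = parse_version(match["high"]) if match["high"] else None
        cur_low, cur_high = merged.get(name, (None, None))
        # Keep the highest lower bound and the lowest upper bound.
        if low is not None and (cur_low is None or low > cur_low):
            cur_low = low
        if high is not None and (cur_high is None or high < cur_high):
            cur_high = high
        merged[name] = (cur_low, cur_high)
    return merged


def format_range(name, low, high):
    """Render a merged range back into the simplified pattern syntax."""
    text = name
    if low is not None:
        text += ">=" + ".".join(str(n) for n in low)
    if high is not None:
        text += "<" + ".".join(str(n) for n in high)
    return text
```

Doing all patterns in one process like this is the "single monolithic program" idea; the cost being discussed comes from spawning an external matcher once per comparison instead.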
I think there are more mundane causes that are compounded as well. But
for the sake of argument, all this means is the architecture is
fundamentally flawed with regard to performance. This has been
demonstrated by ports, namely achieving the same result but a magnitude
The first part could be optimized to avoid needless recomputation for a
bulk build, but it requires figuring out a reliable caching
mechanism and reviewing the side effects of existing builtin.mk files.
AFAIK, Ports doesn't have anything like this or at least only for very
apples-to-apples, right? As you said, FPC has the needed recalc too,
but 27000 times (12,000 times more than pkgsrc).
The second part is done with the help of some external scripts because
doing it in make internally is pretty much impossible. A single
monolithic program would be faster than the repeated pkg_admin pmatch
calls, but I don't think the total time spent on this justifies the
I suspect that pkg_admin (which incidentally severely limits portability
of pkgformat) is one of the prime culprits. Thus this technical
requirement of pkgsrc might be unique and cause what I would call an
unacceptable performance hit.
anyway the solution would be the opposite -- not require an external
program -- because that causes multiple problems in addition to a grand
The last one is far more tricky. The b3.mk includes hit a number of
scalability problems in make; some of them might be fixable in the
implementation, but many are likely unavoidable without actually
introducing e.g. lists into the core language. The history of
mk/bsd.fast.prefs.mk should be illuminating. Ports doesn't have this
problem due to the flat dependencies. There have been discussions about
potential ways to improve the situation. One change in the past was to
improve the include guards as found by cube. See mk/termcap.buildlink3.mk
I've seen a single change remove minutes from a full tree scan. The
numbers I've shown here are improved compared to a full scan with another
method, because the internal make.conf (mk.conf) predefines several
variables to avoid unnecessary spawning (e.g. uname -s). Without those
tricks, the scan would be much longer on both
So yes, it's worth making it a policy to look at each implementation in
terms of performance, revamp the existing ones, and reject slow proposed
ones. Even small ones matter.
I think 5x slower for an architecture you want is reasonable, but 28x
slower is beyond reasonable.
As I said, I hoped these apples-to-apples numbers would finally shed
light on this, but step 1 to sanity is to admit you have a problem.