Re: pkgsrc scanning performance benchmarks

To: Pkgsrc NetBSD <pkgsrc-users%netbsd.org@localhost>
Subject: Re: pkgsrc scanning performance benchmarks
From: John Marino <netbsd%marino.st@localhost>
Date: Fri, 2 Dec 2016 07:50:39 -0600

On 12/2/2016 04:12, Joerg Sonnenberger wrote:

On Thu, Dec 01, 2016 at 07:54:27PM -0600, John Marino wrote:

I do not believe the pkgsrc framework is 28 times more complex than the
Ports Collection framework.  It's just much more inefficient.  I know such
statements rankle some pkgsrc devs, but numbers don't lie.


If you compare apples and oranges, numbers do lie. It might surprise
you, but it is a well-known fact that tree scanning, i.e. as part of
the bulk build, is a very time-consuming component.

This is a direct apples-to-apples comparison. Both trees are given the
exact same task to do. It's on pkgsrc if it uses a poor implementation
to do the same task.

There have been hacks proposed in the past to replace the make extraction,
but none of the proposals actually work properly, because they disable
important functional parts. This *is* a case where pkgsrc is actually
significantly more complex than ports.

Architecturally, there are three bigger parts that slow things down as
far as the scan phase is concerned:
(1) Finding the builtins and computing the resulting versions.
(2) Reducing patterns by merging ranges.
(3) Include recursion via buildlink3.mk.

I think there are more mundane causes that are compounded as well. But
for the sake of argument, all this means is that the architecture is
fundamentally flawed with regard to performance. This has been
demonstrated by ports, namely by achieving the same result an order of
magnitude faster.

The first part could be optimized to avoid needless recomputation for a
bulk build, but it requires figuring out a reliable caching
mechanism and reviewing the side effects of existing builtin.mk files.
AFAIK, Ports doesn't have anything like this or at least only for very
isolated items.

Apples-to-apples, right? As you said, FPC has the needed recalc too,
but 27,000 times (12,000 times more than pkgsrc).

The second part is done with the help of some external scripts because
doing it in make internally is pretty much impossible. A single
monolithic program would be faster than the repeated pkg_admin pmatch
calls, but I don't think the total time spent on this justifies the
cost.
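
As a rough illustration of what "reducing patterns by merging ranges"
means, here is a minimal sketch. It is not how pkg_admin pmatch or the
pkgsrc helper scripts work: versions are assumed to be plain dotted
integers, a range is a hypothetical (lower, upper) pair, and the names
parse_version and merge_ranges are invented for this example.

```python
from typing import Optional, Tuple

# Assumption: versions are plain dotted integers such as "1.2.3".
# Real pkgsrc version comparison handles many more forms.
Version = Tuple[int, ...]
# (lower, upper): lower is inclusive, upper is exclusive, None = unbounded.
Range = Tuple[Optional[Version], Optional[Version]]


def parse_version(text: str) -> Version:
    """Turn '1.2' into (1, 2) so that tuples compare component-wise."""
    return tuple(int(part) for part in text.split("."))


def merge_ranges(a: Range, b: Range) -> Optional[Range]:
    """Intersect two version ranges; return None if nothing satisfies both."""
    lows = [v for v in (a[0], b[0]) if v is not None]
    highs = [v for v in (a[1], b[1]) if v is not None]
    low = max(lows) if lows else None      # tightest lower bound
    high = min(highs) if highs else None   # tightest upper bound
    if low is not None and high is not None and low >= high:
        return None
    return (low, high)


if __name__ == "__main__":
    # Two requirements on the same package: ">=1.0" and ">=1.2<2.0".
    first = (parse_version("1.0"), None)
    second = (parse_version("1.2"), parse_version("2.0"))
    print(merge_ranges(first, second))
```

Doing all merges inside one process like this is the "single monolithic
program" idea; the cost being discussed is spawning an external helper
for each comparison instead.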

I suspect that pkg_admin (which incidentally severely limits portability
of pkgformat) is one of the prime culprits. Thus this technical
requirement of pkgsrc might be unique and cause what I would call an
unacceptable performance hit.

Anyway, the solution would be the opposite -- not requiring an external
program -- because that causes multiple problems in addition to a grand
performance hit.

The last one is far more tricky. The b3.mk includes hit a number of
scalability problems in make; some of them might be fixable in the
implementation, but many are likely unavoidable without actually
introducing e.g. lists into the core language. The history of
mk/bsd.fast.prefs.mk should be illuminating. Ports doesn't have this
problem due to the flat dependencies. There have been discussions about
potential ways to improve the situation. One change in the past was to
improve the include guards as found by cube. See mk/termcap.buildlink3.mk
r1.7.
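
To make the include-recursion cost concrete, here is a toy model, not
a description of how make or any real buildlink3.mk file behaves. It
assumes a hypothetical layered dependency graph in which every file
includes both files of the next layer, and counts how often a file body
is evaluated with and without an include guard. The names layered_graph
and count_body_evaluations are invented for this sketch.

```python
from typing import Dict, List, Set


def layered_graph(depth: int) -> Dict[str, List[str]]:
    """Build a hypothetical graph: each file includes both files one layer down."""
    graph: Dict[str, List[str]] = {"root": ["l0a", "l0b"]}
    for i in range(depth):
        below = [f"l{i + 1}a", f"l{i + 1}b"] if i + 1 < depth else []
        graph[f"l{i}a"] = below
        graph[f"l{i}b"] = below
    return graph


def count_body_evaluations(graph: Dict[str, List[str]], root: str,
                           guarded: bool) -> int:
    """Count how many times a file body is evaluated during recursive inclusion."""
    seen: Set[str] = set()
    evaluations = 0

    def include(name: str) -> None:
        nonlocal evaluations
        if guarded:
            if name in seen:
                return  # guard hit: the body is skipped
            seen.add(name)
        evaluations += 1
        for dep in graph.get(name, []):
            include(dep)

    include(root)
    return evaluations


if __name__ == "__main__":
    g = layered_graph(3)
    print(count_body_evaluations(g, "root", guarded=False))
    print(count_body_evaluations(g, "root", guarded=True))
```

With three layers the unguarded walk evaluates 15 bodies and the
guarded one 7; the unguarded count doubles with every added layer while
the guarded count grows by two. Flat dependencies avoid the recursion
altogether.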

I've seen a single change remove minutes from a full tree scan. The
numbers I've shown here are better than those from a full scan with
another method, because the internal make.conf (mk.conf) predefines
several variables to avoid unnecessary spawning (e.g. uname -s).
Without those tricks, the scan would take much longer on both
package systems.
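
The effect of avoiding repeated spawning can be sketched as follows.
This is only an analogy for predefining variables in mk.conf, not the
mechanism itself: the CommandCache class and its runner parameter are
hypothetical names for this example, and the cache simply remembers the
output of a command the first time it is run.

```python
import subprocess
from typing import Callable, Dict, Optional, Tuple

Runner = Callable[[Tuple[str, ...]], str]


class CommandCache:
    """Run each distinct command once and reuse its output afterwards."""

    def __init__(self, runner: Optional[Runner] = None) -> None:
        # A runner can be injected so the sketch works without spawning.
        self._runner: Runner = runner or self._spawn
        self._results: Dict[Tuple[str, ...], str] = {}
        self.spawn_count = 0

    @staticmethod
    def _spawn(argv: Tuple[str, ...]) -> str:
        done = subprocess.run(list(argv), capture_output=True,
                              text=True, check=True)
        return done.stdout.strip()

    def get(self, *argv: str) -> str:
        if argv not in self._results:
            self.spawn_count += 1  # only the first request pays the cost
            self._results[argv] = self._runner(argv)
        return self._results[argv]


if __name__ == "__main__":
    # Stand-in runner with a made-up answer, so nothing is spawned here.
    cache = CommandCache(runner=lambda argv: "NetBSD")
    for _ in range(27000):
        cache.get("uname", "-s")
    print(cache.spawn_count)
```

However many packages ask for the value, the command runs once; a
predefined variable goes one step further and never runs it at all.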

So yes, it's worth making it policy to look at each implementation in
terms of performance, revamp the existing ones, and reject slow proposed
ones. Even small ones matter.

I think 5x slower for an architecture you want is reasonable, but 28x
slower is beyond reasonable.

As I said, I hoped these apples-to-apples numbers would finally shed
light on this, but step 1 to sanity is to admit you have a problem.


John
