tech-pkg archive

pkg_admin rebuild-tree optimisation



As you may know, I've been working on a large overhaul of pkgin (current status: 2,500x speedup of "pkgin -n upgrade" on my test system[0]).

Part of that work has been to correctly register local dependencies, using a copy of the pkg_admin rebuild-tree code. I overhauled the pkg_admin code a few years ago[1] to significantly improve performance and correctly register +REQUIRED_BY, and as part of my recent work I have found another optimisation.

When looking for reverse dependencies, every installed package is checked against each dependency pattern so that the "best" match can be chosen, even after a match has already been found. This adds considerable cost when a large number of packages are installed. On my test system with 12,663 packages installed:

  $ time pkg_admin rebuild-tree
  real	0m48.345s
  user	0m47.822s
  sys	0m0.455s

Instead of considering all packages for each match, the following diff simply returns the first match for rebuild-tree calls:

  https://gist.github.com/jperkin/a60680d0a60a3db89774cce884f5404a

with the result being a 12x speedup:

  $ time ./pkg_admin rebuild-tree
  real	0m3.788s
  user	0m3.360s
  sys	0m0.420s
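
To illustrate the difference between the two strategies, here is a minimal sketch in Python. It is not the pkg_admin C code: the function names are hypothetical, fnmatch stands in for the real pkgsrc pattern matcher, and a plain string comparison stands in for the real version ordering.

```python
from fnmatch import fnmatchcase

def best_match(pattern, installed):
    """Scan every installed package and keep the highest-sorting match.

    String comparison is only a stand-in for real package ordering.
    """
    best = None
    checked = 0
    for pkg in installed:
        checked += 1
        if fnmatchcase(pkg, pattern) and (best is None or pkg > best):
            best = pkg
    return best, checked

def first_match(pattern, installed):
    """Stop at the first installed package that matches."""
    checked = 0
    for pkg in installed:
        checked += 1
        if fnmatchcase(pkg, pattern):
            return pkg, checked
    return None, checked

# Hypothetical package database: a few real names plus filler entries.
installed = [
    "a2ps-4.14nb8",
    "libao-nas-1.2.0",
    "libao-pulse-1.2.0nb4",
    "mpage-2.5.6",
] + [f"pkg{i}-1.0" for i in range(1000)]

pattern = "libao-[a-z]*-[0-9]*"
best_result = best_match(pattern, installed)    # walks the whole list
first_result = first_match(pattern, installed)  # returns at the first hit
```

In this sketch the full scan always costs one comparison per installed package, while the first-match variant stops as soon as anything matches. Which package counts as "first" depends on iteration order, which is an assumption here.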

What are the implications? Returning the "best" match doesn't really make any sense for local packages, given you cannot have multiple packages installed with the same PKGBASE.

The actual differences on my test system are very few, and only where there are alternate matches. For example, gqmpeg has an alternate match on various mpg123 packages:

  $ pkg_info -qn gqmpeg | grep ^mpg
  mpg123{,-esound,-nas}>=0.59.18

I have both mpg123 and mpg123-nas installed:

  $ pkg_info | egrep '^mpg123(|-nas)-[0-9]'
  mpg123-1.22.4       MPEG layer 1, 2, and 3 audio player
  mpg123-nas-1.22.4   Contains the nas module for mpg123

With the current code, "mpg123-nas" has the +REQUIRED_BY entry for gqmpeg, while in the patched version "mpg123" has it.

In reality both cases are wrong. This is an OR match, and it should be possible to delete either package, as long as the other package is still available to satisfy the DEPENDS. However we do not have a way to express that logic with the current pkgdb.
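
As a rough sketch of the OR logic described above (Python, hypothetical helper names, not anything that exists in pkg_install today): a package is safe to delete only if every dependency pattern can still be satisfied by some other installed package. For simplicity this assumes a single {a,b,c} group per pattern, compares PKGBASE only, and ignores the version constraint.

```python
import re

def expand_alternatives(pattern):
    """Expand a single {a,b,c} group into the list of alternatives.

    'mpg123{,-esound,-nas}' -> ['mpg123', 'mpg123-esound', 'mpg123-nas']
    """
    m = re.search(r"\{([^{}]*)\}", pattern)
    if m is None:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    return [head + alt + tail for alt in m.group(1).split(",")]

def safe_to_delete(pkgbase, installed_bases, depends_bases):
    """True if every dependency still has a match once pkgbase is removed."""
    remaining = set(installed_bases) - {pkgbase}
    return all(
        any(alt in remaining for alt in expand_alternatives(dep))
        for dep in depends_bases
    )

# gqmpeg's dependency with the version constraint stripped (assumption).
deps = ["mpg123{,-esound,-nas}"]
both = {"gqmpeg", "mpg123", "mpg123-nas"}
only_one = {"gqmpeg", "mpg123"}
```

With both mpg123 packages installed, either one can be removed; with only one installed, removing it would leave the dependency unsatisfied.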

The only other differences are due to these matches:

  {a2ps,enscript,mpage}-[0-9]*
  libao-[a-z]*-[0-9]*

where the +REQUIRED_BY entries move from/to:

  a2ps-4.14nb8 -> mpage-2.5.6
  libao-pulse-1.2.0nb4 -> libao-nas-1.2.0

Here, while the older code does pick out the higher versions, again that doesn't make it correct, and in practical terms I don't believe it makes any difference. There's no consideration of automatic packages, for example.

Reviews appreciated, especially if you believe my analysis is incorrect; otherwise I'll commit soon.

Thanks,

[0]: https://federate.me.uk/@jperkin/110827538411579571
[1]: https://gist.github.com/jperkin/98550d5bd07f4179ebfeea825fc3ec20

--
Jonathan Perkin   -   mnx.io   -   pkgsrc.smartos.org
Open Source Complete Cloud   www.tritondatacenter.com
