Subject: package dependency graph analysis
To: None <tech-pkg@netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-pkg
Date: 01/11/2001 10:45:19
So, I had an idea on how to automatically distribute packages across
the CDs of a multi-volume set.  I cooked up a Perl script (available on
request) to find connected subgraphs of packages within the dependency
graph.  Hubert sent me dependency and size data for 1668 packages.
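
For illustration, here is a minimal Python sketch of that kind of
subgraph search.  It is not the Perl script mentioned above; it assumes
the dependency data is already loaded into a dict, treats dependency
edges as undirected, and uses made-up package names (pkg-a, lib-b, ...).

```python
from collections import defaultdict


def connected_subgraphs(deps):
    """Group packages into connected subgraphs.

    deps maps a package name to the list of packages it depends on.
    Edges are treated as undirected: two packages land in the same
    group if any chain of dependencies links them in either direction.
    """
    neighbours = defaultdict(set)
    for pkg, reqs in deps.items():
        neighbours[pkg]  # make sure packages with no dependencies appear
        for req in reqs:
            neighbours[pkg].add(req)
            neighbours[req].add(pkg)

    seen = set()
    groups = []
    for start in list(neighbours):
        if start in seen:
            continue
        # Iterative depth-first walk from an unvisited package.
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            pkg = stack.pop()
            group.append(pkg)
            for other in neighbours[pkg]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        groups.append(sorted(group))

    # Largest subgraph first.
    groups.sort(key=len, reverse=True)
    return groups


# Hypothetical toy data, not the real package set.
toy_deps = {
    "pkg-a": ["lib-b"],
    "lib-b": [],
    "pkg-c": ["lib-b"],
    "pkg-d": [],
    "pkg-e": ["lib-f"],
}

for group in connected_subgraphs(toy_deps):
    print(len(group), " ".join(group))
```

Sorting the groups by size puts any hairball at the top of the output,
with the ones and twos trailing behind it.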

It turns out that there are 607 distinct subgraphs.

Unfortunately, the largest subgraph contains 956 packages (!!), totalling
827598k, which is more than will fit on a single CD.  Aside from this
hairball, there are a few smaller blobs (sizes 11, 16, 31, and 9); the
rest are mostly ones and twos.

Fortunately, there are a few very common dependencies (sizes in 1k blocks):

perl-base-5.6.0.tgz: size 4942 required by 261
p5-Data-Dumper-2.101.tgz: size 30 required by 257
p5-Devel-Peek-1.0001.tgz: size 12 required by 257
p5-Devel-DProf-19990108.tgz: size 22 required by 257
p5-CGI-2.74.tgz: size 216 required by 257
perl-5.6.0nb3.tgz: size 2 required by 256
gettext-lib-0.10.35nb1.tgz: size 18 required by 249
png-1.0.8.tgz: size 198 required by 228
xpm-3.4k.tgz: size 76 required by 221
jpeg-6b.tgz: size 194 required by 206
tiff-3.5.5.tgz: size 740 required by 158
pth-1.3.7.tgz: size 266 required by 154
glib-1.2.8.tgz: size 166 required by 148
gtk+-1.2.8.tgz: size 1478 required by 141
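
A rough Python sketch of how a "required by" ranking like the one above
could be produced.  The post does not say whether the counts are direct
or transitive; this sketch assumes transitive (everything that pulls the
package in, directly or indirectly).  The package names and sizes are
hypothetical.

```python
from collections import defaultdict


def dependents_count(deps):
    """Return {package: number of packages that need it, transitively}.

    deps maps a package name to the list of packages it depends on.
    """
    # Reverse the edges: for each package, who depends on it directly?
    reverse = defaultdict(set)
    for pkg, reqs in deps.items():
        for req in reqs:
            reverse[req].add(pkg)

    counts = {}
    for pkg in set(deps) | set(reverse):
        seen = set()
        stack = [pkg]
        while stack:
            cur = stack.pop()
            for user in reverse.get(cur, ()):
                if user not in seen:
                    seen.add(user)
                    stack.append(user)
        seen.discard(pkg)  # do not count a package as its own dependent
        counts[pkg] = len(seen)
    return counts


# Hypothetical toy data, not the real package set.
toy_deps = {
    "app-one": ["lang"],
    "app-two": ["lang"],
    "lang": ["lang-base"],
    "lang-base": [],
}
toy_sizes = {"app-one": 100, "app-two": 250, "lang": 2, "lang-base": 4000}

counts = dependents_count(toy_deps)
# Most widely required first; ties broken by name for stable output.
for pkg in sorted(counts, key=lambda p: (-counts[p], p)):
    if counts[pkg]:
        print(f"{pkg}: size {toy_sizes[pkg]} required by {counts[pkg]}")
```

Packages near the top of such a ranking are the candidates for
duplicating onto every volume.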

Trimming those out (i.e., equivalent to putting them on all volumes)
chips a few things off the hairball, reducing it to 719 packages
totalling 742036k, at a cost of 8360k of duplicated packages on each
volume.
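
A minimal Python sketch of the trimming step, under the same
assumptions as before (undirected edges, dependency data in a dict).
The sizes are copied from the list of 14 common dependencies above; the
toy graph and its package names are hypothetical.

```python
# Sizes (in k) of the 14 common dependencies listed above.
COMMON_SIZES_K = {
    "perl-base-5.6.0": 4942, "p5-Data-Dumper-2.101": 30,
    "p5-Devel-Peek-1.0001": 12, "p5-Devel-DProf-19990108": 22,
    "p5-CGI-2.74": 216, "perl-5.6.0nb3": 2,
    "gettext-lib-0.10.35nb1": 18, "png-1.0.8": 198, "xpm-3.4k": 76,
    "jpeg-6b": 194, "tiff-3.5.5": 740, "pth-1.3.7": 266,
    "glib-1.2.8": 166, "gtk+-1.2.8": 1478,
}


def trim(deps, common):
    """Drop the common packages from the graph, nodes and edges alike."""
    return {
        pkg: [req for req in reqs if req not in common]
        for pkg, reqs in deps.items()
        if pkg not in common
    }


def components(deps):
    """Connected subgraphs of deps, treating edges as undirected."""
    adj = {pkg: set() for pkg in deps}
    for pkg, reqs in deps.items():
        for req in reqs:
            adj[pkg].add(req)
            adj.setdefault(req, set()).add(pkg)
    seen, out = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            cur = stack.pop()
            comp.add(cur)
            for other in adj[cur]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        out.append(comp)
    return out


# Hypothetical toy graph: one hub that glues everything together.
toy_deps = {
    "hub": [],
    "pkg-a": ["hub"],
    "pkg-b": ["hub"],
    "pkg-c": ["hub", "pkg-b"],
}

before = max(len(c) for c in components(toy_deps))
after = max(len(c) for c in components(trim(toy_deps, {"hub"})))
print("largest subgraph before trimming:", before)
print("largest subgraph after trimming:", after)
print("duplicated per volume:", sum(COMMON_SIZES_K.values()), "k")
```

The last line sums the listed sizes, which is where the 8360k figure
comes from.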

					- Bill