Subject: Re: Few thoughts about pkgsrc and low disk space computers
To: Michal Pasternak <michal@pasternak.w.lub.pl>
From: Robert Elz <kre@munnari.OZ.AU>
List: tech-pkg
Date: 08/15/2003 21:08:32
    Date:        Fri, 15 Aug 2003 02:55:39 +0200
    From:        Michal Pasternak <michal@pasternak.w.lub.pl>
    Message-ID:  <20030815005539.GA26411@pasternak.w.lub.pl>

  | 2. Why the dependencies are checked _after_ the package has been extracted?
  |    Wouldn't it be a bit better to first check if the dependencies are
  |    available, if not - build them first, then clean them, then extract the
  |    ,,main'' package and return to it's build?

I understand your problem, but I'd actually prefer it if pkgsrc
moved exactly the opposite direction, and became much more
discerning about what dependencies are actually needed for,
and built them only if really needed for what is about to be done.

Maybe I'm unusual, but the request you've made is exactly what I
don't want - I want to be able to extract a package, and look at it,
perhaps "make patch" as well, without necessarily ever having any
intention to build it (it needs to be extracted to get to see the
doc in most cases, and to look at the code and see if it does what is
needed, and looks to be reasonably well written, the DESCR file can
only go so far, to ask more would be unreasonable).

I certainly don't want the system (pkgsrc) racing about building and
installing lots of dependencies for something that I end up never bothering
to actually install, but for which I did do "make extract".

At the minute we have just DEPENDS and BUILD_DEPENDS - the latter for
stuff needed to get the package to the state of being installed, and
the former for everything that needs to remain for the package to
actually work.

BUILD_DEPENDS are what Quentin Garnier was referring to, I think, when
he mentioned dependencies being built before the extraction, which is
sometimes necessary, as in his example, for the extract to happen at all.
In that particular case, obviously what was really needed was an
EXTRACT_DEPENDS rather than a BUILD_DEPENDS, but pkgsrc doesn't have that
(yet...).

There is at least one package (perhaps a couple), I don't recall which,
that needs perl to be installed in order to "make checksum" (and in
an environment where the distfile already exists, so nothing needs to
be fetched).   That's absurd...   But perl is probably required to actually
build the package, and then not afterwards, so it must be a BUILD_DEPENDS,
and because BUILD_DEPENDS currently covers what should be EXTRACT_DEPENDS
and FETCH_DEPENDS, then perl obviously has to be built, for no useful
purpose, just so digest can check the checksum and confirm that all is OK.
(I do "make checksum" of everything almost daily - building perl makes
that take considerably longer!   The install fails anyway, as none of this
is being done as root.)

I have lots of other cruft than perl built for just this reason (I
managed to wipe out my /var/db/pkg so pkgsrc thinks nothing is installed,
so goes and does stuff like this - that's fine, if there was a point).
As long as I don't clean up the work dirs, there is no real issue,
but every time I discard all that trash, the next "make checksum" takes
hours longer than it should.

I'd like to see pkgsrc grow to have

FETCH_DEPENDS		for stuff like urlget wget ... that are needed to
			fetch the distfiles, but which would be built only
			if the distfile doesn't already exist

EXTRACT_DEPENDS		for stuff like unzip, bzip2, ... that is needed to
			unpack the distfile that gets fetched

CONFIG_DEPENDS		for stuff like autoconf, automake, ... in the cases
			that they are actually needed to configure the
			package

BUILD_DEPENDS		for things that are needed to get the package from
			configured to compiled (gmake, perl - I suppose,
			m4 perhaps, exotic lang compiler of choice...)

INSTALL_DEPENDS		for things which are needed to get the package
			installed (not sure there are any of those on
			NetBSD at the minute, but other systems might)

RUN_DEPENDS		for those things that need to remain installed for
			the package to be able to usefully do any work.

In the (fairly common) case where something is both a BUILD_DEPENDS and
a RUN_DEPENDS, it should be listed in both - that is common as usually
header files related to a library are needed for the build to be able
to occur, as is the library itself, and the library then needs to remain
so the program can actually execute.   Only RUN_DEPENDS goes in a binary
package depends list.
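
As a rough illustration of that last rule, here is a minimal Python sketch
(not pkgsrc code). The package table and the dependency names in it are
made up; only the class names come from the list above.

```python
# Hypothetical package using the dependency classes proposed above.
# Dependency names are illustrative only, not taken from a real package.
PACKAGE = {
    "FETCH_DEPENDS": ["wget"],
    "EXTRACT_DEPENDS": ["unzip"],
    "CONFIG_DEPENDS": ["autoconf"],
    "BUILD_DEPENDS": ["gmake", "libfoo"],  # headers and library needed to build
    "INSTALL_DEPENDS": [],
    "RUN_DEPENDS": ["libfoo"],             # library must remain to execute
}


def binary_package_depends(pkg):
    """Return the depends list for a binary package: RUN_DEPENDS only."""
    return list(pkg.get("RUN_DEPENDS", []))


def build_time_only(pkg):
    """Return dependencies listed in some other class but not in RUN_DEPENDS."""
    run = set(pkg.get("RUN_DEPENDS", []))
    result = []
    for cls, deps in pkg.items():
        if cls == "RUN_DEPENDS":
            continue
        for dep in deps:
            if dep not in run and dep not in result:
                result.append(dep)
    return result


print(binary_package_depends(PACKAGE))
print(build_time_only(PACKAGE))
```

Here libfoo appears in both BUILD_DEPENDS and RUN_DEPENDS, and it is the
only entry that ends up in the binary package depends list.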

(There could also be a CKSUM_DEPENDS, but as the only thing it will ever
have in it is digest, that seems like it would be pushing a bit too far).

Nothing should ever get built unless it is really needed (or is likely to
be needed - I don't think we need go as far as separate FETCH_DEPENDS for
each distfile for example - if all distfiles exist, no FETCH_DEPENDS
need to be installed, if any doesn't exist, then go ahead and install
the FETCH_DEPENDS anyway, even if the particular missing distfile doesn't
require it - this is too rare a case to be worth worrying about).

That is, defer all depends installations until the last possible second.
Do nothing unless it is needed.
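
To make the deferral concrete, a small Python sketch of the idea follows.
It is only a model of the proposal, not how pkgsrc works: the phase table,
the function name depends_to_install and the example package are all
assumptions, and attaching RUN_DEPENDS to the install phase is my guess at
"the last possible second" for those.

```python
# Phases in order, each with the (proposed, hypothetical) dependency
# classes it needs.  "checksum" needs only digest, so it has no class.
# Assumption: RUN_DEPENDS are installed at the install phase.
PHASES = [
    ("fetch", ["FETCH_DEPENDS"]),
    ("checksum", []),
    ("extract", ["EXTRACT_DEPENDS"]),
    ("configure", ["CONFIG_DEPENDS"]),
    ("build", ["BUILD_DEPENDS"]),
    ("install", ["INSTALL_DEPENDS", "RUN_DEPENDS"]),
]

# Made-up package, for illustration only.
EXAMPLE = {
    "FETCH_DEPENDS": ["wget"],
    "EXTRACT_DEPENDS": ["unzip"],
    "BUILD_DEPENDS": ["perl"],
    "RUN_DEPENDS": ["libfoo"],
}


def depends_to_install(pkg, target, distfiles_present):
    """List the dependencies needed to reach `target`, and nothing more.

    FETCH_DEPENDS are skipped entirely when all distfiles already exist.
    """
    needed = []
    for phase, classes in PHASES:
        for cls in classes:
            if cls == "FETCH_DEPENDS" and distfiles_present:
                continue
            for dep in pkg.get(cls, []):
                if dep not in needed:
                    needed.append(dep)
        if phase == target:
            return needed
    raise ValueError("unknown target: " + target)


print(depends_to_install(EXAMPLE, "checksum", distfiles_present=True))
print(depends_to_install(EXAMPLE, "extract", distfiles_present=True))
print(depends_to_install(EXAMPLE, "build", distfiles_present=False))
```

With the distfile already present, "checksum" asks for nothing at all, so
perl is never built just to verify a checksum. The sketch ignores phases
that have already been completed, which a real implementation would also
want to skip.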

kre