Re: make replace

To: tech-pkg%NetBSD.org@localhost
Subject: Re: make replace
From: Joerg Sonnenberger <joerg%britannica.bec.de@localhost>
Date: Tue, 6 Jul 2010 02:20:57 +0200
On Mon, Jul 05, 2010 at 03:23:42PM +0200, Dieter Baron wrote:
>   The fundamental assumption of make replace and rolling replace is,
> that temporary inconsistencies in the installed package tree are okay:
> The user knows what he's doing, the problematic parts of the tree are
> marked and there is a (semi)automated way to repair the tree.

Is it so? I tried finding some discussion on the goals at the time of
Al's original commit of make replace (rev 1.939 of mk/bsd.pkg.mk in
March 2002), but I can't find any. So at least for "make replace" I do
not agree that this is the goal of the target, and my reading of the
description of the target in pkgsrc.txt makes it sound more like "it
should work better".

>   I hope we all agree that creating inconsistencies unknowingly and
> without warnings (and markings) from the tools should not be allowed
> by the tools.

Yes.

>   So, as the tools are improved to catch more and more cases that lead
> to inconsistent trees, make replace will run into more and more
> problems.

Yes.

>   For most packages, replacing them won't cause inconsistencies, even
> a minor update to a shared library (e.g. glib2 2.20.0 to 2.20.1).

If the minor update doesn't result in ABI incompatible changes (with or
without major version bump). OpenSSL has a history of doing bad things.

The essential point is that there are safe updates that don't/can't
break other packages and updates that are not safe. Updates can
therefore be categorized as:

1. Updates that are safe. Ignoring unexpected functional regressions
("bugs"), there is no good reason to assume that the update can't be
installed without breaking the system.

2. Updates that may or may not work, depending on whether the dependent
packages use functionality that was changed in an incompatible way.

3. Updates that are known to not work, i.e. that are pretty much
guaranteed to break dependent packages.

Updates using "pkg_add -u" generally fall into category (1) or (2). They
are not exclusively in category (1) due to the open-ended dependencies
used for libraries. Cases of (3), which "pkg_add -u" cannot perform at
all, include overlapping file lists due to files moving between
packages and major/minor updates of PostgreSQL.

Updates using "make replace" with USE_DESTDIR=no fall into all three
categories. With USE_DESTDIR=yes, it is equivalent to "pkg_add -u".

Updates using "make update" (without DEPENDS_TARGET=bin-install) are
consistently in category (1). The two major issues with "make update"
from my perspective are:

(a) It creates a large downtime for the common case of safe updates.

(b) It is too easy to lose state in case of build failures or
interruption. By state I mean the question of what was installed and
where it should be built from.

Before looking at the "expected" / "accepted" bugs created by
"make replace", I would like to stay at this point and discuss the topic
I couldn't find a reference for. What are the essential properties of an
update procedure and when are they violated?

From the perspective of a system developer and from the perspective of
an administrator, I consider the most important property to be not
ending up in a known-to-be-inconsistent state. The second most important
property is to make updates as timely as possible. It is easy to
schedule a small downtime window for updates if you know that they
won't break things. It is a lot harder to do that if the duration is
unpredictable. Similar issues apply from the end user perspective. "Oh,
there is a new Mercurial version? Let's try it." Works nicely if you
know that you can still work in the meantime. "Oh, there is a new png
version? Let's try it." -- not good during a work day if that resulted
in a major version bump.

The update process under this assumption can be summarized as:

(1) Be incremental if possible. Incremental updates are generally easier
to test, faster to build, faster to install.

(2) If incremental updates are not possible, back out as few packages as
possible to make the update possible and restore the original situation
as best as possible.

(3) If restoring the original situation is not possible, make it
possible to revert everything, file a PR and try again later.
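
To make the order of the three steps concrete, here is a rough sketch in
Python. It is only an illustration, not how any pkgsrc tool behaves:
plan_update, the two try_* callables and the outcome labels are all
hypothetical names, and growing the backed-out set one package at a time
is just the simplest way to approximate "as few packages as possible".

```python
from typing import Callable, List, Tuple

def plan_update(
    pkg: str,
    dependents: List[str],
    try_incremental: Callable[[str], bool],
    try_with_backout: Callable[[str, List[str]], bool],
) -> Tuple[str, List[str]]:
    """Return (outcome, packages backed out) for one update attempt.

    All names here are hypothetical; the callables stand in for whatever
    actually performs the update and report success or failure.
    """
    # (1) Prefer the incremental update: nothing else is touched.
    if try_incremental(pkg):
        return ("incremental", [])
    # (2) Otherwise back out dependents, starting with as few as possible
    #     and growing the set until the update goes through.
    for n in range(1, len(dependents) + 1):
        backed_out = dependents[:n]
        if try_with_backout(pkg, backed_out):
            return ("backed-out", backed_out)
    # (3) Nothing worked: the caller reverts everything and files a PR.
    return ("reverted", [])
```

The point of returning the backed-out list is that step (2) has to
remember what it removed in order to restore it afterwards.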

As already noted, this is not the status quo for pkgsrc at the moment,
and there are a few things to work on to make it possible. I'm not sure
this is implemented very well for update-from-source systems in general;
(2) in particular has a huge span of possible improvements.

Some aspects to consider are:
(a) The ability to revert updates is relatively easy to implement;
pkg_tarup is a good starting place.

(b) The major issue with update category (2) is the open-ended
dependencies in buildlink3.mk. The solutions are known, they just have
to be implemented. Note that this makes the current situation with
incremental updates worse by moving updates of the major version bump
style from category (2) to (3).
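
The effect of closing an open-ended dependency can be shown with a small
Python sketch. It assumes a dependency is just a lower bound plus an
optional upper bound on a plain dotted version; the function names and
the version numbers are made up for illustration, and real pkgsrc
version matching handles more forms than this.

```python
from typing import Optional, Tuple

def parse_version(v: str) -> Tuple[int, ...]:
    """Turn '1.2.44' into (1, 2, 44); only plain dotted numbers are handled."""
    return tuple(int(part) for part in v.split("."))

def satisfies(version: str, lower: str, upper: Optional[str] = None) -> bool:
    """True if lower <= version, and version < upper when a bound is given.

    Hypothetical helper: upper=None models an open-ended dependency.
    """
    v = parse_version(version)
    if v < parse_version(lower):
        return False
    return upper is None or v < parse_version(upper)

# Open-ended: a new major version still "satisfies" the dependency,
# so the update is attempted and may or may not work (category 2).
open_ended = satisfies("1.4.2", "1.2.4")

# Bounded: the same update is now known not to fit (category 3).
bounded = satisfies("1.4.2", "1.2.4", "1.4")
```

With the upper bound in place the tools can refuse the update up front
instead of installing something that breaks dependents later.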

(c) Subpackages can make (2) less painful on common platforms like ELF
systems. The idea mentioned by the OpenBSD folks at pkgsrcCon to
forcefully override the major version created by libtool in combination
with subpackages can dramatically reduce the impact of major version
bumps. Basically, next to the PKGREVISION a package could have something
like MAJOR_VERSION_OFFSET, that gets added to all major versions of
libraries in that package. This offset is monotonically increasing and
if shared libs (libfoo.so.*, not libfoo.la and not libfoo.a) are in a
separate subpackage, it would allow removing the development files etc.
of png and installing the new files, while all existing packages would
still work. It doesn't help in all cases, but it can make the process a
lot less painful.
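
A rough Python sketch of the offset idea, under the assumption that the
offset is simply added to each library's major number when the file name
is formed. MAJOR_VERSION_OFFSET does not exist in pkgsrc; the helper
names and the major numbers below are hypothetical.

```python
from typing import Dict, Set

def shlib_names(libs: Dict[str, int], major_version_offset: int) -> Set[str]:
    """File names of the shared libraries after applying the package offset.

    libs maps a library name to the major version libtool would pick.
    """
    return {
        f"lib{name}.so.{major + major_version_offset}"
        for name, major in libs.items()
    }

def can_coexist(old: Set[str], new: Set[str]) -> bool:
    """Two shared-library subpackages can be installed side by side
    only if their file lists do not overlap."""
    return old.isdisjoint(new)

installed = shlib_names({"png": 3}, 0)   # {'libpng.so.3'}
same_name = shlib_names({"png": 3}, 0)   # incompatible rebuild, same name
bumped = shlib_names({"png": 3}, 1)      # offset bumped: {'libpng.so.4'}
```

Here can_coexist(installed, bumped) holds while
can_coexist(installed, same_name) does not, which is why the old shared
library subpackage can stay installed for existing binaries.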

(d) The issue of Perl minor updates could be addressed like Python
packages are dealt with -- ensuring that the packages are in separate
subtrees and that the package names don't overlap. The same applies to
PHP, Apache and maybe other cases I have forgotten (TCL?).

>   Some examples where using 'make replace' will lead to inconsistent
> trees, as food for discussion:
> 
>   1. Packages depending on particular versions of other packages for
>      good reason which the updated package doesn't fulfill, e.g. perl
>      modules using a different path in 5.10 than in 5.8, so they
>      definitely won't work.  (Perl scripts, on the other hand, usually
>      shouldn't include perl's buildlink3.mk and thus not inherit the
>      upper bound.)

Mentioned above under (d).

>   2. Files moving between packages, e.g. tex-foo being split off from
>      teTeX; here tex-foo needs to be installed as a dependency of an
>      updated teTeX package, but would overwrite files of the existing
>      teTeX package; or header files + libraries moving from gtk2 to
>      glib2, where a newer glib2 would overwrite files from the old,
>      still installed gtk2.

Broken for "make replace" in both cases. If the content of the file
didn't change, "make replace USE_DESTDIR=no" will result in the file
missing, I think. "make replace USE_DESTDIR=yes" will complain.

Idea (c) above can hopefully help to reduce the pain of this case.

>   3. A major version update of a shared library.  This does not cause
>      an inconsistency yet (just broken dependencies), but will once we
>      solve the open-ended dependency problem.

The more important problem is that due to the hierarchical namespace ELF
has, this can create very strange bugs that are not directly visible
from the binaries. Consider libfoo, which links against libbar, and
re-exports a symbol that has changed in size or type in the last libbar
update. The program foobar links against libfoo, but not against libbar,
but uses this symbol. With the way pkg_rolling_replace works, foobar is
essentially broken until the time it is replaced. At the very least, the
code path that is using the symbol is. It doesn't downright fail to
start or anything obvious like that. The NetBSD SA list has a very
similar example of problems in this area with potential security
implications. This is a critical edge case for the dependency system
too.
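
The libfoo/libbar/foobar case can be made concrete with a small Python
sketch. Looking only at direct link dependencies on libbar finds libfoo;
following the chain also finds foobar. The function and the dictionary
layout are hypothetical and say nothing about how pkg_rolling_replace
computes its work list.

```python
from collections import deque
from typing import Dict, List, Set

def affected_by(updated: str, links_against: Dict[str, List[str]]) -> Set[str]:
    """All packages that reach `updated` through any chain of link
    dependencies (hypothetical helper for illustration)."""
    # Invert the edges: library -> packages that link against it directly.
    users: Dict[str, List[str]] = {}
    for pkg, libs in links_against.items():
        for lib in libs:
            users.setdefault(lib, []).append(pkg)
    # Breadth-first walk over the inverted graph.
    seen: Set[str] = set()
    queue = deque([updated])
    while queue:
        for user in users.get(queue.popleft(), []):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen

# foobar links against libfoo only; libfoo links against libbar.
links = {"libfoo": ["libbar"], "foobar": ["libfoo"]}
direct = set(pkg for pkg, libs in links.items() if "libbar" in libs)
transitive = affected_by("libbar", links)
```

The direct set contains only libfoo, while the transitive set contains
both libfoo and foobar, which is the package that silently misbehaves.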

Joerg