
Re: make replace



Joerg Sonnenberger <joerg%britannica.bec.de@localhost> writes:

> On Mon, Jul 05, 2010 at 03:23:42PM +0200, Dieter Baron wrote:
>>   The fundamental assumption of make replace and rolling replace is
>> that temporary inconsistencies in the installed package tree are okay:
>> The user knows what he's doing, the problematic parts of the tree are
>> marked and there is a (semi)automated way to repair the tree.
>
> Is it so? I tried finding some discussion on the goals at the time of
> Al's original commit of make replace (rev 1.939 of mk/bsd.pkg.mk in
> March 2002), but I can't find any. So at least for "make replace" I do
> not agree that it is the goal of the target, and my reading of the
> description of the target in pkgsrc.txt makes it sound more like "it
> should work better".

The original make replace replaced the bits and repointed the
dependencies.  Obviously the intent was to allow a clueful user to
replace a package in-place without removing and reinstalling all
depending packages.

Around 2005 I added the mechanism to record unsafe_depends and had Nick
(also at BBN) write pkg_rolling-replace.
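
For concreteness, the intended workflow looks roughly like this (a
sketch; flags as I remember them, see pkg_rolling-replace(8)):

  # replace one package in place; depending packages keep running
  (cd /usr/pkgsrc/devel/glib2 && make replace)
  # depending packages are now marked unsafe_depends=YES; rebuild
  # them in dependency order
  pkg_rolling-replace -uv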

>>   I hope we all agree that creating inconsistencies unknowingly and
>> without warnings (and markings) from the tools should not be allowed
>> by the tools.
>
> Yes.

I think it's quite clear that we don't all agree with what you're trying
to say, and it all depends on how you define things.  So here's what I
agree with, and I think all of this is widely agreed on:

  pkg_* tools, when not given flags indicating permission to deviate
  from the base rules, should maintain all invariants.

  There's a cross-invocation invariant, which is "packages with replaced
  dependencies have the unsafe_depends tag" (see the sketch after this
  list).

  This is Unix.  If the user really wants to override things, there's -f.
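
As a concrete illustration of that cross-invocation invariant (a
sketch; I'm assuming the pkg_admin set / pkg_info -Q interface of
current pkg_install):

  # make replace marks each depending package, roughly:
  pkg_admin set unsafe_depends=YES <depending-pkg>
  # and pkg_rolling-replace later finds the marked packages with:
  pkg_info -Q unsafe_depends <depending-pkg>   # prints YES if marked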

What seems to follow to many of us but is in dispute is:

  If -f overrides N checks, it would be better to additionally have
  separate override flags per check, so that people can choose to use
  the smallest hammer that works.  This is true regardless of why any
  particular override is desired.

  The "make replace/unsafe_depends" invariant should be the default
  behavior of "make replace".  pkg_add -U is too strict about exact
  dependencies, and thus the -D flag makes sense.
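
In practice that combination would look like this (a sketch, assuming
-D has the meaning described above, i.e. it overrides only the
recorded exact-dependency check):

  # update in place from a binary package, using the smallest hammer:
  # -U replaces the installed version, -D relaxes only the dependency
  # check rather than everything -f would override
  pkg_add -DU /path/to/packages/glib2-2.20.1.tgz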

(BTW, it would be nice to have a design document stating what the
invariants are; right now we have only an operational
not-really-a-definition, namely whatever the tools happen to check.)

>>   So, as the tools are improved to catch more and more cases that lead
>> to inconsistent trees, make replace will run into more and more
>> problems.
>
> Yes.

I don't think that follows.  make replace has always had the behavior
that dependencies of depending packages weren't checked.  The 'problem'
now is that they are checked, and that's easy to fix.

The cases that make replace doesn't handle are reorganizations of
packages that require removing and installing multiple packages in a
transaction.  This has been true for a long time and I expect it to
remain true.  None of the other checks in pkg_* have been problematic
for make replace.

>>   For most packages, replacing them won't cause inconsistencies, even
>> a minor update to a shared library (e.g. glib2 2.20.0 to 2.20.1).
>
> If the minor update doesn't result in ABI incompatible changes (with or
> without major version bump). OpenSSL has a history of doing bad things.

Yes, that's why unsafe_depends is set.  make replace followed by pkg_rr,
when it succeeds, brings you back to a safe state.  No one has ever
argued that individual make replace operations are always ok in
isolation.

> The essential point is that there are safe updates that don't/can't
> break other packages and updates that are not safe. Updates can
> therefore be categorized as:
>
> 1. Updates that are safe. Ignoring unexpected functional regressions
> ("bugs"), there is no good reason to assume that the update will break
> the system.
>
> 2. Updates that may or may not work, depending on whether the dependent
> packages use functionality that was changed in an incompatible way.
>
> 3. Updates that are known to not work, e.g. that are pretty much
> guaranteed to break dependent packages.
>
> Updates using "pkg_add -u" generally fall into category (1) or (2). They
> are not exclusively in category (1) due to the open-ended dependencies
> used for libraries. Category (3) cases that "pkg_add -u" rules out
> include overlapping file lists due to moving files and major/minor
> updates of PostgreSQL.

True, but even with the best dependency info there could still be
stealth ABI changes that put something that seems to be in (1) into (2).

> Updates using "make replace" with USE_DESTDIR=no fall into all three
> categories. With USE_DESTDIR=yes, it is equivalent to "pkg_add -u".

True.  I'm not trying to argue for what I think is your category 3.

But what you didn't say is that some updates in category 1/2 succeed
with make replace under USE_DESTDIR=no but are blocked with
USE_DESTDIR=yes.  This includes some updates in category 1.

> Updates using "make update" (without DEPENDS_TARGET=bin-install) are
> consistently in category (1). The two major issues with "make update"
> from my perspective are:
>
> (a) It creates a large downtime for the common case of safe updates.
>
> (b) It is too easy to lose state in case of build failures or
> interruption. By state I mean what was installed and where it should be
> built from.

Agreed.

> Before looking at the "expected" / "accepted" bugs created by
> "make replace", I would like to stay at this point and discuss the topic
> I couldn't find a reference for: what are the essential properties of an
> update procedure, and when are they violated?
>
> From the perspective of a system developer and from the perspective of
> an administrator, I consider the most important property to be not
> ending up in a known-to-be-inconsistent state. The second most important
> property is to make updates as timely as possible. It is easy to
> schedule a small downtime window for updates if you know that they won't
> break things. It is a lot harder to do that if the duration is
> unpredictable. Similar issues apply from the end user perspective. "Oh,
> there is a new Mercurial version? Let's try it." That works nicely if
> you know that you can still work in the meantime. "Oh, there is a new
> png version? Let's try it." -- not good during a work day if that
> resulted in a major version bump.

What you said is mostly ok, but you are now veering off into the complex
tradeoffs that people make, and I don't think we should be doing that.

> The update process under this assumption can be summarized as:
>
> (1) Be incremental if possible. Incremental updates are generally easier
> to test, faster to build, faster to install.
>
> (2) If incremental updates are not possible, back out as few packages
> as necessary to make the update possible and restore the original
> situation as well as possible.
>
> (3) If restoring the original situation is not possible, make it
> possible to revert everything, file a PR, and try again later.
>
> As already written, this is not the status quo for pkgsrc at the
> moment, and there are a few things to work on to make it possible. I'm
> not sure this is implemented very well for update-from-source systems in
> general; especially (2) has a huge span of possible improvements.
>
> Some aspects to consider are:
> (a) The ability to revert updates is relatively easy to implement;
> pkg_tarup is a good starting place.
>
> (b) The major issue with update category (2) is the open-ended
> dependencies in buildlink3.mk. The solutions are known, they just have
> to be implemented. Note that this makes the current situation with
> incremental updates worse by moving updates of the major version bump
> style from category (2) to (3).
>
> (c) Subpackages can make (2) less painful on common platforms like ELF
> systems. The idea mentioned by the OpenBSD folks at pkgsrcCon to
> forcefully override the major version created by libtool in combination
> with subpackages can dramatically reduce the impact of major version
> bumps. Basically, next to the PKGREVISION a package could have
> something like MAJOR_VERSION_OFFSET, which gets added to all major
> versions of libraries in that package. This offset is monotonically
> increasing, and if shared libs (libfoo.so.*, not libfoo.la and not
> libfoo.a) are in a separate subpackage, it would allow removing the
> development files etc. of png and installing the new files, and all
> existing packages would still work. It doesn't help all cases, but it
> can make the process a lot less painful.
>
> (d) The issue of Perl minor updates could be addressed the way Python
> packages are dealt with -- ensuring that the packages are in separate
> subtrees and that the package names don't overlap. The same applies to
> PHP, Apache and maybe other cases I have forgotten (TCL?).
>
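To make (a) concrete, the revert path could look roughly like this (a
sketch; I'm assuming pkg_tarup drops the binary package somewhere like
/var/tmp -- check where yours puts it):

  # snapshot the installed package before replacing it
  pkg_tarup png
  (cd /usr/pkgsrc/graphics/png && make replace)
  # if the new png turns out to be broken, roll back:
  pkg_delete -f png
  pkg_add /var/tmp/png-*.tgz

And for (c), a purely hypothetical Makefile fragment --
MAJOR_VERSION_OFFSET does not exist today, this is just the shape the
proposal would take:

  # hypothetical: offset added to every shlib major in this package,
  # monotonically increasing across updates
  PKGNAME=              png-1.4.1
  PKGREVISION=          1
  MAJOR_VERSION_OFFSET= 2       # libpng.so.3 installs as libpng.so.5
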
>>   Some examples where using 'make replace' will lead to inconsistent
>> trees, as food for discussion:
>> 
>>   1. Packages depending on particular versions of other packages for
>>      good reason which the updated package doesn't fulfill, e.g. perl
>>      modules using a different path in 5.10 than in 5.8, so they
>>      definitely won't work.  (Perl scripts, on the other hand, usually
>>      shouldn't include perl's buildlink3.mk and thus don't inherit the
>>      upper bound.)
>
> Mentioned above under (d).

True, but this case is handled by pkg_rr (with a period of nonworking
perl modules).

>>   2. Files moving between packages, e.g. tex-foo being split off from
>>      teTeX; here tex-foo needs to be installed as a dependency of an
>>      updated teTeX package, but would overwrite files of the existing
>>      teTeX package; or header files + libraries moving from gtk2 to
>>      glib2, where a newer glib2 would overwrite files from the old,
>>      still installed gtk2.
>
> Broken for "make replace" in both cases. If the content of the file
> didn't change, "make replace USE_DESTDIR=no" will result in the file
> missing, I think. "make replace USE_DESTDIR=yes" will complain.

Yes, and I'm not asking to change that.

> Idea (c) above can hopefully help to reduce the pain of this case.
>
>>   3. A major version update of a shared library.  This does not cause
>>      an inconsistency yet (just broken dependencies), but will once we
>>      solve the open-ended dependency problem.
>
> The more important problem is that due to the hierarchical namespace
> ELF has, this can create very strange bugs that are not directly visible
> from the binaries. Consider libfoo, which links against libbar, and
> re-exports a symbol that has changed in size or type in the last libbar
> update. The program foobar links against libfoo, but not against libbar,
> yet uses this symbol. With the way pkg_rolling-replace works, foobar is
> essentially broken until the time it is replaced. At the very least, the
> code path that is using the symbol is. It doesn't downright fail to
> start or anything obvious like that. The NetBSD SA list has a very
> similar example of problems in this area with potential security
> implications. This is a critical edge case for the dependency system
> too.
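
That failure mode is indeed quiet, because foobar's ELF headers never
mention libbar.  A diagnostic sketch (on an ELF system with binutils;
'bar_symbol' is a made-up name for the re-exported symbol):

  # only libfoo shows up as NEEDED; libbar comes in transitively
  readelf -d /usr/pkg/bin/foobar | grep NEEDED
  # yet the binary has an undefined reference that now resolves
  # against the changed definition in the new libbar
  nm -D /usr/pkg/bin/foobar | grep 'bar_symbol'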


The grand discussion of how to improve things is quite reasonable and I
agree with most of it.  But I don't see how "I can point to a problem
with a current method" leads to "we should not make a small, easily
understood, incremental fix that improves the way that method works".
