Subject: Re: When DEPENDS can be upgraded in place
To: NetBSD Packages Technical Discussion List <tech-pkg@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: tech-pkg
Date: 09/08/2000 23:18:27
[ On Friday, September 8, 2000 at 13:09:36 (-0500), Frederick Bruckman wrote: ]
> Subject: Re: When DEPENDS can be upgraded in place
>
> You're not seriously suggesting that we modify every package to bump
> it's major version number, every time the author bumps the minor????  

Well, actually strictly speaking I'm proposing that the major number be
bumped only every time a "significant" change is made in a package that
would cause a different shared library binary to be produced.  It's
irrelevant what the author uses for version identifiers, and indeed
irrelevant even if there's any new code from the author since even a
locally applied patch would cause a different library binary.

If you want new versions of dependent packages to use the new library,
and if you want to have multiple versions of such libraries installed
simultaneously then you have to at least change the minor number of the
library.

If for any reason there's some possibility that *any* dependent package
might not want to use the new library, then because of the way the shared
library search order works you're stuck with bumping the major number
instead (i.e. so as to avoid having to upgrade all dependent packages at
one time).
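To make the search-order point concrete, here is a small Python sketch
(hypothetical code, not anything in pkgsrc or the run-time linker) of the
">= minor" matching rule: a candidate matches only if its major number is
identical and its minor number is at least the one requested, and the
highest qualifying minor wins.  A bumped major never matches, which is
exactly why it leaves already-installed dependents undisturbed.

```python
# Hypothetical sketch of the ">= minor" shared library search rule.
# A request for libfoo.so.1.3 matches any libfoo.so.1.x with x >= 3;
# a library with a bumped major number (libfoo.so.2.*) never matches.

def parse(name):
    """Split 'libgd.so.1.2' into ('libgd', 1, 2)."""
    stem, major, minor = name.rsplit(".", 2)
    base = stem[: -len(".so")]
    return base, int(major), int(minor)

def best_match(requested, installed):
    """Return the installed library the linker would pick, or None."""
    base, major, minor = parse(requested)
    candidates = []
    for lib in installed:
        b, maj, mnr = parse(lib)
        if b == base and maj == major and mnr >= minor:
            candidates.append((mnr, lib))
    return max(candidates)[1] if candidates else None

installed = ["libgd.so.1.2", "libgd.so.1.5", "libgd.so.2.0"]
print(best_match("libgd.so.1.3", installed))  # libgd.so.1.5
print(best_match("libgd.so.2.0", installed))  # libgd.so.2.0
print(best_match("libgd.so.3.0", installed))  # None
```

Note that the old libgd.so.1.2 and the new libgd.so.2.0 coexist happily;
only packages built against the new major ever see it.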

Then of course there's still the software hygiene argument which claims
that even if a new minor revision of a shared library fixes a bug in one
binary that uses it you still shouldn't be upgrading such a library
without first QAing it against all possible consumers.  Of course if you
don't want to use any untested library then you simply can't blindly
allow programs to search for the first ">= minor#" match either.
I.e. strictly speaking from a software hygiene point of view a shared
library must have only one unified version identifier and the
application must find an exact match of that entire identifier before
using that library (at least without printing stern warnings).  In
pkgsrc this means bumping the major number of every shared library for
every change in the package that installs it.
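Under that strict reading, the lookup collapses to an exact comparison of
the entire identifier; a minimal Python sketch (hypothetical and purely
illustrative) of the policy:

```python
# Hypothetical sketch of the strict "software hygiene" rule: a dependent
# may use a shared library only when the whole version identifier is an
# exact match for the one it was built (and QA'd) against; anything else
# must be refused, or at best used only after a stern warning.

def hygienic_lookup(built_against, installed):
    """Return the library only on an exact full-version match."""
    if built_against in installed:
        return built_against
    return None  # caller must refuse, or warn sternly and proceed

installed = {"libgd.so.1.5", "libgd.so.2.0"}
print(hygienic_lookup("libgd.so.1.5", installed))  # libgd.so.1.5
print(hygienic_lookup("libgd.so.1.3", installed))  # None
```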

With very few exceptions third-party shared C (and most C++) libraries
are usually about as far as one can get from the true black-box
implementations of an object-oriented approach using a contract API
where the developer never strays from the API, not even to work around a
bug.  Even pure additions to a shared library must sometimes be treated
as API changes, particularly if the new code calls upon some other shared
library.

When you bring impure shared libraries into the picture -- i.e. those
that depend on other third-party shared libraries -- the potential
complications are enormous.

Anyone who has tried to build a new pkgsrc module for something that
requires a newer version of a commonly used library (eg. gd), but who
also already has a relatively large set of packages installed, can
quickly attest to the unworkable state of the current system.  It is
paramount that multiple versions of run-time components can be
simultaneously installed.  If it were not for the almost universal
deployment of essentially single-user development systems it would also
be necessary to accommodate the simultaneous existence of multiple
versions of development components too.

> I don't where to begin, to tell you what's wrong with that plan. This
> tops your previous suggestion, to abolish shared binaries altogether.

Well I wasn't exactly serious about abolishing shared libraries (I don't
remember suggesting, at least not in the context of packages, that
dynamic linking as a whole was bad -- just that it was/is insane to
insist upon it in every case of an add-on library).

> Please keep the goal of the discussion in mind. We want the package
> system to accomodate upgrading shared libraries without upgrading the
> binaries that depend on them, in the _same_ way that you can do that
> WITHOUT A PACKAGE SYSTEM.

You just can't do that with impunity in third-party packages, at least
not those that don't implement a very standard and well known API.

In any case I'm not convinced that's a good thing to allow in the first
place.  Even bug fixes can introduce more bugs than they fix, if the
users of the broken version knew of the bug and attempted to work around
it in some way that the fix now causes to fail.  The problem is that
none of the tools we commonly use can detect these kinds of things
before or at build time.  Even half-baked QA testing won't always catch
them.  With the number of packages, and the enormous number of
combinations and permutations possible amongst dependent packages, even
half-baked QA testing is simply not humanly possible.

In fact the ability to upgrade shared libraries without upgrading all
the binaries that depend upon them is exactly what I'm trying to work
towards -- but I don't for a moment believe that I can successfully
trick already installed packages into using the new versions of the
library.  I want exactly the opposite:  I want to install new shared
library packages without having any already installed package even
notice them, but I do not want those already installed packages to fail
in any way either.  If I want a package to use a new version of the
library then I want to be forced to rebuild/re-install that package
regardless of whether the library it uses is shared or not.  This is the
only safe and practical way to deal with upgrading third-party code.
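A sketch of what I mean, in Python (the package names and data layout
here are invented purely for illustration): each binary package records
the exact version of every library package present when it was built, so
a newly installed library version is simply invisible to it, and
switching over to the new version necessarily means a rebuild.

```python
# Hypothetical sketch of build-time dependency recording: a package
# freezes the exact versions of the libraries it was built against, so
# installing libgd 2.0 alongside 1.5 changes nothing for it; only a
# rebuild (which re-records the dependency) switches it over.

def build(pkg, libs_seen):
    # libs_seen: {"libgd": "1.5"} - exact versions present at build time
    return {"name": pkg, "depends": dict(libs_seen)}

def runtime_ok(pkg, installed):
    # installed: {"libgd": {"1.5", "2.0"}} - several versions may coexist
    return all(ver in installed.get(lib, set())
               for lib, ver in pkg["depends"].items())

www = build("apache-1.3.12", {"libgd": "1.5"})
print(runtime_ok(www, {"libgd": {"1.5"}}))         # True
print(runtime_ok(www, {"libgd": {"1.5", "2.0"}}))  # True: new lib ignored
print(runtime_ok(www, {"libgd": {"2.0"}}))         # False: rebuild needed
```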

I believe based on my experience that the proposals I'm making are
workable solutions to the issues faced by people (such as myself) who
try to support production systems using binary packages.  In such
scenarios it is absolutely critical to be able to install a new version
of some package without disturbing in any way any other package that
uses only build-time dependencies on other packages.  Shared libraries
are only the tip of this iceberg.  We cannot tolerate the insanity that
regularly happens in those systems which use DLLs.

Please also consider the fact that not all binary packages can be
expected to have been compiled by the same person on the same
development system.  In the real world users may retrieve binary
packages from many diverse sources -- not even the "release" of pkgsrc
can be expected to be even close.  They, and any of the packages they
depend upon, must behave as integrated stand-alone units.  If you want
to have the non-change-related benefits of shared libraries with such
packages you must still ensure that any change in a shared library
package creates a set of runtime components that will not clash with
other versions of the same components.

> When I update NetBSD base, or xsrc, I
> install shared libraries on top of other shared libraries, sometimes
> with minor version bumps, and all my previously installed binaries
> just work.

Well, lucky you -- that's not something that just magically happens
though.  When it does work then either there's "Pure Luck(tm)" involved
or the creator of the change has spent considerable engineering effort
in ensuring that the fix is only a fix and will not adversely change the
behaviour of the library or any of its consumers in any way.

Note also you're talking about system libraries -- I'm talking about
pkgsrc libraries.  Pkgsrc maintainers are rarely as intimate with the
innards of the libraries *and* the innards of those third-party packages
which use them as system developers are with their code.  Even the
authors of third-party libraries rarely have deep enough knowledge of
the applications that might call upon their code to understand the full
implications of what might seem an innocuous change.

> Keeping the old library around, or worse, forcing the
> binary to use the old library, trivially satisfies the requirement,
> but fails to address it's purpose.

As I've said I am willing to concede that an OS vendor (eg. NetBSD)
might want to put considerable effort into creating and testing shared
library fixes where it is desirable to ship binary patches without also
shipping copies of all of the binaries that rely upon those shared
libraries.

Furthermore OS libraries are typically implementations of
well-documented, well-understood APIs.  Fixes will not generally upset any
application that is implemented with the documented behaviour and API in
mind, especially not for portable programs that might first have been
developed and tested against an implementation that did not exhibit the
bug in question.  However for an increasing number of third-party shared
libraries (eg. gd), this is absolutely not the case!

Finally, just exactly when was the last time NetBSD (or any of its
commercial users creating derivative products) shipped a fix via a
binary shared library alone?  This is a serious question, and not
intended as a flippant brush-off.  Your argument is apparently based on
the idea that this is a primary requirement, yet I have never seen even
one example of it outside of one or two quick hacks I myself have done
with system libraries, and which I am not even proud enough of to
describe in any detail.

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>      <robohack!woods>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>