Subject: Re: ATTENTION: avoid using libc.13
To: Greg A. Woods <woods@web.net>
From: Jason Thorpe <thorpej@nas.nasa.gov>
List: current-users
Date: 10/21/1997 00:43:50
On Tue, 21 Oct 97 03:06:35 -0400 (EDT) 
 woods@kuma.web.net (Greg A. Woods) wrote:

 > Excuse me?  Just do not ever revert library version numbers!!!!  Period.

No, really.  Trust me.  Like I said, normally, I would wholeheartedly
agree with you.

But this time, I _swear_, we really did save a lot of pain for everyone
by doing this...

 > I don't care if it's only been a few hours and only one supscan has run
 > and only a few dozen people have downloaded it.  Just don't do it.
 > 
 > There's no shame in moving on to the next number, esp. if you've just
 > bumped it anyway.
 > 
 > I.e. what's been done (the N'th time where N is definitely >1 in the
 > history of NetBSD libc I might add) is the *worst* of the possibilities.

No, it's not.  If we'd stayed with .13, or bumped it once more to .14,
we would have broken nearly every third-party library, and in fact we
noticed the problem when a couple of programs _did_ break.

In a nutshell, here was the problem:

	- You have a program that was linked against e.g. libc.so.12
	  and libkrb.so.2 (I believe this was the actual case).

	- You rebuild the world.  This plops a new libc.so.13 down,
	  and rebuilds libkrb.so.2.

	- Now, note that libkrb.so.2 uses stat(2).  It's been built with
	  the new includes.

	- Your program runs, and it binds libc.so.12.  This means you're
	  using the old stat(2), not the new stat(2) that libkrb.so.2 is
	  expecting.

	- Your program goes boom.
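
To make the failure concrete, here is a rough sketch in C.  The struct
layouts and function names are made up for illustration (they are NOT the
real NetBSD struct stat or libkrb code); the only point is that a callee
writing the old layout and a caller reading the new layout disagree about
what is in the buffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts, invented for this sketch. */
struct old_stat { uint16_t st_dev; uint32_t st_ino; uint32_t st_size; };
struct new_stat { uint32_t st_dev; uint32_t st_ino; uint64_t st_size; };

/* Stands in for the stat(2) stub in libc.so.12: fills in the OLD layout. */
static void old_libc_stat(void *buf)
{
	struct old_stat s = { 1, 42, 1000 };

	memcpy(buf, &s, sizeof s);
}

/*
 * Stands in for code in libkrb.so.2 that was rebuilt with the new
 * includes, but ends up calling the old implementation at run time.
 */
static uint64_t library_reads_size(void)
{
	struct new_stat s;

	memset(&s, 0xAA, sizeof s);	/* pretend the buffer held stack garbage */
	old_libc_stat(&s);		/* only the old, smaller layout is written */
	return s.st_size;		/* half real data, half leftover garbage */
}
```

The file size stored was 1000, but library_reads_size() does not return
1000, because the wider st_size field picks up bytes the old
implementation never wrote.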

Charles, Christos, Frank, and I discussed this extensively.  One thought was
to just bump the major of every library that used the altered interfaces.
We very quickly realized that would be an amazing ball of hair, because of
third-party libraries (they do exist!)

We pondered "Do we just break everything now and hope we don't get letter
bombs".. and that's nearly what we did.

But then someone remembered the trick Solaris did to deal with this very
same problem... cheezy function versioning.  This is what we opted for.

At this point, libc.so.13 was already broken.  We needed the old
implementation to have the old name, so that programs already linked
against it keep getting the interface they were built for.
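
Roughly, the function versioning idea can be sketched like this in C.
Everything named demo_* is hypothetical, and the macro is only a stand-in:
a real libc would more likely do the rename at the symbol level from its
headers.  The old implementation keeps the old name, the new one gets a
new name, and the new header quietly redirects freshly compiled code.

```c
#include <stdint.h>

/* Hypothetical old and new versions of the same structure. */
struct demo_stat_old { uint32_t size; };
struct demo_stat_new { uint64_t size; };

/* Old implementation keeps the OLD name, so old callers still work. */
int demo_stat(struct demo_stat_old *sb)
{
	sb->size = 1000;
	return 0;
}

/* A caller compiled against the old header. */
uint32_t legacy_caller(void)
{
	struct demo_stat_old sb;

	demo_stat(&sb);
	return sb.size;
}

/* New implementation lives under a NEW, versioned name. */
int demo_stat13(struct demo_stat_new *sb)
{
	sb->size = UINT64_C(5000000000);	/* would not fit the old field */
	return 0;
}

/* What the "new header" does: same source-level name, new symbol. */
#define demo_stat(sb) demo_stat13(sb)

/* A caller rebuilt with the new header; its source still says demo_stat. */
uint64_t rebuilt_caller(void)
{
	struct demo_stat_new sb;

	demo_stat(&sb);
	return sb.size;
}
```

Both callers write demo_stat() in their source, but each one reaches the
implementation that matches the structure it was compiled with, so old
and rebuilt code can share one library without a major bump.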

 > Just moving ahead cleanly avoids all the problems even if most people
 > end up never seeing the broken intermediate revision.

Not this time... Perhaps Frank or Christos could explain this a little
better than I can... I'm still a bit frazzled just getting the branch
cut, and all.  (Thanks for the help, Perry!)

Jason R. Thorpe                                       thorpej@nas.nasa.gov
NASA Ames Research Center                            Home: +1 408 866 1912
NAS: M/S 258-6                                       Work: +1 415 604 0935
Moffett Field, CA 94035                             Pager: +1 415 428 6939