Subject: sandbox builds + Re: Removing All Packages
To: D'Arcy J.M. Cain <darcy@NetBSD.org>
From: Douglas Wade Needham <cinnion@ka8zrt.com>
List: tech-pkg
Date: 11/06/2004 09:44:18
Sender: tech-pkg-owner@NetBSD.org

Quoting D'Arcy J.M. Cain (darcy@NetBSD.org):
> On Fri, 29 Oct 2004 14:00:49 +0700
> Ian Zagorskih <ianzag@megasignal.com> wrote:
> > On Friday 29 October 2004 08:53, Daniel Bolgheroni wrote:
> > > I'm trying to remove _all_ packages from my system, with the minimal
> > > amount of commands possible.
> > >
> > # pkg_delete -r \*
> 
> Make damn sure that you are in the right directory and that you get the
> direction of that slash correct.  Safer is just "rm -rf /usr/pkg
> /var/db/pkg" as someone else suggested.

I always preferred single quotes.  But if you think about it for a
second, pkg_delete will just complain with a bunch of lines like this
if you get it wrong (taken from me actually doing it on a production
machine):

    pkg_delete: package '/bin' not installed

And if the pattern does match something, there is no problem, as the
goal is to remove all packages.  And while I am thinking about it, if
you cd into /var/db/pkg and do the pkg_delete there, no costly find(1)
or pkg_info runs are needed.  Nor does it clobber the config files, if
the package was designed with some intelligence.  8)
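
Since the quoting question comes up every time, here is a toy
illustration (plain sh, nothing to do with pkg_delete itself) of why
the quoting matters -- the file names below are made up:

```shell
# The shell expands an unquoted "*" against the current directory
# before the command ever runs, which is exactly why "pkg_delete -r *"
# wants single quotes (or the right working directory).
demo=$(mktemp -d) && cd "$demo"
touch foo-1.0 bar-2.3            # stand-ins for package directories
echo *                           # unquoted: prints "bar-2.3 foo-1.0"
echo '*'                         # quoted: the command sees a literal "*"
```

With the backslash form, one typo and the shell hands the command your
current directory's contents instead.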

Now, I have a question for Ian before I go on with the rest of my
reply...

    Just what exactly is it that you are trying to accomplish by
    deleting all of the packages???

I have one or two suspicions, but I am not entirely sure.

> By the way, once in a while I like to make sure that I have a clean
> installation of pkgsrc but I can't necessarily wipe them all out while I
> rebuild.  KDE and OpenOffice take days just by themselves on some
> machines.  What I do is like above except that I just wipe out the
> database.  The steps I do are;
> 
> 1. rm -rf /var/db/pkg
> 2. Build the packages.  I have a script that builds the ones I want.
> 3. Remove any old files from pkg using find(1)
> 4. rm -rf /var/db/pkg again
> 5. Build packages again in case there are some older files installed by
> tar(1).
> 
> I may lose a few files at step 3 temporarily but mostly I am never
> without a system while I do this.  The only problem I have is that some
> packages fail if the files exist but they think that they don't have to
> worry about it since it is a new installation.  I just handle those
> manually.  One of these days I will add a little post build stuff to
> handle those cases.

Darcy, I have to wonder why you are going through all this
risk/hassle??  I have been using a technique pretty much unchanged for
around a decade now, and IMO it works great.  And it has an added
advantage when doing more than one machine that has the same SW
installation.  The solution is to build things in a sandbox, then use
rdist to push the new stuff into place.  The only real hassle is
developing the list of files to exclude, and double checking that list
when doing a major upgrade.  But I have used it to do upgrades on
machines actively handling traffic, such as my firewall, my NFS server
and my bastion host, and never had a problem, other than the time I
forgot to set net.inet.ip.forwarding to 1 on my firewall. ;)

Details:
    1) If needed/desired, do a build of the OS into a few directories.
       Otherwise, work from previous copies.
    2) Do a build of the desired packages while chrooted into an area
       created using the output from (1).  And union mounts are a
       really great way to get the source trees mounted under this
       area.
    3) Do a verification rdist with compares (e.g. "-ocompare,verify")
       to make sure nothing is unexpectedly updated/removed.  Yes,
       some files will always be updated even if rebuilt from the same
       sources, and sometimes you need to do diffs on files such as
       sendmail.cf to see that it is just the build comment at the top
       which is causing the update, but it is well worth it.
    4) If things are being updated/removed which you do not want, you
       can either edit the distfile, or go back to steps (1) or (2)
       and restart there.
    5) Once you are satisfied with what will be updated or removed, go
       for it and do the rdist in non-verify mode.
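
For the curious, the rdist side of steps (3) and (5) looks roughly
like this.  This is a minimal sketch only -- the host names, paths and
exclusion list here are made-up placeholders, not my real distfile:

```
# Minimal example distfile (hypothetical hosts and paths)
HOSTS = ( client1 client2 )
FILES = ( /bin /sbin /lib /libexec /etc /usr )
EXCLUDE = ( /usr/sandbox /etc/rc.conf )

${FILES} -> ${HOSTS}
	install ;
	except ${EXCLUDE} ;

# Step (3): verification pass -- report what would change, touch nothing
#   rdist -f distfile -ocompare,verify
# Step (5): the real push
#   rdist -f distfile
```

The exclusion list is where the real work lives; it grows as you find
host-specific files you do not want clobbered.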

As mentioned, this has several advantages:

- It is great when you are updating more than one machine with an
  identical SW load.  I developed this when I was responsible for the
  BSD/OS machines at CompuServe, where we had 1200+ hosts with pretty
  much identical SW loads (only the kernel and a few config files
  differed).

- No more cruft files left after upgrades, other than old config files
  or files in places such as /var/tmp.  And you can easily find these
  with a modified distfile with far fewer exclusions, which could also
  be used for bootstrap installs.
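
The cruft hunt boils down to comparing file lists.  A self-contained
toy sketch of the idea -- the trees and file names here are fabricated
for illustration, not real system paths:

```shell
# List files present in a "live" tree but absent from the "sandbox"
# tree -- i.e., cruft left behind by old installs.  Both trees are
# throwaway examples built on the spot.
live=$(mktemp -d); sandbox=$(mktemp -d); lists=$(mktemp -d)
mkdir -p "$live/bin" "$sandbox/bin"
touch "$live/bin/old-tool" "$live/bin/sh" "$sandbox/bin/sh"

( cd "$live"    && find . -type f | sort ) > "$lists/live"
( cd "$sandbox" && find . -type f | sort ) > "$lists/sandbox"
cruft=$(comm -23 "$lists/live" "$lists/sandbox")
echo "$cruft"                    # prints "./bin/old-tool"
```

comm -23 keeps only the lines unique to the first list, which is
exactly the "installed but no longer built" set.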

- It gives you something to work with during disaster recovery.  Lose
  the OS/SW drive on that huge INN or web server?  No problem.  Do a
  minimal install to get networking up, re-establish the identity of
  the host (IP address, SSH key files, etc.) and get the right version
  of rdist on the machine if necessary, then re-push from the
  repository you built.

- With just a little more work, it gives you a great way to verify
  what is on bastion hosts and firewalls.  You could use this to
  replace or supplement tripwire and other tools.

- Live updates...short of the reboot to activate a new kernel, there
  is seldom more than a few moments' downtime.  Particularly nice if
  you are tracking -current or the 2.0 release.

- BIG PLUS (from my current PoV)...  doing work on a new package and
  want to clean up after yourself?  Just do an rdist!  I am making use
  of this pretty regularly right now, while doing work to produce
  updated packages for Zope 2 (e.g. zope27 and zope27-*).

- And when you get that package finalized and want to add it to your
  standard install, you can just update your sandbox and push out the
  changes. 

The downsides?  Yes, there are a few, but they can be managed.

- Requires disk space for the source tree and sandbox.  But given that
  very good 80GB IDE drives are now under $100, and 120GB drives are
  just over $100...

- You will probably want to have local copies of the source trees.  I
  tried mounting the source trees via NFS about a year ago and
  building -current by union mounting that NFS mount point under my
sandbox, but I was halfway guaranteed to panic the machine.  Gonna
  have to try this one again, but in the meantime, a local copy
  rdisted from my central copy is the way I handle this.

- Doing an rdist to a large number of machines can eat at your network
  resources and does take time.  This is a larger problem when you are
  dealing with 1200+ machines, several hundred of them in locations
  like Munich, London and Paris while you are in Columbus; the rdist
  utility helps some, but it could be improved.

- Speaking of rdist improvements...if you have one file for which you
  want version A on machine group A, version B on machine group B,
  etc., or you want to install packages C, D and E only on another
  group of machines, rdist will give you a bit of a headache.  If you
  follow the link I mention later and look at my distfile, you will
  see my NOT_PELL and NOT_BETA are an attempt to work around the first
  case.  I also handle this with other repositories, one for my FW
  (currently built by hand), and one containing my network config and
  host identities.

- Another improvement rdist could use...doing a diff.  When working
  with hosts other than the one holding your sandbox, if you want to
  find out why a file is to be updated, you have to go through the
  hassle of copying it back and doing the diff by hand.
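
The by-hand check itself is just a unified diff.  A self-contained toy
(the sendmail.cf contents are fabricated) showing the benign
build-comment-only difference mentioned in step (3):

```shell
# Two copies of a config file that differ only in the build comment at
# the top -- the harmless kind of update rdist keeps flagging.
live=$(mktemp -d); sandbox=$(mktemp -d)
printf '##### built by root@host on Nov  5 2004\nO QueueLA=8\n' > "$live/sendmail.cf"
printf '##### built by root@host on Nov  6 2004\nO QueueLA=8\n' > "$sandbox/sendmail.cf"
diff -u "$live/sendmail.cf" "$sandbox/sendmail.cf" || true
```

If the only +/- lines are the build comment, the "update" can safely be
waved through.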

And as a reference point...I have a customer who has not had a real
disaster recovery procedure for their build machines, and who uses YP
and a different scheme for allocating UIDs.  They had a new build
machine with a fairly minimal OS install (no X11, upon which my step 2
depends, since I build packages like realplayer).  It took me 3 runs
of build.sh to produce the tweaks to build the base OS (one of which
was due to my forgetting to update a local set list I have added in
after clearing out some of my files from a lsrc build).  Total time so
far actively doing this is less than 3 hours, including producing a
list of packages to build.  Now, I am just waiting on the build for
step 2.

Anyone interested in this is encouraged to search for some of my
earlier postings, and you may freely browse my build scripts, which
are located in the netbsd_build and lsrc directories under:

    http://www.ka8zrt.com/cgi-bin/cvsweb.cgi/

My only request is that you give me credit if you use this
information or pass it along.

And if someone is interested in my tweaks to automatically cross-build
and package my lsrc like we do X11, please drop me a note.  I have
offered in the past to contribute these, but there was no apparent
interest.

- Doug

-- 
Douglas Wade Needham - KA8ZRT        UN*X Consultant & UW/BSD kernel programmer
Email:  cinnion @ ka8zrt . com       http://cinnion.ka8zrt.com
Disclaimer: My opinions are my own.  Since I don't want them, why
            should my employer, or anybody else for that matter!