Subject: Beware of stale man pages... :)
To: None <current-users@NetBSD.ORG>
From: Brian C. Grayson <bgrayson@ece.utexas.edu>
List: current-users
Date: 09/18/1997 10:47:40
  We've been tracking -current since at least August of 1994.
Unfortunately, many man pages have been shifted around.  For
example, dump's man page used to go into cat1, and now it goes
into cat8.  I don't believe there is a good way of detecting
such ``stale'' man pages left over from earlier versions of the
OS.  Such stale man pages can lead to user confusion, and also
waste some disk space.

  (man -a allows all matching man pages to be shown, but
unfortunately cat1 is before cat8 in the default /etc/man.conf
search path, so the stale ``dump'' man page shows up first.)
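  One way to spot this particular symptom is to look for page
names that occur in more than one catN directory.  The function
below is only a sketch: the name list_shadowed_pages is made up
for illustration, and it assumes formatted pages end in ``.0''
and that paths contain no whitespace.  Some names legitimately
live in several sections (a command and a library function can
share a name), so the output is a list of candidates to inspect,
not a list of stale files.

```shell
#!/bin/sh
# list_shadowed_pages [MANDIR]   (hypothetical helper, not part of mandup)
# Print each formatted page name found in more than one catN
# directory under MANDIR, followed by the files that carry it.
list_shadowed_pages() {
    mandir=${1:-/usr/share/man}
    # Emit "name path" pairs, e.g. "dump.0 /usr/share/man/cat1/dump.0".
    find "$mandir"/cat[1-9] -type f -name '*.0' 2>/dev/null |
    awk -F/ '{ print $NF, $0 }' |
    sort |
    awk '
        # Collect every path seen for a given page name.
        { count[$1]++; files[$1] = files[$1] " " $2 }
        # Report only the names that occurred more than once.
        END { for (n in count) if (count[n] > 1) print n ":" files[n] }
    ' |
    sort
}
```

  Running ``list_shadowed_pages /usr/share/man'' on a tree like
ours would be expected to report dump.0 with both its cat1 and
cat8 copies.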

  Since we have installed some third-party software that shoved
its man pages into cat*, we can't just do a 'find \! -newer
<file from last build>'.  For us, I narrowed our dates down into
two ranges:  those before a bunch of third-party installs, and
those after those installs but before our last make-install of
NetBSD.  Doing a find operation on man pages in /usr/share/man
in these two ranges showed 103 stale NetBSD man pages,
taking up nearly 700KB of space.  My script is included at the
bottom of this message, to perhaps make other people's lives a
smidgen easier.
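
  The date-range idea boils down to one find expression per
range.  As a rough sketch (pages_in_window is a name invented
here, not something in the base system), the window is open at
the old end and closed at the new end, so the newer reference
file itself is reported if it lives under the searched directory:

```shell
#!/bin/sh
# pages_in_window DIR OLDER_REF NEWER_REF   (hypothetical helper)
# List regular files under DIR that were modified after OLDER_REF
# but not after NEWER_REF, i.e. OLDER_REF < mtime <= NEWER_REF.
pages_in_window() {
    find "$1" -type f -newer "$2" ! -newer "$3"
}
```

  A quick way to convince yourself of the boundaries is to build
a scratch directory with ``touch -t'' and run the function on it
before pointing it at /usr/share/man.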

  Other people who have been tracking -current (or even just
running NetBSD) for several years without an occasional clean
wipe of their filesystems might also want to check for stale
man pages.

  Does anyone have a more elegant solution to this (besides
``never install third-party man pages into
/usr/share/man/cat[1-9], and use find ! -newer <latest build date>'')?

  Brian
-- 
Brian Grayson (bgrayson@ece.utexas.edu)
Graduate Student, Electrical and Computer Engineering
The University of Texas at Austin
Office:  ENS 406       (512) 471-8011
Finger bgrayson@orac.ece.utexas.edu for PGP key.


Sample output:

10:37am:139 % mandup > /tmp/remove.me.list
Total stale man pages:     103 
Total disk space consumed by these man pages, in bytes:  695615 

10:40am:140 % head /tmp/remove.me.list
/usr/share/man/cat2/swapon.0
/usr/share/man/cat3/string_to_key.0
/usr/share/man/cat3/random_key.0
/usr/share/man/cat3/ecb_encrypt.0
/usr/share/man/cat3/cbc_encrypt.0
...
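
Once the list has been reviewed by hand, it can drive the actual
cleanup.  The following is a cautious sketch rather than part of
mandup: remove_listed and its ``-n'' dry-run argument are
hypothetical, and it assumes one path per line in the list file.

```shell
#!/bin/sh
# remove_listed LISTFILE [-n]   (hypothetical helper)
# Remove every regular file named in LISTFILE, one path per line.
# With -n as the second argument, only report what would be removed.
remove_listed() {
    list=$1
    dry_run=${2:-}
    while IFS= read -r page; do
        # Skip entries that are already gone or are not plain files.
        [ -f "$page" ] || continue
        if [ "$dry_run" = "-n" ]; then
            echo "would remove $page"
        else
            rm -f -- "$page"
        fi
    done < "$list"
}
```

For example, ``remove_listed /tmp/remove.me.list -n'' prints the
candidates without touching them; dropping the -n removes them.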


========  mandup:  find all ``stale'' man pages between file timestamps
#!/bin/sh
##  Yes, it's an ugly shell script, but it gets the job done.
##  Originally written by Brian Grayson (bgrayson@ece.utexas.edu)
tmpfile=/tmp/`basename "$0"`.$$
##  First range:  newer than $olddatefile, not newer than $datefile.
datefile=/usr/share/man/cat8/mk-amd-map.0
olddatefile=/usr/share/man/cat4/alpha
find /usr/share/man -type f \
  \( \! -newer "$datefile" \) -and -newer "$olddatefile" > "$tmpfile"

##  Also look for stuff older than our third-party man pages:
datefile=/usr/share/man/man1/qalter.1
olddatefile=/usr/share/man/cat3f
find /usr/share/man -type f \
  \( \! -newer "$datefile" \) -and -newer "$olddatefile" >> "$tmpfile"

##  Print out all the stale man pages:
cat $tmpfile

##  By sending the counts to stderr, we can redirect the
##  filename output to a file (for later rm's or wc's).
printf 'Total stale man pages:' 1>&2
wc -l < "$tmpfile" 1>&2
printf 'Total disk space consumed by these man pages, in bytes:' 1>&2
##  xargs splits a long list across several cat invocations.
xargs cat < "$tmpfile" | wc -c 1>&2
rm -f "$tmpfile"