Subject: bin/4103: Various dump(8) oddities
To: None <gnats-bugs@gnats.netbsd.org>
From: None <bgrayson@ece.utexas.edu>
List: netbsd-bugs
Date: 09/10/1997 12:39:21
>Number:         4103
>Category:       bin
>Synopsis:       Several things about dump are wrong
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people (Utility Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Sep 10 10:50:01 1997
>Last-Modified:
>Originator:     Brian Grayson
>Organization:
	Parallel and Distributed Systems
	Electrical and Computer Engineering
	The University of Texas at Austin
>Release:        current from July 97
>Environment:
NetBSD marvin 1.2G NetBSD 1.2G (MARVIN) #4: Fri Sep  5 23:03:56 CDT 1997     root@c3p0:/a/c3p0/home/c3p0/src/sys/arch/i386/compile/MARVIN i386


>Description:
	Several things about dump seem funny or wrong (to me)
	(patches included at end of message):

	1.  On the dump output, it prints lines like:
	  DUMP: DUMP: 774 tape blocks on 1 volume
        instead of
	  DUMP: 774 tape blocks on 1 volume

        2.  The blocksperfile (-B parameter) variable is not
	initialized to 0, and is later used in if(blocksperfile)
	tests.  I guess this is safe, but it's not good style,
	is it?

	3.  The -B and -b flags don't work as documented:  B is
	supposed to specify the number of dump records per
	volume, and b is supposed to specify the size (in K) of
	each dump record.  Thus, if b is 10 and B is 10, tape
	capacity is 100(K).  If we use the variable names b and
	B for these values, the current tape estimate code uses
	the formula:
	  tapes needed =  tapesize(which is the number of 1K
				  blocks needed for the dump) / B
	rather than the correct
	  tapes needed = tapesize / (B * b)

	(Should the variable tapesize be renamed to totaldumpsize,
	or requiredtapesize or something?  At first inspection,
	I thought it meant the actual tapesize, and not ``what
	tape size would be needed for this entire dump to fit
	on one tape''.)

	Thus, if one tries to back up a 100K file on a tape (or
	file) via -B 10 -b 10, it will (erroneously) say it
	needs 10 tapes instead of 1 tape.

	(B*b is dump records per vol * blocks per dump record =
	blocks per vol)

	4.  The variable name blocksperfile is misleading -- it's
	really recordsperfile.  We should either rename it, or
	add a new variable recordsperfile that gets set by -B, and set
	blocksperfile based on recordsperfile*ntrec (after all
	option handling, so that if ntrec is changed by -b, such change
	is noticed).

	5.  This is really just a missing feature.  At times,
	it would be nice to see how many tapes would be needed
	to dump a particular file system without performing the
	actual dump (to tape or to /dev/null), sort of like
	the -N or -n flag for most programs.  Unfortunately, -n
	for dump is already taken, so I chose -e for estimate.

	6.  In the code, the programmer (lukem?) went to a bit of
	trouble making sure that the spcl.c_filesys string was
	set to something nice in the case of subdir dumps (rather than
	full FS dumps), but this string is never used anywhere.
	It looks to me like it should be used in the "Dumping
	/dev/xxx (/mountpoint)" message, so the patch at the
	end of this message makes this change.  This avoids the
	current misleading output while dumping /etc:
  DUMP: Dumping sub files/directories from /
  DUMP: Dumping file/directory /etc
...
  DUMP: Dumping /dev/rwd0a (/) to /dev/null
                ^---        ^--- these make it look like a full dump.

        With the patch, this last line becomes:
  DUMP: Dumping a subset of /dev/rwd0a (a subset of /) to /dev/null



	Whew!  That will keep someone busy for a little while...  :)
	
>How-To-Repeat:
	For example, if your /etc is around 700K like mine, try:
	  dump bB0f 1 1000 /dev/null /etc
        This will fit in 1MB on /dev/null :), so it will
	estimate and take 1 ``tape''.

	Now, use 10K dump records, with 100 records per
	``tape'', which is still 1MB per ``tape'':
          dump bB0f 10 100 /dev/null /etc
        This will estimate 10 tapes, and will prompt for 10
	tapes, instead of just 1 tape.
	
>Fix:
        This is all based off of dump.c version 1.14 (current
	as of 5am this morning).  Also, since I did
	cut-and-paste, all of the leading tabs unfortunately
	are just spaces in this message.  If this is an
	inconvenience, I can mail improved patches to whomever
	patches this.

	1.  line 504 and 506:  Since msg() prepends "DUMP:" to its output
	  the command msg("DUMP: ...") results in:
	  DUMP: DUMP: ....
	  To fix, remove the leading "DUMP: " from the parameter
	  to msg:
--- main.c.orig Wed Sep 10 11:28:39 1997
+++ main.c      Wed Sep 10 11:28:51 1997
@@ -501,9 +501,9 @@
        for (i = 0; i < ntrec; i++)
                writeheader(maxino - 1);
        if (pipeout)
-               msg("DUMP: %ld tape blocks\n",spcl.c_tapea);
+               msg("%ld tape blocks\n",spcl.c_tapea);
        else
-               msg("DUMP: %ld tape blocks on %d volume%s\n",
+               msg("%ld tape blocks on %d volume%s\n",
                    spcl.c_tapea, spcl.c_volume,
                    (spcl.c_volume == 1) ? "" : "s");
        tnow = do_stats();


	2.  Initialize blocksperfile:
--- main.c.orig Wed Sep 10 11:30:44 1997
+++ main.c      Wed Sep 10 11:30:55 1997
@@ -90,7 +90,7 @@
 int    ntrec = NTREC;  /* # tape blocks in each tape record */
 int    cartridge = 0;  /* Assume non-cartridge tape */
 long   dev_bsize = 1;  /* recalculated below */
-long   blocksperfile;  /* output blocks per file */
+long   blocksperfile = 0;      /* output blocks per file */
 char   *host = NULL;   /* remote host (if any) */
 
 static long numarg __P((char *, long, long));

	3.  Fix usage of blocksperfile (which is really records
	per file):
--- main.c.orig Wed Sep 10 11:32:32 1997
+++ main.c      Wed Sep 10 11:34:25 1997
@@ -409,7 +409,7 @@
                double fetapes;
 
                if (blocksperfile)
-                       fetapes = (double) tapesize / blocksperfile;
+                       fetapes = (double) tapesize / (blocksperfile * ntrec);
                else if (cartridge) {
                        /* Estimate number of tapes, assuming
                         * streaming stops at
                           the end of each block written, and not in mid-block.
--- tape.c.orig Wed Sep 10 11:40:22 1997
+++ tape.c      Wed Sep 10 11:40:25 1997
@@ -362,7 +362,7 @@
        blockswritten += ntrec;
        blocksthisvol += ntrec;
        if (!pipeout && (blocksperfile ?
-           (blocksthisvol >= blocksperfile) : (asize > tsize))) {
+           (blocksthisvol >= blocksperfile*ntrec) : (asize > tsize))) {
                close_rewind();
                startnewtape(0);
        }

	4.  For renaming the blocksperfile variable -- I'll
	leave that up to the discretion of the patcher as to a
	rename versus a new variable versus no change.  :)

	5.  Here are patches to main.c and dump.8 for the -e
	option.  Notice that the getopt() call now takes up
	more than 80 characters.  I'm not sure what's the best
	way of breaking this up.  My best(?) idea was
	  #define GETOPT_ARGS	"0123....."
	  ...getopt(argc, argv, GETOPT_ARGS)...

--- main.c.orig Wed Sep 10 11:55:40 1997
+++ main.c      Wed Sep 10 12:10:06 1997
@@ -113,6 +113,7 @@
        time_t tnow;
        int dirlist;
        char *toplevel;
+       int just_estimate = 0;
 
        spcl.c_date = 0;
        (void)time((time_t *)&spcl.c_date);
@@ -130,7 +131,7 @@
                usage();
 
        obsolete(&argc, &argv);
-       while ((ch = getopt(argc, argv, "0123456789B:b:cd:f:h:ns:T:uWw")) != -1)
+       while ((ch = getopt(argc, argv, "0123456789B:b:cd:ef:h:ns:T:uWw")) != -1)
                switch (ch) {
                /* dump level */
                case '0': case '1': case '2': case '3': case '4':
@@ -157,6 +158,9 @@
                                ntrec = HIGHDENSITYTREC;
                        break;
 
+               case 'e':               /*  Exit after estimate of # tapes.  */
+                       just_estimate = 1;
+                       break;
                case 'f':               /* output file */
                        tape = optarg;
                        break;
@@ -447,6 +451,11 @@
                msg("estimated %ld tape blocks on %3.2f tape(s).\n",
                    tapesize, fetapes);
        }
+       /*  If the user only wants an estimate of the number of
+        *  tapes, exit now.
+        */
+       if (just_estimate) 
+       	exit(0);
 
        /*
         * Allocate tape buffer.

--- dump.8.orig Wed Sep 10 12:01:44 1997
+++ dump.8      Wed Sep 10 12:07:49 1997
@@ -42,7 +42,7 @@
 .Nd filesystem backup
 .Sh SYNOPSIS
 .Nm
-.Op Fl 0123456789cnu
+.Op Fl 0123456789cenu
 .Op Fl B Ar records
 .Op Fl b Ar blocksize
 .Op Fl d Ar density
@@ -123,6 +123,9 @@
 Set tape density to
 .Ar density .
 The default is 1600BPI.
+.It Fl e
+Estimate the number of tapes required, but don't actually perform
+the dump.
 .It Fl f Ar file
 Write the backup to
 .Ar file ;

	6.  Add in the use of spcl.c_filesys, the ``pretty''
	name for the dumped directory, and modify the
	disk/device printout.
--- main.c.orig Wed Sep 10 12:24:12 1997
+++ main.c      Wed Sep 10 12:31:32 1997
@@ -358,9 +358,10 @@
                spcl.c_date == 0 ? "the epoch\n" : ctime(&spcl.c_date));
        msg("Date of last level %c dump: %s", lastlevel,
                spcl.c_ddate == 0 ? "the epoch\n" : ctime(&spcl.c_ddate));
-       msg("Dumping %s ", disk);
-       if (dt != NULL)
-               msgtail("(%s) ", dt->fs_file);
+       msg("Dumping ");
+       if (dt != NULL && dirlist != 0)
+		msgtail("a subset of ");
+       msgtail("%s (%s) ", disk, spcl.c_filesys);
        if (host)
                msgtail("to %s on host %s\n", tape, host);
        else
>Audit-Trail:
>Unformatted: