Subject: Re: FYI: upgrading GNU tar
To: Christos Zoulas <christos@zoulas.com>
From: Greg A. Woods <woods@weird.com>
List: tech-userlevel
Date: 10/11/2002 15:49:00
[ On Friday, October 11, 2002 at 14:19:58 (-0400), Christos Zoulas wrote: ]
> Subject: Re: FYI: upgrading GNU tar
>
> Ok, added --files-from and -T as alias to -I. Tar file in the same place.
> what else?

Only very minor nits I think (plus documentation -- the tar(1) and
cpio(1) included below do not exactly match the new code, and I had some
fixes to pax(1) that are not included, nor did I check the usage
messages against reality).

I haven't really tested much, but I did compile and do some quick checks
(with -DSMALL -- my test machine has no "mtree.h").

The cpio stuff may need more careful examination too.

Eventually ustar_st*() should perhaps be garbage collected, though I
guess they're only called once.

diffs against your "second" version.....

--- Makefile-ORIG	Fri Oct 11 11:29:46 2002
+++ Makefile	Fri Oct 11 15:34:36 2002
@@ -44,8 +44,7 @@
 		${NETBSDSRCDIR}/bin/ls
 .endif
 
-# XXX not yet!
-# MAN=	pax.1 tar.1 cpio.1
+MAN=	pax.1 tar.1 cpio.1
 LINKS+=	${BINDIR}/pax ${BINDIR}/tar
 LINKS+=	${BINDIR}/pax ${BINDIR}/cpio
 
--- buf_subs.c-ORIG	Fri Oct 11 09:06:22 2002
+++ buf_subs.c	Fri Oct 11 14:29:52 2002
@@ -113,6 +113,12 @@
 		    wrblksz, BLKMULT);
 		return(-1);
 	}
+	if (wrblksz > MAXBLK_POSIX) {
+		tty_warn(0, "Write block size of %d larger than POSIX max %d, archive may not be portable",
+		    wrblksz, MAXBLK_POSIX);
+		return(-1);
+	}
+
 
 	/*
 	 * we only allow wrblksz to be used with all archive operations
--- options.c-ORIG	Fri Oct 11 14:18:40 2002
+++ options.c	Fri Oct 11 15:01:49 2002
@@ -894,9 +894,11 @@
 			break;
 		case 'x':
 			/*
-			 * write an archive
+			 * extract an archive, preserving mode,
+			 * and mtime if possible.
 			 */
 			act = EXTRACT;
+			pmtime = 1;
 			break;
 		case 'z':
 			/*
--- pax.h-ORIG	Fri Oct 11 10:34:19 2002
+++ pax.h	Fri Oct 11 15:29:40 2002
@@ -55,6 +55,7 @@
 #define	MAXBLK		32256	/* MAX blocksize supported (posix SPEC) */
 				/* WARNING: increasing MAXBLK past 32256 */
 				/* will violate posix spec. */
+#define	MAXBLK_POSIX	32256	/* MAX blocksize supported as per POSIX */
 #define BLKMULT		512	/* blocksize must be even mult of 512 bytes */
 				/* Don't even think of changing this */
 #define DEVBLK		8192	/* default read blksize for devices */
--- sel_subs.c-ORIG	Fri Oct 11 09:06:24 2002
+++ sel_subs.c	Fri Oct 11 15:12:19 2002
@@ -591,17 +591,21 @@
 		}
 		/* FALLTHROUGH */
 	case 8:
-		lt->tm_mon = ATOI2(p);
+		if ((lt->tm_mon = ATOI2(p)) > 12)
+			return(-1);
 		--lt->tm_mon;
 		/* FALLTHROUGH */
 	case 6:
-		lt->tm_mday = ATOI2(p);
+		if ((lt->tm_mday = ATOI2(p)) > 31)
+			return(-1);
 		/* FALLTHROUGH */
 	case 4:
-		lt->tm_hour = ATOI2(p);
+		if ((lt->tm_hour = ATOI2(p)) > 23)
+			return(-1);
 		/* FALLTHROUGH */
 	case 2:
-		lt->tm_min = ATOI2(p);
+		if ((lt->tm_min = ATOI2(p)) > 59)
+			return(-1);
 		break;
 	default:
 		return(-1);
--- tar.c-ORIG	Fri Oct 11 09:07:36 2002
+++ tar.c	Fri Oct 11 15:38:24 2002
@@ -1030,7 +1030,11 @@
 	 */
 	if (ul_oct(tar_chksm(hdblk, sizeof(HD_USTAR)), hd->chksum,
 	   sizeof(hd->chksum), 3))
-		goto out;
+		goto out;			/* XXX Something's wrong here
+						 * because a zero-byte file can
+						 * cause this to be done and
+						 * yet the resulting warning
+						 * seems incorrect */
 	if (wr_rdbuf(hdblk, sizeof(HD_USTAR)) < 0)
 		return(-1);
 	if (wr_skip((off_t)(BLKMULT - sizeof(HD_USTAR))) < 0)
--- tar.h-ORIG	Fri Oct 11 09:06:25 2002
+++ tar.h	Fri Oct 11 15:26:50 2002
@@ -119,12 +119,12 @@
 /*
  * default device names
  */
-#define	DEV_0		"/dev/rmt0"
-#define	DEV_1		"/dev/rmt1"
-#define	DEV_4		"/dev/rmt4"
-#define	DEV_5		"/dev/rmt5"
-#define	DEV_7		"/dev/rmt7"
-#define	DEV_8		"/dev/rmt8"
+#define	DEV_0		"/dev/rst0"
+#define	DEV_1		"/dev/rst1"
+#define	DEV_4		"/dev/rst4"
+#define	DEV_5		"/dev/rst5"
+#define	DEV_7		"/dev/rst7"
+#define	DEV_8		"/dev/rst8"
 #endif /* _PAX_ */
 
 /*
--- /dev/null	Fri Oct 11 15:38:41 2002
+++ cpio.1	Fri Oct 11 15:32:48 2002
@@ -0,0 +1,289 @@
+.\"
+.\" Copyright (c) 1997 SigmaSoft, Th. Lockert
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\" 3. All advertising materials mentioning features or use of this software
+.\"    must display the following acknowledgement:
+.\"      This product includes software developed by SigmaSoft, Th. Lockert.
+.\" 4. The name of the author may not be used to endorse or promote products
+.\"    derived from this software without specific prior written permission
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+.\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+.\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.\"
+.\"	$OpenBSD: cpio.1,v 1.14 2000/11/10 17:52:02 aaron Exp $
+.\"
+.Dd February 16, 1997
+.Dt CPIO 1
+.Os
+.Sh NAME
+.Nm cpio
+.Nd copy file archives in and out
+.Sh SYNOPSIS
+.Nm cpio
+.Fl o
+.Op Fl aABcLvzZ
+.Op Fl C Ar bytes
+.Op Fl F Ar archive
+.Op Fl H Ar format
+.Op Fl O Ar archive
+.Ar "< name-list"
+.Op Ar "> archive"
+.Nm cpio
+.Fl i
+.Op Fl bBcdfmrsStuvzZ6
+.Op Fl C Ar bytes
+.Op Fl E Ar file
+.Op Fl F Ar archive
+.Op Fl H Ar format
+.Op Fl I Ar archive
+.Op Ar "pattern ..."
+.Op Ar "< archive"
+.Nm cpio
+.Fl p
+.Op Fl adlLmuv
+.Ar destination-directory
+.Ar "< name-list"
+.Sh DESCRIPTION
+The
+.Nm
+command copies files to and from a
+.Nm
+archive.
+.Pp
+The options are as follows:
+.Bl -tag -width Ds
+.It Fl o
+Create an archive.
+Reads the list of files to store in the
+archive from standard input, and writes the archive on standard
+output.
+.Bl -tag -width Ds
+.It Fl a
+Reset the access times on files that have been copied to the
+archive.
+.It Fl A
+Append to the specified archive.
+.It Fl B
+Set block size of output to 5120 bytes.
+.It Fl c
+Use ASCII format for
+.Nm
+header for portability.
+.It Fl C Ar bytes
+Set the block size of output to
+.Ar bytes .
+.It Fl F Ar archive
+.It Fl O Ar archive
+Use the specified file name as the archive to write to.
+.It Fl H Ar format
+Write the archive in the specified format.
+Recognized formats are:
+.Pp
+.Bl -tag -width sv4cpio -compact
+.It Ar bcpio
+Old binary
+.Nm
+format.
+.It Ar cpio
+Old octal character
+.Nm
+format.
+.It Ar sv4cpio
+SVR4 hex
+.Nm
+format.
+.It Ar tar
+Old tar format.
+.It Ar ustar
+POSIX ustar format.
+.El
+.It Fl L
+Follow symbolic links.
+.It Fl v
+Be verbose about operations.
+List filenames as they are written to the archive.
+.It Fl z
+Compress archive using
+.Xr gzip 1
+format.
+.It Fl Z
+Compress archive using
+.Xr compress 1
+format.
+.El
+.It Fl i
+Restore files from an archive.
+Reads the archive file from
+standard input and extracts files matching the
+.Ar patterns
+that were specified on the command line.
+.Bl -tag -width Ds
+.It Fl b
+Do byte and word swapping after reading in data from the
+archive, for restoring archives created on systems with
+a different byte order.
+.It Fl B
+Set the block size of the archive being read to 5120 bytes.
+.It Fl c
+Expect the archive headers to be in ASCII format.
+.It Fl C Ar bytes
+Read archive written with a block size of
+.Ar bytes .
+.It Fl d
+Create any intermediate directories as needed during
+restore.
+.It Fl E Ar file
+Read list of file name patterns to extract or list from
+.Ar file .
+.It Fl f
+Restore all files except those matching the
+.Ar patterns
+given on the command line.
+.It Fl F Ar archive
+.It Fl I Ar archive
+Use the specified file as the input for the archive.
+.It Fl H Ar format
+Read an archive of the specified format.
+Recognized formats are:
+.Pp
+.Bl -tag -width sv4cpio -compact
+.It Ar bcpio
+Old binary
+.Nm
+format.
+.It Ar cpio
+Old octal character
+.Nm
+format.
+.It Ar sv4cpio
+SVR4 hex
+.Nm
+format.
+.It Ar tar
+Old tar format.
+.It Ar ustar
+POSIX ustar format.
+.El
+.It Fl m
+Restore modification times on files.
+.It Fl r
+Rename restored files interactively.
+.It Fl s
+Swap bytes after reading data from the archive.
+.It Fl S
+Swap words after reading data from the archive.
+.It Fl t
+Only list the contents of the archive; no files or
+directories will be created.
+.It Fl u
+Overwrite files even when the file in the archive is
+older than the one that will be overwritten.
+.It Fl v
+Be verbose about operations.
+List filenames as they are copied in from the archive.
+.It Fl z
+Uncompress archive using
+.Xr gzip 1
+format.
+.It Fl Z
+Uncompress archive using
+.Xr compress 1
+format.
+.It Fl 6
+Process old-style
+.Nm
+format archives.
+.El
+.It Fl p
+Copy files from one location to another in a single pass.
+The list of files to copy is read from standard input and
+written out to a directory relative to the specified
+.Ar directory
+argument.
+.Bl -tag -width Ds
+.It Fl a
+Reset the access times on files that have been copied.
+.It Fl d
+Create any intermediate directories as needed to write
+the files at the new location.
+.It Fl l
+When possible, link files rather than creating an
+extra copy.
+.It Fl L
+Follow symbolic links.
+.It Fl m
+Restore modification times on files.
+.It Fl u
+Overwrite files even when the original file being copied is
+older than the one that will be overwritten.
+.It Fl v
+Be verbose about operations.
+List filenames as they are copied.
+.El
+.El
+.Sh ERRORS
+.Nm
+will exit with one of the following values:
+.Bl -tag -width 2n
+.It 0
+All files were processed successfully.
+.It 1
+An error occurred.
+.El
+.Pp
+Whenever
+.Nm
+cannot create a file or a link when extracting an archive or cannot
+find a file while writing an archive, or cannot preserve the user
+ID, group ID, file mode, or access and modification times when the
+.Fl p
+option is specified, a diagnostic message is written to standard
+error and a non-zero exit value will be returned, but processing
+will continue.
+In the case where
+.Nm
+cannot create a link to a file,
+.Nm
+will not create a second copy of the file.
+.Pp
+If the extraction of a file from an archive is prematurely terminated
+by a signal or error,
+.Nm
+may have only partially extracted the file the user wanted.
+Additionally, the file modes of extracted files and directories may
+have incorrect file bits, and the modification and access times may
+be wrong.
+.Pp
+If the creation of an archive is prematurely terminated by a signal
+or error,
+.Nm
+may have only partially created the archive, which may violate the
+specific archive format specification.
+.Sh SEE ALSO
+.Xr pax 1 ,
+.Xr tar 1
+.Sh AUTHORS
+Keith Muller at the University of California, San Diego.
+.Sh BUGS
+The
+.Fl s
+and
+.Fl S
+options are currently not implemented.
--- /dev/null	Fri Oct 11 15:38:36 2002
+++ tar.1	Fri Oct 11 15:32:50 2002
@@ -0,0 +1,288 @@
+.\"
+.\" Copyright (c) 1996 SigmaSoft, Th. Lockert
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\" 3. All advertising materials mentioning features or use of this software
+.\"    must display the following acknowledgement:
+.\"      This product includes software developed by SigmaSoft, Th. Lockert.
+.\" 4. The name of the author may not be used to endorse or promote products
+.\"    derived from this software without specific prior written permission
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+.\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+.\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+.\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+.\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+.\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+.\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.\"
+.\"	$OpenBSD: tar.1,v 1.28 2000/11/09 23:58:56 aaron Exp $
+.\"
+.Dd January 31, 2001
+.Dt TAR 1
+.Os
+.Sh NAME
+.Nm tar
+.Nd tape archiver
+.Sh SYNOPSIS
+.Nm tar
+.Sm off
+.Oo \&- Oc {crtux} Op Fl befhmopqvwzHLOPXZ014578
+.Sm on
+.Op Ar archive
+.Op Ar blocksize
+.\" XXX how to do this right?
+.Op Fl C Ar directory
+.Op Fl T Ar file
+.Op Fl s Ar replstr
+.Op Ar file ...
+.Sh DESCRIPTION
+The
+.Nm
+command creates, adds files to, or extracts files from an
+archive file in
+.Dq tar
+format.  A tar archive is often stored on a magnetic tape, but can be
+stored equally well on a floppy, CD-ROM, or in a regular disk file.
+.Pp
+One of the following flags must be present:
+.Bl -tag -width Ar
+.It Fl c
+Create new archive, or overwrite an existing archive,
+adding the specified files to it.
+.It Fl r
+Append the named new files to existing archive.
+Note that this will only work on media on which an end-of-file mark
+can be overwritten.
+.It Fl t
+List contents of archive.
+If any files are named on the
+command line, only those files will be listed.
+.It Fl u
+Alias for
+.Fl r .
+.It Fl x
+Extract files from archive.
+If any files are named on the
+command line, only those files will be extracted from the
+archive.
+If more than one copy of a file exists in the
+archive, later copies will overwrite earlier copies during
+extraction.
+The file mode and modification time are preserved
+if possible.
+The file mode is subject to modification by the
+.Xr umask 2 .
+.El
+.Pp
+In addition to the flags mentioned above, any of the following
+flags may be used:
+.Bl -tag -width Ar
+.It Fl b Ar "blocking factor"
+Set blocking factor to use for the archive.
+.Nm
+uses 512 byte blocks.
+The default is 20, the maximum is 126.
+Archives with a blocking factor larger than 63 violate the
+.Tn POSIX
+standard and will not be portable to all systems.
+.It Fl e
+Stop after first error.
+.It Fl f Ar archive
+Filename where the archive is stored.
+Defaults to
+.Pa /dev/rst0 .
+.It Fl h
+Follow symbolic links as if they were normal files
+or directories.
+.It Fl m
+Do not preserve modification time.
+.It Fl O
+Write old-style (non-POSIX) archives.
+.It Fl o
+Don't write directory information that the older (V7) style
+.Nm
+is unable to decode.
+This implies the
+.Fl O
+flag.
+.It Fl p
+Preserve user and group ID as well as file mode regardless of
+the current
+.Xr umask 2 .
+The setuid and setgid bits are only preserved if the user is
+the superuser.
+Only meaningful in conjunction with the
+.Fl x
+flag.
+.It Fl q
+Select the first archive member that matches each
+.Ar pattern
+operand.
+No more than one archive member is matched for each
+.Ar pattern .
+When members of type directory are matched, the file hierarchy rooted at that
+directory is also matched.
+.It Fl s Ar replstr
+Modify the file or archive member names specified by the
+.Ar pattern
+or
+.Ar file
+operands according to the substitution expression
+.Ar replstr ,
+using the syntax of the
+.Xr ed 1
+utility regular expressions.
+The format of these regular expressions is:
+.Dl /old/new/[gp]
+As in
+.Xr ed 1 ,
+.Cm old
+is a basic regular expression and
+.Cm new
+can contain an ampersand (&), \\n (where n is a digit) back-references,
+or subexpression matching.
+The
+.Cm old
+string may also contain
+.Dv <newline>
+characters.
+Any non-null character can be used as a delimiter (/ is shown here).
+Multiple
+.Fl s
+expressions can be specified.
+The expressions are applied in the order they are specified on the
+command line, terminating with the first successful substitution.
+The optional trailing
+.Cm g
+continues to apply the substitution expression to the pathname substring
+which starts with the first character following the end of the last successful
+substitution.
+The first unsuccessful substitution stops the operation of the
+.Cm g
+option.
+The optional trailing
+.Cm p
+will cause the final result of a successful substitution to be written to
+.Dv standard error
+in the following format:
+.Dl <original pathname> >> <new pathname>
+File or archive member names that substitute to the empty string
+are not selected and will be skipped.
+.It Fl v
+Verbose operation mode.
+.It Fl w
+Interactively rename files.
+This option causes
+.Nm
+to prompt the user for the filename to use when storing or
+extracting files in an archive.
+.It Fl z
+Compress archive using gzip.
+.It Fl C Ar directory
+This is a positional argument which sets the working directory for the
+following files.
+When extracting, files will be extracted into
+the specified directory; when creating, the specified files will be matched
+from the directory.
+This argument and its parameter may also appear in a file list specified by
+.Fl T .
+.It Fl H
+Follow symlinks given on command line only.
+.It Fl L
+Follow all symlinks.
+.Em Warning!
+This flag has the opposite meaning in some other versions of
+.Nm tar ,
+including the one in AT&T Bell Labs Research Tenth Edition, and its
+meaning is completely different and unrelated to symlinks in GNU Tar.
+.\" No wonder the world needs Pax!
+.It Fl P
+Do not strip leading slashes
+.Pq Sq /
+from pathnames.
+The default is to strip leading slashes.
+.It Fl T Ar file
+Read the names of files to archive or extract from the given file, one
+per line.
+A line may also specify the positional argument
+.Dq Fl C Ar directory .
+.It Fl X
+Do not cross mount points in the file system.
+.It Fl Z
+Compress archive using compress.
+.El
+.Pp
+The options
+.Op Fl 014578
+can be used to select one of the compiled-in backup devices,
+.Pa /dev/rstN .
+.Sh DIAGNOSTICS
+.Nm
+will exit with one of the following values:
+.Bl -tag -width 2n
+.It 0
+All files were processed successfully.
+.It 1
+An error occurred.
+.El
+.Pp
+Whenever
+.Nm
+cannot create a file or a link when extracting an archive or cannot
+find a file while writing an archive, or cannot preserve the user
+ID, group ID, file mode, or access and modification times when the
+.Fl p
+option is specified, a diagnostic message is written to standard
+error and a non-zero exit value will be returned, but processing
+will continue.
+In the case where
+.Nm
+cannot create a link to a file,
+.Nm
+will not create a second copy of the file.
+.Pp
+If the extraction of a file from an archive is prematurely terminated
+by a signal or error,
+.Nm
+may have only partially extracted the file the user wanted.
+Additionally, the file modes of extracted files and directories may
+have incorrect file bits, and the modification and access times may
+be wrong.
+.Pp
+If the creation of an archive is prematurely terminated by a signal
+or error,
+.Nm
+may have only partially created the archive, which may violate the
+specific archive format specification.
+.Sh FILES
+.Bl -tag -width "/dev/rst0"
+.It Pa /dev/rst0
+default archive name
+.El
+.Sh SEE ALSO
+.Xr cpio 1 ,
+.Xr pax 1
+.Sh AUTHORS
+Keith Muller at the University of California, San Diego.
+.Sh HISTORY
+A
+.Nm
+command first appeared in
+.At v7 .
+.Sh BUGS
+The
+.Fl L
+flag is not portable to other versions of
+.Nm tar .

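For illustration only, and not part of the diffs above: the sel_subs.c
hunk tests only the upper bound of each two-digit field, so as far as
that hunk shows, a month or day of "00" would still get through.  A
minimal standalone sketch of a check with both bounds follows.  The
names two_digits() and check_mmddhhmm() are hypothetical and do not
exist in pax.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not in pax): read two decimal digits at s and
 * return their value if it lies in [lo, hi], otherwise -1.
 */
static int
two_digits(const char *s, int lo, int hi)
{
	int v;

	if (!isdigit((unsigned char)s[0]) || !isdigit((unsigned char)s[1]))
		return (-1);
	v = (s[0] - '0') * 10 + (s[1] - '0');
	return ((v < lo || v > hi) ? -1 : v);
}

/*
 * Check an 8-character "mmddhhmm" string.  Returns 0 when every field
 * is in range, -1 otherwise.  Days are only checked against 1..31; no
 * per-month length check is attempted.
 */
static int
check_mmddhhmm(const char *s)
{
	if (strlen(s) != 8)
		return (-1);
	if (two_digits(s, 1, 12) < 0 ||		/* month */
	    two_digits(s + 2, 1, 31) < 0 ||	/* day of month */
	    two_digits(s + 4, 0, 23) < 0 ||	/* hour */
	    two_digits(s + 6, 0, 59) < 0)	/* minute */
		return (-1);
	return (0);
}
```

With this sketch check_mmddhhmm("10111549") returns 0, while
"00111549" and "13111549" both return -1.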

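Also for illustration only: the tar.1 text above describes -b as a
count of 512-byte blocks with a maximum of 126, and says anything over
63 is outside POSIX.  A rough sketch of that arithmetic, assuming the
man page numbers are the intended ones (pax.h as shown above still has
MAXBLK at 32256, which is 63 * 512, so the code and the page may not
agree).  MAXFACTOR and blocking_to_bytes() are hypothetical names, not
anything in pax.

```c
#define BLKMULT		512	/* as in pax.h */
#define MAXBLK_POSIX	32256	/* as added to pax.h above: 63 * 512 */
#define MAXFACTOR	126	/* hypothetical: the maximum quoted in tar.1 */

/*
 * Convert a tar blocking factor to a block size in bytes.  Returns the
 * size, or -1 if the factor is out of range.  On success *nonportable
 * is set to 1 when the size exceeds the POSIX limit, else to 0.
 */
static long
blocking_to_bytes(int factor, int *nonportable)
{
	long bytes;

	if (factor < 1 || factor > MAXFACTOR)
		return (-1);
	bytes = (long)factor * BLKMULT;
	*nonportable = (bytes > MAXBLK_POSIX);
	return (bytes);
}
```

The default factor of 20 gives 10240 bytes, 63 gives exactly 32256 and
is still portable, and 64 gives 32768 with the non-portable flag set.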
-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>