Subject: bin/30428: pax 'remove leading /' and hardlink features don't cooperate
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: None <carton@Ivy.NET>
List: netbsd-bugs
Date: 06/04/2005 19:51:00
>Number:         30428
>Category:       bin
>Synopsis:       pax 'remove leading /' and hardlink features don't cooperate
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jun 04 19:51:00 +0000 2005
>Originator:     Miles Nordin
>Release:        NetBSD 2.0.2_STABLE
>Organization:
Ivy Ministries
>Environment:
System: NetBSD castrovalva 2.0.2_STABLE NetBSD 2.0.2_STABLE (CASTROVALVA-$Revision: 1.10 $) #0: Wed Apr 27 23:41:50 EDT 2005 carton@castrovalva:/scratch/src/sys/arch/alpha/compile/CASTROVALVA alpha
Architecture: alpha
Machine: alpha
>Description:
If an archive includes members with a leading / in their pathname that 
are hardlinks, and you extract it counting on leading-slash removal, it 
won't extract properly---it'll either fail, or silently hardlink to the 
wrong files.

>How-To-Repeat:
$ mount
/dev/sd0a on / type ffs (soft dependencies, local)
/dev/sd0d on /usr type ffs (soft dependencies, local)
mfs:435 on /tmp type mfs (synchronous, nodev, local)
$ mkdir /tmp/t ; pax -w /rescue | (cd /tmp/t && pax -r)
pax: Removing leading / from absolute path names in the archive
pax: Cannot link to /rescue/cat from rescue/chio (Cross-device link)
[...]
pax: Cannot link to /rescue/cat from rescue/gunzip (Cross-device link)
pax: Cannot link to /rescue/cat from rescue/gzcat (Cross-device link)
pax: Cannot link to /rescue/cat from rescue/zcat (Cross-device link)
$ rm -rf /tmp/t
$ mkdir /tmp/t ; pax -w ./rescue | (cd /tmp/t && pax -r)
$

when extracting on a filesystem other than '/', the ./rescue command works 
fine while /rescue has problems.

Note that 'pax -rw' also works fine.  It is only 'pax -w | pax -r' that has 
the problem.

>Fix:
This will require some argument.

speaking as sysadmin, I think removing leading slashes MUST be done, at least 
by default, on extract no matter what any standards documents say.  In 
addition, removing leading slashes MUST be done consistently for files and 
hardlinks.

removing leading slashes probably should be done by default on creation 
whether invoked as 'pax' or 'tar' or 'cpio', but maybe some standards 
document or emulated behavior argues against that, and if so I could live 
with it.

The current _default_ behavior when invoked _as 'pax'_ seems to be:

 * remove leading slashes on creation, and on extraction, from the filename

 * don't remove leading slashes on creation or extraction, from the link name

when invoked as 'tar' or invoked with options, the behavior is different.

I think the underlying mistake is using two variables: 'Aflag' and 
'rmleadslash', to do the same thing.  Also, it's done twice---the 'Aflag' 
code path removes leading slashes paste-and-rape style in 'list()' 
'extract()' and 'wr_archive()', while the 'rmleadslash' flag removes them 
in a proper factored-out way in 'mod_name()' that's sensitive to the link 
name.

Questions I have:

 * is removal of leading / supposed to be done before or after filename 
   matching?  Existing code does it before, my change does it after, 
   concurrent with pattern substitution.

 * is removal of leading / supposed to be done before or after -s pattern 
   replacement?  I think it should be done after, so the patterns you 
   write match the filenames you specify, and so you get the security of 
   leading-slash-removal even if you make a mistake in the pattern.  The 
   existing code does it before, and my patch doesn't change that.  We 
   should fix that seperately if others agree it's a good idea.

This patch removes leading slashes by default on creation and extraction, 
with a warning, and fixes my problem.  It also cleans up the redundant 
flag variables.

Index: ar_subs.c
===================================================================
RCS file: /scratch/cvsroot/netbsd/src/bin/pax/ar_subs.c,v
retrieving revision 1.1.1.7
diff -u -r1.1.1.7 ar_subs.c
--- ar_subs.c	13 Nov 2004 09:54:58 -0000	1.1.1.7
+++ ar_subs.c	4 Jun 2005 18:37:32 -0000
@@ -118,9 +118,6 @@
 			continue;
 		}
 
-		if (arcn->name[0] == '/' && !check_Aflag()) {
-			memmove(arcn->name, arcn->name + 1, strlen(arcn->name));
-		}
 		/*
 		 * check for pattern, and user specified options match.
 		 * When all patterns are matched we are done.
@@ -214,9 +211,6 @@
 			continue;
 		}
 
-		if (arcn->name[0] == '/' && !check_Aflag()) {
-			memmove(arcn->name, arcn->name + 1, strlen(arcn->name));
-		}
 		/*
 		 * check for pattern, and user specified options match. When
 		 * all the patterns are matched we are done
@@ -244,6 +238,9 @@
 		 * file AFTER the name mod. In honesty the pax spec is probably
 		 * flawed in this respect.  ignore this for GNU long links.
 		 */
+		if (arcn->name[0] == '/' && !check_rmleadslash()) {
+			memmove(arcn->name, arcn->name + 1, strlen(arcn->name));
+		}
 		if ((uflag || Dflag) && ((lstat(arcn->name, &sb) == 0))) {
 			if (uflag && Dflag) {
 				if ((arcn->sb.st_mtime <= sb.st_mtime) &&
@@ -482,9 +479,6 @@
 			}
 		}
 
-		if (arcn->name[0] == '/' && !check_Aflag()) {
-			memmove(arcn->name, arcn->name + 1, strlen(arcn->name));
-		}
 		/*
 		 * Now modify the name as requested by the user
 		 */
Index: extern.h
===================================================================
RCS file: /scratch/cvsroot/netbsd/src/bin/pax/extern.h,v
retrieving revision 1.1.1.8
diff -u -r1.1.1.8 extern.h
--- extern.h	15 Nov 2004 17:49:15 -0000	1.1.1.8
+++ extern.h	4 Jun 2005 18:27:02 -0000
@@ -183,7 +183,7 @@
 unsigned long long asc_ull(char *, int, int);
 int ull_asc(unsigned long long, char *, int, int);
 #endif
-int check_Aflag(void);
+int check_rmleadslash(void);
 
 /*
  * getoldopt.c
@@ -219,7 +219,6 @@
  */
 extern int act;
 extern FSUB *frmt;
-extern int Aflag;
 extern int cflag;
 extern int cwdfd;
 extern int dflag;
Index: gen_subs.c
===================================================================
RCS file: /scratch/cvsroot/netbsd/src/bin/pax/gen_subs.c,v
retrieving revision 1.1.1.5
diff -u -r1.1.1.5 gen_subs.c
--- gen_subs.c	12 Dec 2003 11:06:36 -0000	1.1.1.5
+++ gen_subs.c	4 Jun 2005 18:31:09 -0000
@@ -394,15 +394,13 @@
 #endif
 
 int
-check_Aflag(void)
+check_rmleadslash(void)
 {
 
-	if (Aflag > 0)
-		return 1;
-	if (Aflag == 0) {
-		Aflag = -1;
+	if (rmleadslash == 1) {
+		rmleadslash = 2;
 		tty_warn(0,
 		 "Removing leading / from absolute path names in the archive");
 	}
-	return 0;
+	return rmleadslash;
 }
Index: options.c
===================================================================
RCS file: /scratch/cvsroot/netbsd/src/bin/pax/options.c,v
retrieving revision 1.1.1.7
diff -u -r1.1.1.7 options.c
--- options.c	16 Aug 2004 00:43:29 -0000	1.1.1.7
+++ options.c	4 Jun 2005 18:44:07 -0000
@@ -443,8 +443,8 @@
 			gzip_program = GZIP_CMD;
 			break;
 		case 'A':
-			Aflag = 1;
-			flg |= CAF;
+			rmleadslash = 0;
+			flg |= CAF; /* -A not valid with copy mode */
 			break;
 		case 'B':
 			/*
@@ -994,7 +994,6 @@
 			 * do not remove leading '/' from pathnames
 			 */
 			rmleadslash = 0;
-			Aflag = 1;
 			break;
 		case 'X':
 			/*
Index: pat_rep.c
===================================================================
RCS file: /scratch/cvsroot/netbsd/src/bin/pax/pat_rep.c,v
retrieving revision 1.1.1.5
diff -u -r1.1.1.5 pat_rep.c
--- pat_rep.c	12 Dec 2003 11:06:36 -0000	1.1.1.5
+++ pat_rep.c	4 Jun 2005 18:32:09 -0000
@@ -655,7 +655,7 @@
 	 * Strip off leading '/' if appropriate.
 	 * Currently, this option is only set for the tar format.
 	 */
-	if (rmleadslash && arcn->name[0] == '/') {
+	if (check_rmleadslash() && arcn->name[0] == '/') {
 		if (arcn->name[1] == '\0') {
 			arcn->name[0] = '.';
 		} else {
@@ -663,12 +663,8 @@
 			    strlen(arcn->name));
 			arcn->nlen--;
 		}
-		if (rmleadslash < 2) {
-			rmleadslash = 2;
-			tty_warn(0, "Removing leading / from absolute path names in the archive");
-		}
 	}
-	if (rmleadslash && arcn->ln_name[0] == '/' &&
+	if (check_rmleadslash() && arcn->ln_name[0] == '/' &&
 	    (arcn->type == PAX_HLK || arcn->type == PAX_HRG)) {
 		if (arcn->ln_name[1] == '\0') {
 			arcn->ln_name[0] = '.';
@@ -677,10 +673,6 @@
 			    strlen(arcn->ln_name));
 			arcn->ln_nlen--;
 		}
-		if (rmleadslash < 2) {
-			rmleadslash = 2;
-			tty_warn(0, "Removing leading / from absolute path names in the archive");
-		}
 	}
 
 	if (secure) {
Index: pax.c
===================================================================
RCS file: /scratch/cvsroot/netbsd/src/bin/pax/pax.c,v
retrieving revision 1.1.1.7
diff -u -r1.1.1.7 pax.c
--- pax.c	16 Aug 2004 00:43:29 -0000	1.1.1.7
+++ pax.c	4 Jun 2005 18:27:30 -0000
@@ -86,7 +86,6 @@
 int	uflag;			/* ignore older modification time files */
 int	vflag;			/* produce verbose output */
 int	zflag;			/* use gzip */
-int	Aflag;			/* honor absolute path */
 int	Dflag;			/* same as uflag except inode change time */
 int	Hflag;			/* follow command line symlinks (write only) */
 int	Lflag;			/* follow symlinks when writing */
@@ -101,7 +100,7 @@
 int	pfflags = 1;		/* preserve file flags */
 int	pmode;			/* preserve file mode bits */
 int	pids;			/* preserve file uid/gid */
-int	rmleadslash = 0;	/* remove leading '/' from pathnames */
+int	rmleadslash = 1;	/* remove leading '/' from pathnames */
 int	exit_val;		/* exit value */
 int	docrc;			/* check/create file crc */
 int	to_stdout;		/* extract to stdout */