tech-pkg archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

fetch: download to temporary location



Hi!

Especially in bulk builds it can happen that multiple versions of the
same package are being built at the same time (e.g. a Python module),
without the distfile being present. Then two downloads are started,
both to the same location, and one of them finishes and tries to
verify the incomplete file written by the other, leading to checksum
errors, breaking the build.

For this reason I've changed the fetch script to download to a
temporary file instead, and move the file to the final location when
the download is finished - so the file is there or not, but not a
partially downloaded copy.

The script already supported resuming downloads - the support is still
there but basically will never happen again, because it only looks at
a partial file of the correct name (which can still happen if the
download finishes but provides an incomplete file, and the file cannot
be verified by during download e.g. because the distinfo file is not
available - a theoretically supported use case).

Instead we might end up with more foo.tar.gz.12345 files, when
downloads are interrupted.

I think these are acceptable trade-offs.

Please try it out, I plan to commit this in a week.

Cheers,
 Thomas
Index: fetch
===================================================================
RCS file: /cvsroot/pkgsrc/mk/fetch/fetch,v
retrieving revision 1.22
diff -u -r1.22 fetch
--- fetch	7 Feb 2026 15:15:02 -0000	1.22
+++ fetch	8 Feb 2026 11:59:50 -0000
@@ -45,6 +45,9 @@
 #	If the file cannot be fetched successfully, then we try the next
 #	listed site.
 #
+#	The file will be downloaded under a temporary name and
+#	renamed to the proper name when the download is done.
+#
 #	If the file already exists on the disk and is verified, then
 #	no fetch action is taken.
 #
@@ -189,7 +192,7 @@
 	done < $distinfo
 fi
 
-# verify_file [-v] $file
+# verify_file [-v] $file $suffix
 #	If we can checksum the file, then see if it matches the listed
 #	checksums in the distinfo file.  If we can check the size, then
 #	check that instead.  We strip off ".pkgsrc.resume" from the
@@ -199,12 +202,13 @@
 verify_file() {
 	_if_verbose=:; if [ "x$1" = "x-v" ]; then shift; _if_verbose=; fi
 	_file="${1#./}"
+	_suffix="$2"
 	${TEST} -f $_file || {
 		$_if_verbose ${ECHO} 1>&2 "$self: File $_file does not exist."
 		return 1
 	}
 	if ${TEST} -n "$checksum"; then
-		${CHECKSUM} -s ".pkgsrc.resume" $distinfo ${_file} || {
+		${CHECKSUM} -s "${_suffix}" $distinfo ${_file} || {
 			$_if_verbose ${ECHO} 1>&2 "$self: Checksum of the file $_file doesn't match."
 			return 1
 		}
@@ -223,7 +227,7 @@
 # If the file already exists and it verifies, then we don't need to fetch
 # it again.
 #
-if verify_file $path; then
+if verify_file $path ""; then
 	exit 0
 fi
 
@@ -232,32 +236,23 @@
 ${TEST} -w $fetchdir || ${ECHO} 1>&2 "$self: WARNING: DISTDIR `cd $fetchdir && pwd` looks non-writable."
 
 # Set the name of the output file.  In the "resume" case, we initialize
-# the fetch loop by ensuring that the temporary output file already
-# exists.
+# the fetch loop by providing the existing, possibly incomplete, file.
 #
-outputfile="$file"
+tmp_suffix=".$$"
+outputfile="$file$tmp_suffix"
 outputpath="$fetchdir/$outputfile"
+if ${TEST} -f $outputpath; then
+    ${ECHO} "Temporary download file ${outputpath} already existed, deleting"
+    ${RM} -f $outputpath
+fi
 if ${TEST} -n "$resume"; then
-	outputfile="${file}.pkgsrc.resume"
-	outputpath="$fetchdir/$outputfile"
-	if ${TEST} ! -f $outputpath; then
-		if ${TEST} -f $path; then
-			${CP} -f $path $outputpath
-		else
-			${RM} -f $outputpath
-			${TOUCH} $outputpath
-		fi
-	fi
-	#
-	# If the temporary file verifies, then we don't need to resume
-	# fetching it.
-	#
-	if verify_file $outputpath; then
-		${MV} -f $outputpath $path
-		exit 0
+	if ${TEST} -f $path; then
+		${CP} -f $path $outputpath
+	else
+		${TOUCH} $outputpath
 	fi
 	size=`${WC} -c < $outputpath`
-	${ECHO} "=> Downloaded size: $size bytes"
+	${ECHO} "=> Downloaded size (before resume): $size bytes"
 fi
 ${TEST} -z "$distsize" || ${ECHO} "=> Total size: $distsize $distunits"
 
@@ -287,8 +282,8 @@
 		${ECHO} 1>&2 "$self: Unable to fetch expected file $file"
 		continue
 	fi
-	if verify_file -v $outputpath; then
-		${TEST} -z "$resume" || ${MV} -f $outputpath $path
+	if verify_file -v $outputpath $tmp_suffix; then
+		${MV} -f $outputpath $path
 		break
 	fi
 	if ${TEST} -n "$resume"; then


Home | Main Index | Thread Index | Old Index