Subject: pkg/33229: mk/bsd.pkg.mk check-vulnerable: awk speedup (possibly 10x)
To: None <pkg-manager@netbsd.org, gnats-admin@netbsd.org,>
From: None <bdragon@mailsnare.net>
List: pkgsrc-bugs
Date: 04/10/2006 04:30:04
>Number:         33229
>Category:       pkg
>Synopsis:       mk/bsd.pkg.mk check-vulnerable: awk speedup (possibly 10x)
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    pkg-manager
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Mon Apr 10 04:30:03 +0000 2006
>Originator:     Brandon Bergren
>Release:        N/A
>Organization:
>Environment:
Linux overbeck 2.6.16.1-linode18-bb1 #1 Fri Mar 31 12:39:01 EST 2006 i686 prescott i386 GNU/Linux
>Description:
(overly long description and analysis ahead)
I was looking for ways to make check-vulnerable faster on my Slackware box, and found a significant optimization in the awk script invoked by that target. I was able to get a ~10x performance gain from awk.

Going back through the CVS history, the current implementation of the affected code is a speedup patch described in http://mail-index.netbsd.org/netbsd-bugs/2003/04/29/0035.html
(PR pkg/21393)

This method was designed to avoid spawning pkg_admin for every vulnerability. It was committed by agc in version 1.1175.

Since then, the affected code has been changed around a bit, but the actual patterns have stayed the same.

Here's my go at the awk script.

The current code, around line 1241 of mk/bsd.pkg.mk:
---
  PKGBASE=${PKGBASE:Q}                            \
  ${AWK} '/^$$/ { next }                          \
          /^#.*/ { next }                         \
          $$1 !~ ENVIRON["PKGBASE"] && $$1 !~ /\{/ { next } \
          { s = sprintf("${PKG_ADMIN} pmatch \"%s\" %s && ${ECHO} \"*** WARNING - %s vulnerability in %s - see %s for more information ***\"", $$1, ENVIRON["PKGNAME"], $$2, ENVIRON["PKGNAME"], $$3); system(s); }' < ${PKGVULNDIR}/pkg-vulnerabilities || ${FALSE}; \
---
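
To illustrate where the time presumably goes: the original filter hands every line that mentions the package base anywhere in its pattern, plus every line containing a brace, to system(), and each system() call spawns a shell and pkg_admin. Below is a minimal sketch, in plain shell rather than make (so `$1` instead of `$$1`), that only counts those candidate lines instead of running anything. The function name count_old_candidates and the sample entries are made up for illustration; they are not part of pkgsrc.

```shell
#!/bin/sh
# count_old_candidates BASE (hypothetical helper): read lines in the
# pkg-vulnerabilities layout on stdin and count how many the original
# filter would pass on to system().
count_old_candidates() {
    PKGBASE="$1" awk '
        /^$/ { next }                      # skip empty lines
        /^#/ { next }                      # skip comments
        $1 !~ ENVIRON["PKGBASE"] && $1 !~ /\{/ { next }
        { n++ }                            # the original calls system() here
        END { print n + 0 }'
}

# Made-up sample data, for illustration only.
count_old_candidates foo <<'EOF'
# comment

foo<1.2 denial-of-service http://example.org/advisory-1
{bar,baz}<1.0 remote-code-execution http://example.org/advisory-2
libfoo<0.9 privilege-escalation http://example.org/advisory-3
qux<3.0 denial-of-service http://example.org/advisory-4
EOF
```

On this sample it prints 3: the foo entry, the brace entry, and libfoo (the base name is matched unanchored). Only the first one is actually about foo.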

I changed it to the below and got a ~10x (!) speedup:
---
  PKGBASE=${PKGBASE:Q}                          \
  ${AWK} '/^'${PKGBASE}'.*/ { s = sprintf("${PKG_ADMIN} pmatch \"%s\" %s && ${ECHO} \"*** WARNING - %s vulnerability in %s - see %s for more information ***\"", $$1, ENVIRON["PKGNAME"], $$2, ENVIRON["PKGNAME"], $$3); system(s); } ' < ${PKGVULNDIR}/pkg-vulnerabilities || ${FALSE}; \
---

Also, here's a possibly more correct version (only matches the exact basename -- ignores 'foobar' vulnerabilities when checking 'foo' -- and drops the now-unused PKGBASE environment assignment):
---
  ${AWK} '/^'${PKGBASE:Q}'[<=>].*/ { s = sprintf("${PKG_ADMIN} pmatch \"%s\" %s && ${ECHO} \"*** WARNING - %s vulnerability in %s - see %s for more information ***\"", $$1, ENVIRON["PKGNAME"], $$2, ENVIRON["PKGNAME"], $$3); system(s); } ' < ${PKGVULNDIR}/pkg-vulnerabilities || ${FALSE}; \
---
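
The difference between the two rewritten patterns can be sketched in plain shell (so `$0` here, where the Makefile needs `$$` escaping). This is only an illustration under assumptions: filter_loose and filter_exact are hypothetical helper names, the base name is passed with `awk -v` rather than spliced into the program text, and the sample entries are made up.

```shell
#!/bin/sh
# Made-up entries in the pkg-vulnerabilities layout (pattern, type, URL),
# for illustration only.
sample='# comment line
foo<1.2 denial-of-service http://example.org/advisory-1
foobar<2.0 remote-code-execution http://example.org/advisory-2
foo>=2.0<2.1 privilege-escalation http://example.org/advisory-3'

# filter_loose BASE (hypothetical): lines that merely start with BASE,
# like the /^BASE.*/ pattern.
filter_loose() {
    printf '%s\n' "$sample" | awk -v base="$1" '$0 ~ ("^" base) { print $1 }'
}

# filter_exact BASE (hypothetical): BASE must be followed directly by a
# version operator, like the /^BASE[<=>].*/ pattern.
filter_exact() {
    printf '%s\n' "$sample" | awk -v base="$1" '$0 ~ ("^" base "[<=>]") { print $1 }'
}

echo "loose:"; filter_loose foo
echo "exact:"; filter_exact foo
```

With this sample, the loose filter lets the foobar entry through when checking foo, while the exact filter keeps only the two foo entries.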

And here are some benchmarks:

Benchmark 1 (a package with an active vulnerability)
---before---
root@overbeck:/usr/pkgsrc/devel/gdb# /usr/bin/time bmake check-vulnerable
*** WARNING - 1228,privilege-escalation vulnerability in gdb-5.3nb4 - see http://secunia.com/advisories/15449/ for more information ***
1.58user 1.59system 0:09.09elapsed 34%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+182942minor)pagefaults 0swaps
---
---after---
root@overbeck:/usr/pkgsrc/devel/gdb# /usr/bin/time bmake check-vulnerable
*** WARNING - 1228,privilege-escalation vulnerability in gdb-5.3nb4 - see http://secunia.com/advisories/15449/ for more information ***
0.17user 0.13system 0:00.71elapsed 42%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+13107minor)pagefaults 0swaps
---
Benchmark 2 (a package with past vulnerabilities)
---before---
root@overbeck:/usr/pkgsrc/devel/zlib# /usr/bin/time bmake check-vulnerable
1.69user 1.72system 0:09.75elapsed 34%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+184492minor)pagefaults 0swaps
---
---after---
root@overbeck:/usr/pkgsrc/devel/zlib# /usr/bin/time bmake check-vulnerable
0.16user 0.13system 0:00.74elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+13974minor)pagefaults 0swaps
---
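
For reference, the ratios behind the "~10x" figure can be worked out from the numbers above. This is a small sketch; speedup is a hypothetical helper name, and reading the page-fault counts as a sign of process-spawning overhead is my assumption.

```shell
#!/bin/sh
# speedup BEFORE AFTER (hypothetical helper): print BEFORE/AFTER rounded
# to one decimal place.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

# Elapsed wall-clock seconds from the benchmarks above.
echo "gdb  elapsed: $(speedup 9.09 0.71)x"
echo "zlib elapsed: $(speedup 9.75 0.74)x"
# Minor page faults; assumed here to roughly track how many processes
# were spawned.
echo "gdb  faults:  $(speedup 182942 13107)x"
echo "zlib faults:  $(speedup 184492 13974)x"
```

The elapsed-time ratios come out at about 12.8x and 13.2x, so "~10x" is if anything a conservative summary of these two runs.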

Attached is the patch.

Thanks!
>How-To-Repeat:
Get annoyed at how slow check-vulnerable is on your box ;)
>Fix:
--- bsd.pkg.mk  2 Feb 2006 21:15:46 -0000       1.1798
+++ bsd.pkg.mk  10 Apr 2006 04:09:38 -0000
@@ -1238,11 +1238,7 @@
        fi;                                                             \
        if [ -f ${PKGVULNDIR}/pkg-vulnerabilities ]; then               \
                ${SETENV} PKGNAME=${PKGNAME:Q}                          \
-                         PKGBASE=${PKGBASE:Q}                          \
-                       ${AWK} '/^$$/ { next }                          \
-                               /^#.*/ { next }                         \
-                               $$1 !~ ENVIRON["PKGBASE"] && $$1 !~ /\{/ { next } \
-                               { s = sprintf("${PKG_ADMIN} pmatch \"%s\" %s && ${ECHO} \"*** WARNING - %s vulnerability in %s - see %s for more information ***\"", $$1, ENVIRON["PKGNAME"], $$2, ENVIRON["PKGNAME"], $$3); system(s); }' < ${PKGVULNDIR}/pkg-vulnerabilities || ${FALSE}; \
+                       ${AWK} '/^'${PKGBASE:Q}'[<=>].*/ { s = sprintf("${PKG_ADMIN} pmatch \"%s\" %s && ${ECHO} \"*** WARNING - %s vulnerability in %s - see %s for more information ***\"", $$1, ENVIRON["PKGNAME"], $$2, ENVIRON["PKGNAME"], $$3); system(s); } ' < ${PKGVULNDIR}/pkg-vulnerabilities || ${FALSE}; \
        fi

 .PHONY: do-fetch