tech-pkg archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Optimising check-interpreter



On Tue, Sep 02, 2014 at 04:03:37PM +0100, Jonathan Perkin wrote:
 > This one is probably more controversial, but the gains are so large
 > that it's worth raising for discussion.
 > 
 > I'd like to change mk/check/check-interpreter.mk so that it only
 > checks executable files.  The rationale being that I can't think of
 > any cases where we'd care what the shebang of a non-executable file
 > is, and if a file is erroneously non-executable when it should be
 > executable, then that should be fixed prior to the check being run.

That seems reasonable to me... as long as we have some reason to
believe that non-executable and therefore unchecked files aren't going
to become executable after the check, perhaps due to the activities of
INSTALL or whatnot.

I wouldn't think that this would happen, but still...

Maybe instead the script could be converted to something higher-
performance? Right now it does a shell read loop over every file, and
those are notoriously slow in addition to requiring multiple execs per
file. It should be possible to do something along the lines of

${_CHECK_INTERP_FILELIST_CMD} | ${XARGS} ${GREP} -Hn '^#!' | \
   ${SED} -n 's/^\([^:]*\):1:#![[:space:]]*\([^[:space:]]*\).*/\2 \1/p' | \
   ${SORT} | ${AWK} '
        {
            if ($$1 != previnterp) { doprint(); previnterp = $$1; } 
            files[++nfiles] = $$2;
        }
        function doprint() {
            if (previnterp == "") {
                return;
            }
            printf "%s", previnterp;
            for (i=1;i<=nfiles;i++) {
                printf " %s", files[i];
            }
            printf "\n";
            delete files;
            nfiles = 0;
        }
   ' | while read interp files; do
        bad=''
        case "$$interp" of
            /bin/env | /usr/bin/env)
                bad="is not allowed"
            ;;
            *)
                if [ ! -f "$$interp" -a ! -f ${DESTDIR}"$$interp" ]; then
                    bad="does not exist"
                fi
            ;;
        esac
        if [ "x$bad" != x ]; then
            echo "[check-interpreter.mk] The interpreter $interp $bad in:"
            for f in $files; do
                echo "[check-interpreter.mk]     $f"
            done
        fi
   done

which is still a shell read loop, but loops only once per distinct
interpreter; this will normally be a small number. It could probably
be made to not loop, but then safe quoting of the interpreter strings
becomes a bit of a problem.

Of course, this assumes we're allowed to use xargs here, and I'm not
clear on that...

-- 
David A. Holland
dholland%netbsd.org@localhost


Home | Main Index | Thread Index | Old Index