Subject: Re: Meta package for pkgsrc developer tools
To: NetBSD Packages Technical Discussion List <tech-pkg@netbsd.org>
From: Ronald J. Roskens <roskens@sl.econet.com>
List: tech-pkg
Date: 08/08/2006 10:47:16
On Mon, 2006-08-07 at 19:42 -0400, Greg A. Woods wrote:
> At Sun, 06 Aug 2006 02:51:13 +0200,
> Joerg Sonnenberger wrote:
> >
> > On Sat, Aug 05, 2006 at 03:50:43PM -0400, Greg A. Woods wrote:
> > > At Sat, 05 Aug 2006 15:18:26 +0200,
> > > Joerg Sonnenberger wrote:
> > > >
> > > > On Fri, Aug 04, 2006 at 07:43:35PM -0700, John Nemeth wrote:
> > > > > On Dec 25, 1:00pm, Lubomir Sedlacik wrote:
> > > > > } On Sat, Aug 05, 2006 at 01:09:36AM +0900, Ryo HAYASAKA wrote:
> > > > > } > I think pkgclean and pkgfind are useful tools, too.
> > > > > }
> > > > > } remind me, why do we have such a useless package as pkgclean in pkgsrc,
> > > > > } again?
> > > > > }
> > > > > } (if you don't want clutter in your pkgsrc tree, mount it read-only and
> > > > > } set WRKOBJDIR. and when you feel like it, just wipe WRKOBJDIR clean
> > > > > } with rm. it's even faster!)
> > > > >
> > > > > cd /usr/pkgsrc && rm -r */*/work is pretty fast as well.
> > > >
> > > > But not really portable.
> > > > find /usr/pkgsrc -name work -exec rm -r {} \;
> > > > doesn't hit command line limits.
> > >
> > > > but that's not as fast or efficient as it could/should be. Use xargs(1)!
> >
> > Ironically, xargs is not necessarily faster. The command above needs
> > more execs, but has the advantage that find doesn't visit the work
> > directories.
>
> Guess all the metadata for my pkgsrc tree must be in the buffer cache
> much of the time! :-)
>
> Sorry, I didn't actually think about the guts of the work directories
> themselves since I haven't actually used pkgsrc that way for half a
> decade or so. :-) I was more worried about just the CVS subdirectories
> chewing up unnecessary time and I/Os.
>
>
> The sad thing is the shell glob example is about two orders of magnitude
> faster (at least when you don't have that many work subdirs) than any
> form or use of find.
>
> This machine's pkgsrc tree hasn't been touched or looked at since it was
> booted (it also doesn't have any work subdirs, just a few work symlinks,
> but it does seem to have enough buffer cache to store the whole tree's
> metadata):
>
> 19:19 [56] $ time sh -c 'echo */*/work > /dev/null'
> 11.05s real 0.01s user 0.57s system
> 19:19 [57] $ time sh -c 'echo */*/work > /dev/null'
> 0.04s real 0.00s user 0.03s system
> 19:20 [58] $ time sh -c 'echo */*/work > /dev/null'
> 0.04s real 0.01s user 0.03s system
> 19:20 [59] $ time sh -c 'echo */*/work > /dev/null'
> 0.04s real 0.01s user 0.02s system
> 19:20 [60] $ time find . -name work -print > /dev/null
> 37.83s real 0.39s user 2.69s system
> 19:21 [61] $ time find . -name work -print > /dev/null
> 37.76s real 0.37s user 2.17s system
> 19:22 [62] $ time find . \( -name CVS -prune \) -o \( -name work -print \) > /dev/null
> 28.24s real 0.22s user 1.25s system
> 19:23 [63] $ time find . \( -name CVS -prune \) -o \( -name work -print \) > /dev/null
> 25.98s real 0.21s user 1.10s system
>
> Thankfully GNU Find isn't really any faster:
>
> 19:27 [70] $ time gfind . \( -name CVS -prune \) -o \( -name work -print \) > /dev/null
> 26.76s real 0.19s user 1.29s system
> 19:28 [71] $ time gfind . -name work -print > /dev/null
> 37.26s real 0.39s user 2.05s system
>
>
> > Using maxdepth would have a similar result.
>
> Hmmm.... "find: -maxdepth: unknown option" :-)
>
> Perhaps you're thinking about "-prune", as per above? Or GNU Find?

What does

    /usr/bin/time find . -mindepth 3 -maxdepth 3 -name work > /dev/null

result in?

Ron
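
If the local find rejects -mindepth/-maxdepth (as the "unknown option" error quoted above suggests), a roughly equivalent depth limit can probably be sketched with -path and -prune. This is only a sketch under assumptions: it assumes a find that supports the -path primary, and list_work_dirs and the demo tree are hypothetical names made up for illustration, not anything shipped with pkgsrc.

```shell
#!/bin/sh
# Hypothetical helper: print entries named "work" exactly three levels
# below the given directory, without using -mindepth/-maxdepth.
list_work_dirs() {
    # The '*' in a find -path pattern also matches '/', so './*/*/*'
    # matches everything three or more levels down. -prune then stops
    # find from descending, so only depth-3 entries are ever tested.
    (cd "$1" && find . -path './*/*/*' -prune -name work -print)
}

# Throwaway demo tree standing in for /usr/pkgsrc (an assumption, so the
# sketch can be run anywhere without touching a real pkgsrc checkout).
demo=$(mktemp -d)
mkdir -p "$demo/cat1/pkgA/work/deep/work" \
         "$demo/cat1/pkgB/files" \
         "$demo/cat2/pkgC/work"

# Lists ./cat1/pkgA/work and ./cat2/pkgC/work; the nested
# work/deep/work is never reached because of -prune.
list_work_dirs "$demo" | sort

rm -rf "$demo"
```

Because of the -prune, nothing below the third level is visited, which includes the contents of any work directory. Whether that actually beats the shell glob on a given machine would still have to be measured with time(1).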