Subject: Re: Recursive grep (where is limfree defined?)
To: Charles M. Hannum <mycroft@NetBSD.ORG>
From: Don Lewis <gdonl@gv.ssi1.com>
List: current-users
Date: 02/02/1996 03:12:35
On Jan 30, 3:00am, "Charles M. Hannum" wrote:
} Subject: Re: Recursive grep (where is limfree defined?)
}
} If you're going to port something, _port_ it. Make it work EVERYwhere.
} Otherwise it's as useless to the general populace as the scripts I keep
} around for doing my job.
}
} By that logic, we would be compelled to remove large chunks of our
} source tree. Such an attitude is why it often takes years for a
} better tool to be adopted (cf. the compress vs. gzip wars), and if
} anything only stunts progress.
}
} On the other hand, I have mixed feelings about a `recursive grep'.
} This is not an endorsement.
I'm also not sure that `recursive grep' is an improvement.
1) It is argued that beginning users won't know that they can use
find to execute grep recursively, whereas the grep man page
would tell them about -R. Ok, beginning users get off easy until
they need to run some other command recursively, at which point
they have to learn about find anyway.
2) The new options to grep that control recursion are slightly different
from the recursion options on chmod, chown, etc., due to a collision
with an existing grep option. If recursion is added to other commands,
how many different variations will there be? Is it easier to memorize
N variations on the options for recursion (or look them up each time),
or just to learn find, which always works the same way?
3) There's no way to just grep *.c files or whatever (unless you
add another option to grep to specify a glob pattern). If you
want to do this, you have to learn how to use find.
4) Even with the new grep option to exclude binary files, a recursive
grep of a directory tree containing a substantial percentage of
binary files will run slower than the find/xargs method, provided the
file type can be distinguished with a glob pattern: the recursive
grep must open and read a portion of each binary file, while the
find/xargs method skips those files entirely.
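To make points 3 and 4 concrete, here is a sketch of the find/xargs method restricted to *.c files; the file names and the search pattern are invented for illustration. The binary file is never opened or read, because the glob excludes it before grep ever runs:

```shell
# Sketch: search only *.c files; image.bin is skipped by the glob,
# so grep never opens it.  All names here are illustrative.
tmp=$(mktemp -d)
printf 'static int limfree;\n' > "$tmp/vm.c"
printf 'int main(void) { return 0; }\n' > "$tmp/main.c"
head -c 1024 /dev/zero > "$tmp/image.bin"   # a binary file

find "$tmp" -name '*.c' -print | xargs grep limfree
rm -rf "$tmp"
```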
The disadvantages of find/xargs are:
1) It's not obvious to new users.
2) It's cumbersome to type.
3) The standard versions of find/xargs are not safe and robust.
4) grep -R is probably more efficient than find/xargs if you truly
want to grep everything in the tree, since its filesystem
references will have better locality.
Problem 3 can be fixed with the -print0 and -0 flags, at the expense
of worsening problems 1 and 2.
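Concretely, with an invented file name containing a space, the unsafe and safe pipelines look like this. The plain -print pipeline parses names on whitespace, so "odd name.c" is split into two bogus arguments; with -print0/-0 the names are NUL-delimited and pass through unparsed:

```shell
# Unsafe vs. safe: the first pipeline splits "odd name.c" on the
# space and fails; the second passes the name through intact.
tmp=$(mktemp -d)
printf 'limfree\n' > "$tmp/odd name.c"

find "$tmp" -type f -print  | xargs grep limfree      # breaks on the space
find "$tmp" -type f -print0 | xargs -0 grep limfree   # finds the match
rm -rf "$tmp"
```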
I would propose that, in addition to the -print0 and -0 flags, a
flag be added to find that is similar to -exec but gathers a number
of file names together before running the command. This would have
the efficiency of xargs, but could be made safe, since the arguments
would be passed directly to the exec() call rather than being parsed,
avoiding problems with nasty characters embedded in the arguments.
This enhancement to find would help a bit with problem 2.
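For illustration, the proposed behavior can be sketched with the `-exec command {} +' form that later find implementations adopted (the proposal here predates that syntax, so treat the spelling as an assumption). Many names are gathered into one grep invocation, and since they go straight to exec() with no shell in between, embedded spaces and other nasty characters are harmless:

```shell
# Sketch of the proposed "gathering" exec, spelled with the later
# `{} +' syntax: one grep invocation over many files, no parsing.
tmp=$(mktemp -d)
printf 'limfree\n'      > "$tmp/odd name.c"   # space is harmless here
printf 'nothing here\n' > "$tmp/other.c"

find "$tmp" -name '*.c' -exec grep limfree {} +
rm -rf "$tmp"
```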
Unfortunately, neither of these fixes helps if you want to
find whatever -print | filter | xargs command
since most off-the-shelf filter commands don't operate on
NUL-delimited strings.
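For example, a pipeline like the following stays NUL-safe only if every filter in the middle understands NUL-delimited records. GNU sort's -z flag is one extension that does (an assumption about the installed tools, and not universally available); a traditional sort in the same position would mangle the stream:

```shell
# The middle filter must speak NUL-delimited records for the
# pipeline to stay safe; GNU sort -z does, most filters do not.
tmp=$(mktemp -d)
printf 'limfree\n' > "$tmp/b file.c"
printf 'limfree\n' > "$tmp/a file.c"

find "$tmp" -type f -print0 | sort -z | xargs -0 grep limfree
rm -rf "$tmp"
```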
--- Truck