Subject: Re: GNU "diff" considered damaged
To: Mike Cheponis <mac@Wireless.Com>
From: Greg A. Woods <woods@weird.com>
List: netbsd-help
Date: 07/13/2001 03:06:40
[ On Thursday, July 12, 2001 at 13:30:01 (-0700), Mike Cheponis wrote: ]
> Subject: GNU "diff" considered damaged
>
> Why are we using the braindead GNU diff? Can't we get one that actually
> works?
If you can find one that's less brain-dead I for one would be very
happy! Unfortunately GNU Diffutils is by far the most complete and
most usable freely available implementation of UNIX Diff that I know of.
I think everyone who was working on free implementations prior to the
appearance of GNU diff on the scene pretty much either gave up or put
their own efforts into helping with GNU diff.
Note that the original UNIX implementation, even the most recent version
of it, still has just as many "braindead" aspects, if not more. Most
people look at GNU Diffutils as one of the best implementations, free or
otherwise.
In other words I think the readme from GNU Diffutils is basically right:
    This directory contains the GNU diff, diff3, sdiff, and cmp utilities.
    Their features are a superset of the Unix features and they are
    significantly faster. [[...]]
As may have already been mentioned, please note also this blurb from the
"Shortcomings" node in the Texinfo manual:
    Handling Files that Do Not Fit in Memory
    ----------------------------------------

    `diff' operates by reading both files into memory. This method
    fails if the files are too large, and `diff' should have a fallback.
    One way to do this is to scan the files sequentially to compute hash
    codes of the lines and put the lines in equivalence classes based only
    on hash code. Then compare the files normally. This does produce some
    false matches.

    Then scan the two files sequentially again, checking each match to
    see whether it is real. When a match is not real, mark both the
    "matching" lines as changed. Then build an edit script as usual.

    The output routines would have to be changed to scan the files
    sequentially looking for the text to print.
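To make the idea in that blurb concrete, here is a rough Python sketch of the two-pass fallback it describes. This is not code from GNU diff. The function names (line_hashes, matched_pairs, verify, edit_script, two_pass_diff) are made up for illustration, zlib.crc32 is an arbitrary choice of line hash, and difflib.SequenceMatcher stands in for "compare the files normally", so the matching will not be the same as what GNU diff's own algorithm would produce. Only the per-line hash codes are held in memory, never the line text.

```python
import difflib
import zlib


def line_hashes(path, hash_fn):
    """Pass 1: scan a file sequentially, keeping one hash code per line."""
    with open(path, "rb") as f:
        return [hash_fn(line) for line in f]


def matched_pairs(ha, hb):
    """Compare the hash sequences; return (i, j) pairs of candidate matches."""
    sm = difflib.SequenceMatcher(None, ha, hb, autojunk=False)
    pairs = []
    for blk in sm.get_matching_blocks():  # final block has size 0
        for k in range(blk.size):
            pairs.append((blk.a + k, blk.b + k))
    return pairs


def verify(path_a, path_b, pairs):
    """Pass 2: rescan both files sequentially, dropping false matches."""
    real = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        ia = ib = -1
        la = lb = None
        for i, j in pairs:  # pairs increase in both i and j
            while ia < i:
                la = fa.readline()
                ia += 1
            while ib < j:
                lb = fb.readline()
                ib += 1
            if la == lb:  # the match is real only if the text agrees
                real.append((i, j))
    return real


def edit_script(n_a, n_b, real):
    """Turn the verified matches into delete/insert ranges (half-open)."""
    ops = []
    i = j = 0
    for mi, mj in real + [(n_a, n_b)]:  # sentinel flushes the tail
        if i < mi:
            ops.append(("delete", i, mi))  # lines i..mi-1 of the first file
        if j < mj:
            ops.append(("insert", j, mj))  # lines j..mj-1 of the second file
        i, j = mi + 1, mj + 1
    return ops


def two_pass_diff(path_a, path_b, hash_fn=zlib.crc32):
    ha = line_hashes(path_a, hash_fn)
    hb = line_hashes(path_b, hash_fn)
    pairs = matched_pairs(ha, hb)
    real = verify(path_a, path_b, pairs)
    return edit_script(len(ha), len(hb), real)
```

Passing a deliberately weak hash such as hash_fn=len is an easy way to force false matches and watch the second pass mark both "matching" lines as changed. The sketch returns only line-number ranges; as the manual says, the output routines would still need another sequential scan of the files to find the text to print.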
--
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@acm.org> <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>