Mouse [mouse%Rodents-Montreal.ORG@localhost] wrote:

> You can't grep images.

Obviously keeping the original is always a good idea (and storage is
pretty cheap these days). When I do OCR I generally add the text
to a copy of the original PDF so I can have something of both worlds.

FWIW: I've never found OCR that's good enough that I'd dare to trust
it without the original image available (but it must be 3 or 4 years
since I tried to OCR anything).

>                                       The VARM is a prime
> candidate for conversion to text, because even its diagrams
> are done as text (at least in my VARM; even in the one bqt
> pointed at, the only things I've seen so far that aren't text
> are the |d|i|g|i|t|a|l| logos - and the large text on the cover, if
> you count that). 

The VARM was originally (and probably for its whole life) maintained as
a text file (or text files), and (sometimes) small chunks would be
circulated within engineering. (Never to me, though, so OCR will be the
best you get for now.)


