RE: rtVAX300 .. need help..
Mouse [mouse%Rodents-Montreal.ORG@localhost] wrote:
> You can't grep images.
Obviously keeping the original is always a good idea (and storage is
pretty cheap these days). When I do OCR I generally add the text
to a copy of the original PDF so I can have some aspect of both worlds.
FWIW: I've never found OCR that's good enough that I'd dare to trust
it without the original image available (but it must be 3 or 4 years
since I tried to OCR anything).
> The VARM is a prime
> candidate for conversion to text, because even its diagrams
> are done as text (at least in my VARM; even in the one bqt
> pointed at, the only things I've seen so far that aren't text
> are the |d|i|g|i|t|a|l| logos - and the large text on the cover, if
> you count that).
The VARM was originally (and probably for its whole life) maintained as
a text file (or text files) and (sometimes) small chunks would be
circulated within engineering. (Never to me though, so OCR will be the
best you get.)