Re: rtVAX300 .. need help..
On 10/01/13 12:14 AM, Mouse wrote:
Snagged. Thank you for the pointer. Even without OCR, this is
going to be an interesting read.
I've never seen a document improved by OCR. Seen plenty ruined
beyond salvation. They should outlaw it.
The older a book is, the better typeset it is, and scanning it is the
Just imho as a typographer, of course.
You can't grep images.
Normally, my first operation on PDFs is to convert them to text files,
which text files I then use. The VARM is a prime candidate for
conversion to text, because even its diagrams are done as text (at
least in my VARM; even in the one bqt pointed at, the only things I've
seen so far that aren't text are the |d|i|g|i|t|a|l| logos - and the
large text on the cover, if you count that).
I would never suggest throwing away the original scans. But I would
suggest converting them to text anyway, for the sake of searching and
use in non-GUI interfaces.
Yes, that's a good idea, but I see too many books just OCR'd without any
originals, rendering them all but worthless.
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B