[Leaplist] Scanner Setup Help?

tonyb at abcc.linuxceptional.com
Wed Jul 15 13:52:35 EDT 2009


You may want to give http://code.google.com/p/tesseract-ocr/ a try.

I have to disagree with Steve on scanning at a high resolution. My
experience scanning low-resolution documents, e.g. newspapers and
magazines, is that it is best to find a scanner resolution that
matches the document. Too high a resolution will introduce unwanted
distortion. I have used 75 dpi with success. YMMV.
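
For the batch part, here is a rough sketch of how the feeder scan and the OCR pass could be scripted. It assumes SANE's scanimage and the tesseract command-line tool are both installed and that the HP is visible to SANE. The function names, the page_%03d.tif naming pattern, and the "ADF" source name are my own placeholders; --resolution and --source are backend-specific options, so check what `scanimage --help` lists for your device. The 75 dpi default is just the value mentioned above, not a recommendation for every document.

```python
import subprocess
from pathlib import Path


def scan_command(out_dir, dpi=75, source="ADF"):
    """Build a scanimage command that scans the feeder into numbered TIFFs.

    The source name ("ADF") is an assumption; backends name it differently.
    """
    pattern = str(Path(out_dir) / "page_%03d.tif")
    return [
        "scanimage",
        f"--batch={pattern}",      # one output file per sheet
        "--format=tiff",
        "--resolution", str(dpi),  # backend-specific option
        "--source", source,        # backend-specific option
    ]


def ocr_command(image_path):
    """Build a tesseract command; tesseract appends .txt to the output base."""
    image_path = Path(image_path)
    return ["tesseract", str(image_path), str(image_path.with_suffix(""))]


def scan_and_ocr(out_dir, dpi=75):
    """Scan everything in the feeder, then OCR each page in order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # scanimage may exit non-zero once the feeder is empty,
    # so its return code is not treated as a failure here.
    subprocess.run(scan_command(out_dir, dpi))
    for image in sorted(out_dir.glob("page_*.tif")):
        subprocess.run(ocr_command(image), check=True)
```

Calling scan_and_ocr("scans") should leave page_001.tif, page_001.txt and so on in the scans directory, one pair per sheet. Keeping the command builders separate from the function that runs them makes it easy to print the commands and try them by hand first.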

Cheers,

Tony

Bruce Metcalf wrote:
> Gang,
>
> I need to be able to scan, and hopefully OCR, large amounts of text --
> hundreds to thousands of pages -- in batches of 12 to 50 pages. Tools
> available are several Debian boxen and an HP All-in-One with a sheet
> feeder that connects through a USB port.
>
> I need to be able to stack up a sheaf of papers in the document feeder
> and load them all in a batch, then OCR them in a batch as well (or
> simultaneously, whatever).
>
> I've been testing things like OCRAD, Clara, and Kooka without
> satisfactory results, but that could just be me. (Their instructions are
> remarkably well hidden, if extant.) So far, the fastest approach is
> retyping, which I find philosophically unacceptable.
>
> Question 1: Does anyone have any suggestions about either specific
> software or a web site "how-to" for this?
>
> Question 2: If I can make it to Saturday's Hackfest, would anyone be
> willing to carry in the AIO (I've got a 10 pound limit post-surgery) and
> help me find a way to make this work?
>
> Question 3: Failing both of the above, any suggestions about how to do
> the same using a network scanner w/o a sheet feeder?
>
> TIA,
> Bruce
>

