Re: [PLUG] OCR in linux

Kuzman Ganchev on 23 Feb 2005 03:03:41 -0000

From: Kuzman Ganchev <kuzman@sccs.swarthmore.edu>

To: plug@lists.phillylinux.org

Subject: Re: [PLUG] OCR in linux

Date: Tue, 22 Feb 2005 22:03:38 -0500

Reply-to: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>

Sender: plug-bounces@lists.phillylinux.org

User-agent: Mutt/1.5.6+20040907i

On Tue, Feb 22, 2005 at 11:37:55AM -0500, Gregson Helledy wrote:
> I have a .pdf file which I'd like to convert to text.
> I apt-got a package called gocr (and gocr-gtk, a frontend).
> gocr wants .pbm files, so I converted the .pdf to .pbm with
> ImageMagick, then used gocr on it.

What resolution was the output file in? Take a look at the image
(e.g. using display) and see if it looks pixelated. OCR needs it to be
pretty clear.

Oh, there are also a lot of other packages. One list is at:
http://www.linux-ocr.ekitap.gen.tr/

> I know nothing about how ocr software works and thought I'd ask
> whether anyone has had luck with this package.

I used it a long time ago, and it was OK but certainly not great.

> One thought
> that occurred is that the creator of the document put in images
> of text, rather than text itself...is that possible?

It is possible (esp. if the document was scanned), but that's what OCR
is for. If it's not a scanned file pdftotext should be able to deal
with it (much better than doing OCR).

Kuzman
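For what it's worth, the "try pdftotext first, fall back to OCR" idea above could be scripted roughly like this. It is only a sketch under some assumptions: the 300 dpi density, the 20-character threshold, and all the function names are mine and purely illustrative, it only OCRs the first page (the `[0]` suffix), and it expects `pdftotext`, ImageMagick's `convert`, and `gocr` to be on the PATH.

```python
import subprocess
import sys
from pathlib import Path


def pdftotext_cmd(pdf, txt):
    """Command that extracts an embedded text layer, if the PDF has one."""
    return ["pdftotext", "-layout", str(pdf), str(txt)]


def rasterize_cmd(pdf, pbm, dpi=300):
    """ImageMagick command that renders page 1 of the PDF as a bitmap.

    -density comes before the input file so it controls the rendering
    resolution; a low default density is a common cause of pixelated
    images that OCR cannot read. The 300 dpi value is an assumption.
    """
    return ["convert", "-density", str(dpi), f"{pdf}[0]", str(pbm)]


def gocr_cmd(pbm, txt):
    """Command that runs gocr on a bitmap and writes the recognized text."""
    return ["gocr", "-i", str(pbm), "-o", str(txt)]


def has_text_layer(extracted, min_chars=20):
    """Heuristic (an assumption): enough non-whitespace means real text."""
    return sum(not c.isspace() for c in extracted) >= min_chars


def pdf_to_text(pdf, workdir="."):
    """Try pdftotext first; fall back to rasterize + gocr for scanned files."""
    stem = Path(pdf).stem
    txt = Path(workdir) / f"{stem}.txt"
    subprocess.run(pdftotext_cmd(pdf, txt), check=True)
    if has_text_layer(txt.read_text(errors="replace")):
        return txt
    pbm = Path(workdir) / f"{stem}.pbm"
    subprocess.run(rasterize_cmd(pdf, pbm), check=True)
    subprocess.run(gocr_cmd(pbm, txt), check=True)
    return txt


if __name__ == "__main__" and len(sys.argv) > 1:
    print(pdf_to_text(sys.argv[1]))
```

Saved as, say, `pdf2txt.py` (a made-up name), it would be run as `python pdf2txt.py file.pdf`. If the OCR output is still poor, raising `dpi` and checking the .pbm with display is the first thing to try.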

