Re: [PLUG] OCR in linux

Walt Mankowski on 22 Feb 2005 19:22:28 -0000


From: Walt Mankowski <waltman@pobox.com>

To: plug@lists.phillylinux.org

Subject: Re: [PLUG] OCR in linux

Date: Tue, 22 Feb 2005 14:22:28 -0500

Reply-to: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>

Sender: plug-bounces@lists.phillylinux.org

User-agent: Mutt/1.5.6+20040907i

On Tue, Feb 22, 2005 at 11:37:55AM -0500, Gregson Helledy wrote:
> I have a .pdf file which I'd like to convert to text.
> I apt-got a package called gocr (and gocr-gtk, a frontend).
> gocr wants .pbm files, so I converted the .pdf to .pbm with
> ImageMagick, then used gocr on it.
>
> What I got from my 25-page .pdf file (which is text, not images)
> was a 1.7K text file of garbage. I am using Libranet, targeted
> at Debian stable.
>
> I know nothing about how ocr software works and thought I'd ask
> whether anyone has had luck with this package. One thought
> that occurred is that the creator of the document put in images
> of text, rather than text itself...is that possible?

Have you tried pdftotext? It's part of the xpdf package.

Walt
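For anyone finding this thread later, here is a rough sketch of how the pdftotext suggestion could be scripted. It assumes pdftotext is installed and on the PATH; the helper names pdftotext_command and extract_text are made up for illustration and are not part of xpdf. The -layout option is a pdftotext flag that tries to preserve the page layout. If the PDF turns out to hold scanned images rather than text, the output will most likely be empty or nearly so, and OCR would still be needed.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path, txt_path=None, keep_layout=False):
    """Build the argument list for a pdftotext call (does not run it).

    Hypothetical helper: if no output path is given, the text file is
    written next to the PDF with a .txt suffix.
    """
    pdf = Path(pdf_path)
    txt = Path(txt_path) if txt_path else pdf.with_suffix(".txt")
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the original physical layout
        cmd.append("-layout")
    cmd += [str(pdf), str(txt)]
    return cmd


def extract_text(pdf_path, txt_path=None, keep_layout=False):
    """Run pdftotext and return the path of the resulting text file."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; it ships with the xpdf package")
    cmd = pdftotext_command(pdf_path, txt_path, keep_layout)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[-1])
```

Calling extract_text("report.pdf") would run the equivalent of "pdftotext report.pdf report.txt" (report.pdf is just a placeholder name). Building the command separately from running it makes it easy to print and check before touching any files.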

___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
