Art Alexion on 14 Jun 2007 16:36:27 -0000



Re: [PLUG] pdf to text


On Thursday, 14 June 2007 12:07, Jon Nelson wrote:
> On Thu, 2007-05-31 at 09:28 -0400, Art Alexion wrote:
> > I need to make comments on a PDF I received.  I first tried using
> > pdftotext, but found that the text in PDF was a scanned image, so
> > pdftotext found no text to convert.
> >
> > I am thinking of two possible solutions -- neither of which may exist.
> >
> > Many years ago, I used win-based fax software that scanned fax images for
> > text and did an OCR.  Does anything like that -- the ability to scan and
> > OCR a pdf or extracted image -- exist for Linux?
> >
> > The other alternative is the ability to add comments as is possible with
> > the full acrobat windows/mac product.  Are there any tools available for
> > Linux that allow commenting?


> I have used scribus (http://www.scribus.net) in the past and it has not
> failed me yet.

This sounded promising, as PDFEdit seems to have trouble saving the edited
files.  Unfortunately, it failed for me.  The documents that I needed to
edit are contracts.  The person who sent them to me faxed them to himself and
then saved the fax images as PDF.  The result is that the PDF file has no
text, only images of text.  Perhaps that is the problem with Scribus.
In any case, Scribus does not seem to import a PDF directly.  I first had to
convert it using pdf2ps, and then import the resulting PostScript file.
Scribus could not import the images from the PostScript file.
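
For anyone who wants to repeat the conversion step, it could be scripted
roughly as below.  This is only a sketch: it assumes Ghostscript's pdf2ps is
on the PATH, and the helper names (pdf2ps_command, convert) and the file name
contract.pdf are hypothetical, made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def pdf2ps_command(pdf_path):
    """Build the pdf2ps command line for pdf_path (hypothetical helper)."""
    pdf = Path(pdf_path)
    # Write the PostScript file next to the PDF: contract.pdf -> contract.ps
    return ["pdf2ps", str(pdf), str(pdf.with_suffix(".ps"))]


def convert(pdf_path):
    """Run pdf2ps if it is installed; return the .ps path, or None if not."""
    cmd = pdf2ps_command(pdf_path)
    if shutil.which(cmd[0]) is None:
        return None  # assumption: pdf2ps ships with Ghostscript and may be absent
    subprocess.run(cmd, check=True)
    return cmd[2]


if __name__ == "__main__":
    # Only prints the command; call convert() to actually run it.
    print(pdf2ps_command("contract.pdf"))
```

Note that this only automates the conversion; it does not change the fact
that a scanned PDF yields PostScript containing nothing but page images.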

For this particular document, I used PDFEdit and the workaround I mentioned 
earlier.  I think the best solution in the future is to extract the images 
using pdfimages and then paste them into OpenOffice as suggested by Bhaskar.
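
A minimal sketch of the pdfimages step, in case it is useful.  It assumes the
pdfimages tool (from xpdf or poppler) is installed and uses its basic form,
"pdfimages PDF-file image-root", with no extra options.  The helper names,
the file name contract.pdf, and the output directory are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, out_dir):
    """Build the pdfimages command line (hypothetical helper)."""
    pdf = Path(pdf_path)
    # pdfimages writes files named <image-root>-NNN.<ext>
    root = Path(out_dir) / pdf.stem
    return ["pdfimages", str(pdf), str(root)]


def extract_images(pdf_path, out_dir="."):
    """Extract embedded images; return the files written (empty if no tool)."""
    if shutil.which("pdfimages") is None:
        return []  # pdfimages is not installed
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(pdfimages_command(pdf_path, out_dir), check=True)
    stem = Path(pdf_path).stem
    return sorted(Path(out_dir).glob(stem + "-*"))


if __name__ == "__main__":
    # Only prints the command; call extract_images() to actually run it.
    print(pdfimages_command("contract.pdf", "pages"))
```

Without options the images come out as PBM/PPM files, so they may need
converting to another format before they can be pasted into OpenOffice.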

-- 

_____________________________________________________________
Art Alexion

PGP fingerprint: 52A4 B10C AA73 096F A661  92D2 3B65 8EAC ACC5 BA7A
Keyserver: hkp://subkeys.pgp.net
The attachment - signature.asc - is my electronic signature; no need for 
alarm.  Info @ 
http://mysite.verizon.net/art.alexion/encryption/signature.asc.what.html
_____________________________________________________________


___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug