TuskenTower on 29 Jul 2007 22:20:59 -0000
Jeff,

I don't have a suggestion, but do you mind sharing your Perl and lex magic? I have looked at PDFs briefly to do some text extraction but couldn't find what I wanted, so I copied the data out manually, which took 3 hours.

Amul

On 7/29/07, Jeff Abrahamson <jeff@purple.com> wrote:
> I want to extract the comments from a pdf. I know how to do it (in an
> ugly way) with perl, if I assume no comment contains a
> close-parenthesis. I know how to do it with lex if I don't make that
> assumption, but I chose to make that assumption to avoid writing a
> lexer. ;-)
>
> I'm curious if anyone knows an easier way than the above. The problem
> is solved for this time, but I'd like to be more elegant about it next
> time if I can. These are the comments that show up as yellow
> highlights that you have to click on in acrobat reader but that xpdf
> doesn't seem to know how to display.
>
> --
> Jeff
>
> Jeff Abrahamson <http://jeff.purple.com/>

___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
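Jeff's own Perl and lex code is not shown in this thread, so the following is not his method. It is only a minimal Python sketch of the idea he describes: find the comment text of each annotation and read it without assuming that a comment never contains a close-parenthesis. It assumes (none of this comes from the thread) that the annotation dictionaries sit uncompressed in the file, that each comment is stored as a literal `/Contents (...)` string rather than a hex or UTF-16 string, and that Latin-1 decoding is acceptable. The function names are made up for illustration.

```python
import re

# Single-character escapes in PDF literal strings (byte value -> byte value).
_ESCAPES = {
    ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09, ord("b"): 0x08,
    ord("f"): 0x0C, ord("("): 0x28, ord(")"): 0x29, ord("\\"): 0x5C,
}

def read_literal_string(data, start):
    """Parse the literal string whose opening '(' is at data[start].

    Returns (raw bytes of the string, index just past the closing ')').
    Unescaped parentheses may nest, so a depth counter is kept.
    """
    if data[start:start + 1] != b"(":
        raise ValueError("expected '(' at start")
    out = bytearray()
    depth = 1
    i = start + 1
    n = len(data)
    while i < n:
        c = data[i]
        if c == 0x5C:                      # backslash starts an escape
            i += 1
            if i >= n:
                break
            e = data[i]
            if e in _ESCAPES:
                out.append(_ESCAPES[e])
                i += 1
            elif 0x30 <= e <= 0x37:        # octal escape, 1 to 3 digits
                j = i
                while j < n and j < i + 3 and 0x30 <= data[j] <= 0x37:
                    j += 1
                out.append(int(data[i:j], 8) & 0xFF)
                i = j
            elif e in (0x0A, 0x0D):        # backslash-newline: line continuation
                i += 1
                if e == 0x0D and i < n and data[i] == 0x0A:
                    i += 1
            else:                          # unknown escape: drop the backslash
                out.append(e)
                i += 1
        elif c == 0x28:                    # nested '('
            depth += 1
            out.append(c)
            i += 1
        elif c == 0x29:                    # ')' closes one level
            depth -= 1
            if depth == 0:
                return bytes(out), i + 1
            out.append(c)
            i += 1
        else:
            out.append(c)
            i += 1
    raise ValueError("unterminated literal string")

# Only match /Contents followed by a literal string; a page's
# "/Contents 12 0 R" reference to its content stream is skipped.
_CONTENTS = re.compile(rb"/Contents\s*\(")

def extract_comments(data):
    """Return the text of every literal /Contents string found in data."""
    comments = []
    pos = 0
    while True:
        m = _CONTENTS.search(data, pos)
        if m is None:
            break
        try:
            raw, pos = read_literal_string(data, m.end() - 1)
        except ValueError:
            break
        comments.append(raw.decode("latin-1"))
    return comments
```

To try it on a file, read the bytes with `open(path, "rb").read()` and pass them to `extract_comments`. If the PDF stores its annotations inside compressed object streams, this scan will find nothing, and a real PDF library would be the better tool.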