Rich Freeman on 5 May 2012 14:23:47 -0700

Re: [PLUG] Managing PDF Files on Linux?

On Sat, May 5, 2012 at 3:23 PM, Michael Dur wrote:
> If so then you could make a script that sorts them according to pdf2text
> output without viewing them.

Well, pdf2text won't do anything since there is no text in the files
to begin with.  Tesseract works passably.

I can already OCR them fine (well, as well as open-source OCR goes),
which I plan to use for some basic grep functionality, but I can't
rely on this as my sole method of retrieval. If I should end up with a
dead car engine at 45k miles it would be nice if I could prove I've
serviced it at every interval without rereading every receipt I've
ever scanned, assuming I never throw the wrong ones out by mistake.
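
For what it's worth, the "basic grep" part could look roughly like the
sketch below. It assumes the OCR text for each scan has already been saved
as a .txt file next to the PDF with the same base name (receipt1.pdf plus
receipt1.txt); that layout and the function name grep_ocr_text are
hypothetical, not something any of the tools above enforce. It only
searches the text; it does not run the OCR itself.

```python
from pathlib import Path


def grep_ocr_text(root, keyword):
    """Return (pdf_path, line_number, line) for each OCR line containing keyword.

    Assumes a hypothetical sidecar layout: foo.pdf with its OCR text in foo.txt.
    """
    hits = []
    needle = keyword.lower()
    for txt in sorted(Path(root).rglob("*.txt")):
        pdf = txt.with_suffix(".pdf")
        if not pdf.exists():
            continue  # a .txt with no matching PDF is not an OCR sidecar
        # OCR output is often messy, so replace undecodable bytes instead of failing
        with txt.open(encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if needle in line.lower():  # case-insensitive substring match
                    hits.append((pdf, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for pdf, lineno, line in grep_ocr_text(".", "oil change"):
        print(f"{pdf}:{lineno}: {line}")
```

A hit only tells you which scan to open; given OCR quality, the match
still has to be checked against the image by eye.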

I really need to visually verify them...

But this is the issue I'm running into. There are 47 ways of
digitizing things, and zero ways of keeping track of them without
"just writing some scripts."

