Rich Freeman on 6 May 2012 13:45:53 -0700
Re: [PLUG] Managing PDF Files on Linux?
On Sun, May 6, 2012 at 2:23 PM, Jack Hill <ejh56@drexel.edu> wrote:
> I haven't done this, but it sounds like the sort of problem semantic
> desktops are targeted at solving. Maybe look at kde/dolphin/nepomuk
> <http://nepomuk.kde.org/discover/user>.

Ugh, I have USE="-semantic-desktop" set for a reason... :)

Systems like this rely on being able to search the content of files -
they usually don't involve tagging/browsing. I really need manual
classification to ensure that everything that needs to be saved ends up
being saved. I can't really see typing "unimportant stuff" into a search
box, getting 500 results, and hitting delete without looking at them.
If I had classified files into a "save for a year" folder I'd do the
same without any worry.

> Here is a linuxquestions.org thread
> that looks like it might be relevant
> <http://www.linuxquestions.org/questions/linux-software-2/need-a-file-based-tagging-file-organizer-926595/>.
>

Yeah, I've checked out a few of those options. Referencer seems like
the best of them, though it doesn't really match my use case per se.
However, I can't seem to figure out where it is actually hosted.

I've seen lots of ways to tag files, but none with integrated PDF
viewers. I can already manually sort my files into directory trees, so
tagging really solves the part of the problem I already have a solution
for. I just want to be able to run through 100 PDF files without
double-clicking on each one, then re-selecting the file and dragging it
to the right folder.

> From your description of the problem it sounds like you have two files for
> everything that you scan: the raster as a pdf and the OCRed text as plain
> text. Is this correct? It might simplify organization to get the text
> included in the pdf.

That is probably worth a shot. If anybody has suggestions on how to do
this I'm open to them. Googling around suggests that watchocr might be
useful, but after a few minutes of clicking around I can't find the
source to download. Just a bunch of useless .deb files.

Odd - two projects that both seem promising but which seem to make it
very hard to find the source tarballs...

Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
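
The "run through 100 PDF files" workflow described in the message (look at
a file, pick a folder, move on to the next one) could be approximated with
a small script. What follows is only a rough sketch and not an existing
tool: the names triage and ask_on_terminal are made up for illustration,
the viewer command is an assumption (any program that accepts a file path
as its argument should do), and the folder names are placeholders.

```python
import shutil
import subprocess
from pathlib import Path


def triage(src, dest_root, categories, choose, viewer_cmd=None):
    """Show each PDF in src, ask for a category, and move the file there.

    choose(pdf, categories) returns a category name, or None to skip.
    viewer_cmd is an optional command list, e.g. ["xdg-open"] (assumption:
    the command takes the file path as its last argument).
    Returns a list of (old_path, new_path) for the files that were moved.
    """
    src, dest_root = Path(src), Path(dest_root)
    moved = []
    # sorted() builds the full list first, so moving files is safe.
    for pdf in sorted(src.glob("*.pdf")):
        viewer = None
        if viewer_cmd:
            viewer = subprocess.Popen([*viewer_cmd, str(pdf)])
        category = choose(pdf, categories)
        # Only closes viewers started directly; launchers such as
        # xdg-open exit right away and leave the window open.
        if viewer is not None and viewer.poll() is None:
            viewer.terminate()
        if category is None:
            continue  # leave the file where it is
        if category not in categories:
            raise ValueError(f"unknown category: {category!r}")
        target_dir = dest_root / category
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / pdf.name
        if target.exists():
            # Refuse to overwrite an already-filed document.
            raise FileExistsError(target)
        shutil.move(str(pdf), str(target))
        moved.append((pdf, target))
    return moved


def ask_on_terminal(pdf, categories):
    """Prompt with numbered categories; empty input skips the file."""
    for i, name in enumerate(categories, 1):
        print(f"  {i}) {name}")
    answer = input(f"{pdf.name} -> category number (Enter to skip): ").strip()
    return categories[int(answer) - 1] if answer else None
```

It might be called as triage("scans", "filed", ["save-for-a-year",
"keep-forever"], ask_on_terminal, viewer_cmd=["xdg-open"]). Keeping the
prompt in a separate function means the one-keystroke-per-file part can be
swapped for something else later without touching the code that moves
files.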