JP Vossen on 1 Dec 2008 15:13:33 -0800


Re: [PLUG] RSS readers and Googling locally

 > Date: Sun, 30 Nov 2008 23:59:29 -0500
 > From: jeff <>

 > The second item on our hit parade is a device/appliance/software to
 > crawl the corporate intranet and provide a Google-like service.
 > Google has signaled their willingness to *sell* us an appliance for
 > this but I don't exactly trust that particular vendor.
 > Unfortunately I don't know what this is called.  I did some searches
 > under content/knowledge management and came up dizzy.  Some of that
 > software looked like it required its own team to implement and run.
 > In terms of ease of use, is there any hope for an open source solution
 > or am I better off just letting them go Google?

I spent a lot of time looking for that in early 2007.  What I found was 
mostly Swish-e, Nutch and Lucene, but they are all basically engines 
that require you to roll-your-own solution.  The best I found at the 
time was the Google appliance, which was not acceptable for various 
reasons including lack of trust and no budget.
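
To give a feel for what "roll-your-own" means here: the engines handle
indexing and querying, but crawling, document conversion and the search
page are left to you.  Below is a toy sketch of just the indexing core
in Python.  It is not how Swish-e, Nutch or Lucene work internally, and
the function names and sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical intranet pages, stand-ins for crawled content.
docs = {
    "wiki/vpn.html": "How to set up the VPN client on Linux",
    "hr/holidays.txt": "Holiday schedule and vacation policy",
    "wiki/backup.html": "Backup policy for Linux servers",
}
index = build_index(docs)
print(search(index, "linux policy"))
```

Everything a real deployment needs beyond this (fetching pages, parsing
PDF and MS Office files, ranking, a web front end) is the part you end
up building around the engine.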

Just found this, looks interesting:

Here are my links from 2007:
*	Best Way to Build a Searchable Document Index? (2007-10-01)
*	Search Engines for Your Intranet or Small Business? (2005-05-18)
*	Swish-e (MS Office, PDF, etc. also)
*	ht://Dig (web-based HTML and text only)
*	Mnogosearch
*	Nutch
*	Live use of Nutch, looks just like Google
*	Slashdot book review: Lucene in Action
*	Namazu
*	Xapian is an Open Source Search Engine Library

Perhaps some of these projects have advanced since I looked, though I 
glanced quickly at some: htdig is circa 2004 and Nutch is still 
2007, though the Lucene engine is newer than that.

See also:

Maybe another Ask Slashdot is in order, if this group has nothing 
better?  (Also, I'd love to hear your resolution and maybe see a preso 
on it.)

Good luck,
JP Vossen, CISSP            |:::======|        jp{at}jpsdomain{dot}org
My Account, My Opinions     |=========|
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
