JP Vossen on 1 Dec 2008 15:13:33 -0800


Re: [PLUG] RSS readers and Googling locally


 > Date: Sun, 30 Nov 2008 23:59:29 -0500
 > From: jeff <jeffv@op.net>

<snip>
 > The second item on our hit parade is a device/appliance/software to
 > crawl the corporate intranet and provide a Google-like service.
 > Google has signaled their willingness to *sell* us an appliance for
 > this but I don't exactly trust that particular vendor.
 >
 > Unfortunately I don't know what this is called.  I did some searches
 > under content/knowledge management and came up dizzy.  Some of that
 > software looked like it required its own team to implement and run.
 >
 > In terms of ease of use, is there any hope for an open source solution
 > or am I better off just letting them go Google?

I spent a lot of time looking for that in early 2007.  What I found was
mostly Swish-e, Nutch and Lucene, but they are all basically engines
that require you to roll your own solution around them.  The best
packaged option I found at the time was the Google appliance, which was
not acceptable for various reasons, including lack of trust and no budget.
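
To give a feel for what "roll your own" means, here is a toy sketch (not
how any of those projects is actually implemented) of the inverted index
idea that full-text engines are generally built around.  The page names,
page text and function names are all made up for illustration; a real
setup would still need a crawler, document converters, ranking and a
search front end.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query word (AND search)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical intranet pages; a real crawler would fetch these.
docs = {
    "wiki/vpn": "How to set up the VPN client on Linux",
    "hr/holidays": "Company holiday schedule for 2009",
    "wiki/printers": "Set up printers on Linux and Windows",
}
index = build_index(docs)
print(sorted(search(index, "linux set up")))
# prints ['wiki/printers', 'wiki/vpn']
```

Everything beyond this (stemming, relevance ranking, incremental
re-indexing, access control) is the part you would have to assemble
yourself around one of the engines.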

Just found this, looks interesting:
http://www.searchtools.com/tools/tools.html
http://www.searchtools.com/tools/tools-opensource.html


Here are my links from 2007:
* http://ask.slashdot.org/article.pl?sid=07/10/01/1959239
  Best Way to Build a Searchable Document Index? (2007-10-01)
* http://ask.slashdot.org/article.pl?sid=05/05/18/152258&tid=230&tid=4
  Search Engines for Your Intranet or Small Business? (2005-05-18)
* http://swish-e.org/
  Swish-e (MS Office, PDF, etc. also)
* http://www.htdig.org/
  ht://Dig (web-based HTML and text only)
* http://search.mnogo.ru/
  Mnogosearch
* http://lucene.apache.org/nutch/
  Nutch
* http://search.oregonstate.edu/
  Live use of Nutch, looks just like Google
* http://books.slashdot.org/books/05/08/24/1645211.shtml?tid=185&tid=95&tid=6
  Slashdot book review: Lucene in Action
* http://www.namazu.org/
  Namazu
* http://www.xapian.org/
  Xapian is an Open Source Search Engine Library

Perhaps some of these projects have advanced since I looked, though from
a quick glance at some of them, htdig is circa 2004 and Nutch is still
2007.  The Lucene engine is newer than that.

See also:
http://en.wikipedia.org/wiki/Full_text_search
http://en.wikipedia.org/wiki/Web_search_engine
http://www.searchtools.com/
http://forums.whirlpool.net.au/forum-replies-archive.cfm/1015795.html
http://www.google.com/search?q=intranet+index+search+engine+%22open+source%22
http://www.google.com/search?q=intranet+search

Maybe another Ask Slashdot is in order, if this group has nothing 
better?  (Also, I'd love to hear your resolution and maybe see a preso 
on it.)

Good luck,
JP
----------------------------|:::======|-------------------------------
JP Vossen, CISSP            |:::======|        jp{at}jpsdomain{dot}org
My Account, My Opinions     |=========|      http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug