George Theall on 5 May 2004 20:00:02 -0000



Re: [PLUG] Do no robots texts work?


On Wed, May 05, 2004 at 03:31:24PM -0400, kaze wrote:

> In other words, would it be correct to say that a malevolent harvester
> spider would ignore the robots.txt, but an engine built on or seeded by
> Google would 'honor' the robots.txt?

I'm not sure what you mean by "an engine built on or seeded by Google". 
That said, I've never seen Google's crawler fetch anything excluded by
a site's robots.txt.
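
For illustration, here is a minimal sketch of the check a well-behaved
robot makes before fetching a URL, using Python's standard
urllib.robotparser module.  The robots.txt contents, the /private/
path, and the "ExampleBot" agent name are all made up for this example;
it is not a description of how Google's crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the path is an assumption
# chosen only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots_txt permits agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

A polite robot would skip any URL for which is_allowed() returns False.
A harvester simply never makes the check.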

> (More convoluted still, might a robots.txt
> expose you more as some would search just for them figuring there is
> something hidden?)

I maintain a few small sites, monitor my logs pretty closely, and have a
couple of traps for bad robots, including a bogus entry in each site's
robots.txt file telling robots not to visit a non-existent area of the
site.  While I do find plenty of examples of 'bots that completely
ignore restrictions in robots.txt, I can't recall the last time I saw
anything try to visit that non-existent area.  In other words, I see no
evidence that anything is mining robots.txt for hidden areas.
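
A rough sketch of how such a trap could be checked, assuming the web
server writes Common Log Format access logs.  The /no-such-area/ path
and the trap_hits() helper are hypothetical names for this example, not
what I actually run.  The idea is that the path is listed only in
robots.txt and linked from nowhere, so any request for it most likely
came from something that read robots.txt and went looking.

```python
import re

# Hypothetical trap: robots.txt would carry "Disallow: /no-such-area/",
# a path that exists nowhere on the site and is linked from nowhere.
TRAP_PREFIX = "/no-such-area/"

# Captures the client address and the request path from a
# Common Log Format line (assumed log format).
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)')

def trap_hits(log_lines, trap_prefix=TRAP_PREFIX):
    """Return (client, path) pairs for requests under the trap prefix."""
    hits = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_prefix):
            hits.append((match.group(1), match.group(2)))
    return hits
```

Running trap_hits() over an access log (for example, over the lines of
an open file) lists the clients that requested the trap path.  An empty
result matches what I described above.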

George
-- 
theall@tifaware.com
