Susie J on Mon, 5 Feb 2001 20:32:18 -0500



Re: Beginner's problem, maybe with grep()


> Do you know about the Web Robots Database?

No, I didn't. However, it doesn't seem to be up to date. Google is listed as
pre-1.0, but is showing up as 1.2 in my logs.

> (http://info.webcrawler.com/mak/projects/robots/active.html)  That might
> be a better solution for you, because I might choose to view your
> robots.txt if I'm curious or nosy.  ;)  And then you'd have "Mozilla/4.72
> [en] (X11; U; Linux 2.2.18 i686)" (or thereabouts, depending on which
> machine I'm using) listed as a user agent...

I _knew_ that code section needed a disclaimer :). Yes, yes, yes, and not
all crawlers look at robots.txt (like the email harvesters). In general,
crawlers don't include a referrer, instead issuing a direct request. For
you, the check has expanded to be (on the referrers list) and (direct
request).
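
A minimal sketch of that combined check, written in Python rather than the
thread's Perl. It assumes the list is a set of user-agent strings and that a
direct request shows up in the log with an empty or "-" referrer field; the
function name and the "ExampleBot/1.2" entry are hypothetical, made up for
illustration.

```python
def is_probable_crawler(user_agent, referrer, listed_agents):
    """Flag a hit only when BOTH conditions hold:
    the agent is on the list AND the request came in with no referrer."""
    on_list = user_agent in listed_agents
    # Assumption: a missing referrer is logged as "" or "-".
    direct_request = referrer in ("", "-")
    return on_list and direct_request


# Hypothetical list: one made-up bot plus the browser string quoted above,
# which would land on the list after a curious visitor fetched robots.txt.
listed_agents = {
    "ExampleBot/1.2",
    "Mozilla/4.72 [en] (X11; U; Linux 2.2.18 i686)",
}

browser = "Mozilla/4.72 [en] (X11; U; Linux 2.2.18 i686)"
bot_hit = is_probable_crawler("ExampleBot/1.2", "-", listed_agents)
linked_hit = is_probable_crawler(browser, "http://example.com/", listed_agents)
```

Here the browser is on the list, but a page view that arrives by following a
link carries a referrer, so it is not flagged. A direct request from that same
browser would still be flagged, so the check narrows false positives rather
than eliminating them.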

Sue

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Christmas Baking with Susiej             All Christmas. All baking. All year.
susiej@christmas-baking.com                          www.christmas-baking.com