Mark Dominus on Sat, 28 Jun 2003 12:58:23 -0400



Amusing trivial program


        #!/usr/bin/perl

        if (! @ARGV && -t STDIN) {
          @ARGV = qw(/usr/local/apache/logs/perl-access_log);
        }

        while (<>) {
          chomp;
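          # In my log, field 6 is the request path and field 10 the quoted referrer.
          my ($target, $referrer) = (split)[6,10];
          next unless $referrer =~ /google\.com/;
          $referrer =~ s/"$//;                           # strip the closing quote
          my ($q) = $referrer =~ m/(?<=[&?])q=([^&]*)/;  # the q= search terms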
          next unless defined $q;
          unescape($q);
          $ref{$target}{lc $q}++;
        }

        for my $target (sort keys %ref) {
          print $target, "\n";
          for my $ref (sort keys %{$ref{$target}}) {
            printf "\t%4d %s\n", $ref{$target}{$ref}, $ref;
          }
        }

        sub unescape {
          $_[0] =~ tr/+/ /;
          $_[0] =~ s/%([0-9a-fA-F]{2})/chr hex $1/ge;
        }



Given my Apache httpd log file, the program above produces output
consisting of several sections of the form:

        /yak/dirty/
                   3 dirty stories
                   1 how to dirty talk
                   1 perl regex

which tells us that yesterday, one person arrived at
http://perl.plover.com/yak/dirty/ after doing a Google search for
'perl regex', one after searching for 'how to dirty talk', and three
after searching for 'dirty stories'.  That is, at least four of these
five people were disappointed.
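
To make the counting concrete, here is a minimal sketch of what happens
to a single log line. It assumes Apache's combined log format, so that a
whitespace split puts the request path in field 6 and the quoted referrer
in field 10. The helper name search_terms and the sample line are made up
for illustration; neither comes from the program or the real log.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return (target, query) for a log line whose
# referrer is a Google search, or an empty list otherwise.
sub search_terms {
  my ($line) = @_;
  my ($target, $referrer) = (split ' ', $line)[6,10];
  return unless defined $referrer && $referrer =~ /google\.com/;
  $referrer =~ s/"$//;                      # drop the closing quote
  my ($q) = $referrer =~ m/(?<=[&?])q=([^&]*)/;
  return unless defined $q;
  $q =~ tr/+/ /;                            # '+' encodes a space
  $q =~ s/%([0-9a-fA-F]{2})/chr hex $1/ge;  # decode %XX escapes
  return ($target, lc $q);
}

# Made-up sample line (an assumption, not taken from the real log).
my $line = '192.0.2.1 - - [27/Jun/2003:10:00:00 -0400] '
         . '"GET /yak/dirty/ HTTP/1.0" 200 1234 '
         . '"http://www.google.com/search?hl=en&q=Perl+regex%21" "Mozilla/4.0"';

my ($target, $q) = search_terms($line);
print "$target: $q\n";
```

For that sample line the sketch prints "/yak/dirty/: perl regex!". The
real program does the same thing for every line and increments
$ref{$target}{lc $q} instead of printing.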

On the other hand, the corresponding section for

        /obfuscated/
                   3 obfuscated perl code
                   2 obfuscated perl contest

suggests that those people probably got what they wanted, or something
like it.

Perusal of the output is fun and enlightening.  
