Mark Dominus on 10 Oct 2003 17:24:42 -0000



Re: Perl source filter question


Mike Cramer:
> I'm looking for something along the lines of a perl source filter that 
> will take a source file -- cgi script, perl module, whatever -- and 
> without actually *running* the program, will list all of the functions 
> called. 

In general, the only way to know what a program will 'do' is to run
it.  That's why we run programs: to find out what they will do.  If
there were a reliable way to find out the behavior of a program
without running it, then why would we ever actually run any programs?

Here's a Perl example that demonstrates one of the many potential
problems:


        use Astro::MoonPhase 'phase';
        
        $phase = phase(time);
        if (0.4 < $phase && $phase < 0.6) { whee() }
        
        sub whee { ... }


This program calls 'whee' only when the moon is full.

But we can make the problem worse:

        use Astro::MoonPhase 'phase';
        
        $phase = phase(time);
        if (0.4 < $phase && $phase < 0.6) {
          open F, "whee.pl";
          my $whee = join "", <F>;
          $whee =~ tr/A-Za-z/N-ZA-Mn-za-m/;
          eval $whee;
        }

Now if the moon is full, the program reads code from the external file
'whee.pl', decrypts it, and executes it.  At the time the main program
is compiled, you can't even know what functions might be defined by
whee.pl, since it might not exist yet, so you can't even report on
what functions *might* be called by this program.
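
To make the point concrete, here is a minimal, self-contained sketch of
the same trick without the external file or the moon.  The names
'rot13' and 'secret' are made up for illustration; the string in
$stored stands in for the contents of a file like 'whee.pl'.  A static
scan for 'sub NAME' finds nothing, yet the function exists once the
text has been decoded and passed to 'eval'.

```perl
use strict;
use warnings;

# Hypothetical helper: the same tr/// used in the example above.
sub rot13 {
    my ($text) = @_;
    $text =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    return $text;
}

# What an encoded file might contain (built here rather than read from disk).
my $plain  = 'sub secret { return 42 }';
my $stored = rot13($plain);

# A naive static scan of the stored text looks for "sub NAME".
my @visible = $stored =~ /\bsub\s+(\w+)/g;
print "static scan found: ", (@visible ? "@visible" : "nothing"), "\n";

# Only at run time does the definition come into existence.
eval rot13($stored);
die $@ if $@;
print "secret() returned: ", secret(), "\n";
```

The scan sees only the encoded text, so no amount of cleverness applied
to that text alone can name the function without doing the decoding,
which is to say, without doing part of the computation.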

Eric Roode said it was exceptionally difficult to solve this, but that
wasn't strong enough.  It is impossible.  There is a mathematical
proof that there's no solution; it's called "Rice's theorem".  But the
content of the theorem is completely unsurprising: there are no
shortcuts to computation; if you want to know the result of computing
something, you actually have to compute it.

Now, all that said, let's see what we can do for you.

There's a big technical difficulty with your question, since from
Perl's point of view, built-in functions like 'print' and 'sleep' are
not the same kind of thing as user-defined functions like 'CGI::new'
and 'CGI::header'.  And, even worse, 'use' and 'my' are not functions
at all.  It's not even clear to me why you want the output to include
'use' and 'my', but not '->' (which is known inside Perl as
'pp_method') or '==' or 'qq' or '='.

To get all the user-defined functions at compile time, we can use a
source filter that looks for 

        sub XXXXXX {

and replaces it with

        sub XXXXXX {
          $FUNCTIONS::HASH{XXXXXX} = 1;

I have code for that handy if you want to see it.
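
As a rough illustration of the rewrite described above (not the code
mentioned in the post), here is a minimal sketch.  The helper name
'instrument_subs' is an assumption, and the sketch applies the
substitution to a string and evals it, rather than installing a real
source filter.  The regex is deliberately naive: it also matches
'sub foo {' inside strings, comments, and POD, and it misses subs
declared with prototypes or attributes.

```perl
use strict;
use warnings;

# Hypothetical helper: insert a recording statement at the top of every
# named sub found in $source.
sub instrument_subs {
    my ($source) = @_;
    $source =~ s/\bsub\s+([A-Za-z_][\w:]*)\s*\{/sub $1 { \$FUNCTIONS::HASH{"$1"} = 1;/g;
    return $source;
}

# A small program to instrument, held in a string for the demonstration.
my $program = <<'END';
sub whee   { return "whee!" }
sub unused { return "never called" }
whee();
END

%FUNCTIONS::HASH = ();
my $instrumented = instrument_subs($program);
print $instrumented;

# Run the rewritten program; each sub records its name when it is entered.
eval $instrumented;
die $@ if $@;
print "recorded: $_\n" for sort keys %FUNCTIONS::HASH;
```

In a real filter the same substitution would run on the source text
before Perl compiles it, for example inside a Filter::Simple 'FILTER'
block.  Note that the name is recorded when the sub is entered, so a
sub that is defined but never called leaves no entry.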

To get all the user-defined functions at run time (but we have to run
the program) probably the best thing is to write a drop-in debugger
module that intercepts each function call and records its use in a
hash.  I wrote one of these a couple of weeks ago when I was working
on some modules and I wanted to throw away all the functions I wasn't
actually using.
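
The module mentioned above is not shown here.  A debugger module would
normally define 'DB::sub' and be loaded with 'perl -d:...'; the sketch
below swaps in a simpler technique that runs as an ordinary script: it
wraps every named sub in a package's symbol table with a closure that
counts the call.  The helper name 'record_calls_in' and the hash
'%called' are assumptions made for the example.

```perl
use strict;
use warnings;

my %called;    # fully qualified sub name => number of calls

# Hypothetical helper: wrap every named sub in $package so that each call
# is counted before control passes to the original code.
sub record_calls_in {
    my ($package) = @_;
    no strict 'refs';
    no warnings 'redefine';
    for my $name (keys %{"${package}::"}) {
        next unless $name =~ /\A[A-Za-z_]\w*\z/;    # skip nested stashes and specials
        my $full = "${package}::$name";
        next if $full eq __PACKAGE__ . '::record_calls_in';
        next unless defined &$full;                 # not every entry is a sub
        my $orig = \&$full;
        $called{$full} = 0;
        *$full = sub { $called{$full}++; goto &$orig };
    }
    return;
}

sub whee   { return "whee!" }
sub unused { return "never called" }

record_calls_in('main');
whee() for 1 .. 3;

print "$_ called $called{$_} time(s)\n" for sort keys %called;
```

Subs whose count is still zero at the end of a representative run are
the candidates for throwing away.  Unlike a debugger hook, this only
sees subs that already exist when 'record_calls_in' runs, and it does
not see anonymous subs at all.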

To get the internal operators like 'print' and 'sleep', we'd need to
do two things.  First, we need you to make a list of what counts and
what doesn't, because it isn't clear.  (Does 'not' count?  Does '-x'
count?  Perl is full of weird edge cases like this.)  Then probably
the best thing to do is to write a new B:: module that traces over the
op tree and prints out the relevant operators when it encounters them,
or else to filter the output of B::Terse or B::Concise similarly.  But
this won't give you completely correct answers, because of code like:

        my $code = 'sleep 37';
        eval $code;

The B:: modules can't see at compile time that the code will run the
'sleep' operator at run time.  The only way to find out what a program
will do when you run it is to actually run it.  If there were a
shortcut to computation, then we wouldn't actually need to compute,
and then we'd all be out of a job.
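
Here is a minimal sketch of the op-tree walk described above, assuming
it is acceptable to compile the code in question (wrapped in an
anonymous sub) without calling it.  The helper names 'collect_ops' and
'ops_in_source' are made up for the example, and the walk only follows
the first/sibling links, so it is not a complete B:: backend.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

# Hypothetical helper: count the names of all ops reachable from $op.
sub collect_ops {
    my ($op, $names) = @_;
    return $names unless $$op;                  # B::NULL marks the end of a chain
    my $name = $op->name;
    $names->{$name}++ unless $name eq 'null';   # skip ops optimized away
    if ($op->can('first') && ($op->flags & OPf_KIDS)) {
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            collect_ops($kid, $names);
        }
    }
    return $names;
}

# Hypothetical helper: compile $source inside an anonymous sub (without
# calling it) and report which ops the compiled code contains.
sub ops_in_source {
    my ($source) = @_;
    my $sub = eval "sub { $source }" or die $@;
    return collect_ops(svref_2object($sub)->ROOT, {});
}

my $ops = ops_in_source(q{ print "hello\n"; my $code = 'sleep 37'; eval $code; });
print join(' ', sort keys %$ops), "\n";
print "sleep visible at compile time? ", ($ops->{sleep} ? "yes" : "no"), "\n";
```

The walk finds 'print' and the 'entereval' op for the string eval, but
not 'sleep', because the 'sleep' op does not exist until the string is
compiled at run time.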

I hope this was more helpful than frustrating.  It would help to know
more about what you are actually trying to accomplish and why.