Gay, Jerry on 12 Jan 2004 14:52:17 -0000


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

RE: What use is the Schwartzian Transform?


> The Schwartzian Transform is a method for speeding this up, by
> reducing the number of redundant -M operations:
> 
>         # Code B 
>         my @filenames = readdir D;
>         @sorted = map { $_->[0] }
>                   sort { $b->[1] <=> $a->[1] } 
>                   map { [$_, -M $_] } @filenames;
>         
and it's a perl idiom. you can speed up your code, while expecting a
reasonable number of perl programmers to understand this without much more
than a glance at the "map-sort-map" structure. therefore, you've increased
efficiency of your code, without degrading readability.

>         # Code C
>         my @filenames = readdir D;
>         my %date;
>         $date{$_} = -M $_ for @filenames;
>         @sorted = sort { $date{$b} <=> $date{$a} } @filenames;
>         undef %date;
> 
> Compared with B, code C has the benefit of being much simpler and
> easier to understand.  I think it's an obvious, natural way to solve
> the problem, and that B isn't.
> 
[snip]
> So here's the question:  why do we bother with B at all?  Why not
> always use C?
> -
why use C? it's more verbose, and for me, it's not as easy to read as the
idiom i've used many times. hey, and you say it's slower, so why bother? i
use the schwartz whenever i need a simple and maintainable custom sort
operation and i don't care about performance. 

if i need speed, however, i'll use the built-in sort, which is much speedier
than a custom sort. for example:

    # Code E
    my @filenames= readdir D;
    my @sorted= @filenames[
        map { unpack 'N' => substr $_, -4 }
        sort
        map { ( -M $filenames[$_] ) . pack 'N', $_ }
        0 .. $#filenames
    ];

i know this *looks* a lot like the schwartzian transform (ST) to those just
introduced to the idiom, but it has several important differences that make
it faster in most cases:
* this method pre-computes the sort keys, like the schwartzian transform,
but does not have the overhead of array-ref/hash creation per key, as it
packs both the sort key and the data into a scalar (saving memory, and time)
* it uses the system sort, which is highly optimized on all systems
* it's a stable sort, meaning records with the same key are kept in the same
order they started in (this may or may not be important, depending on need)

see tye mcqueen's post describing this technique at
http://perlmonks.org/index.pl?node_id=287149


--jerry



************************************************************************** 
This e-mail and any files transmitted with it may contain privileged or 
confidential information. It is solely for use by the individual for whom 
it is intended, even if addressed incorrectly. If you received this e-mail 
in error, please notify the sender; do not disclose, copy, distribute, or 
take any action in reliance on the contents of this information; and delete 
it from your system. Any other use of this e-mail is prohibited. Thank you 
for your compliance.



-
**Majordomo list services provided by PANIX <URL:http://www.panix.com>**
**To Unsubscribe, send "unsubscribe phl" to majordomo@lists.pm.org**