mjd-perl-pm on 11 Jan 2004 00:06:40 -0000 |
Consider the following code: # Code A my @filenames = readdir D; @sorted = sort { -M $b <=> -M $a } @filenames; This is the obvious, natural way to sort a list of filenames into order by last modification date. For long lists, it can be slow, because the -M operation is done many times for each file. The Schwartzian Transform is a method for speeding this up, by reducing the number of redundant -M operations: # Code B my @filenames = readdir D; @sorted = map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { [$_, -M $_] } @filenames; I was teaching this technique in a class last month, and one of the students asked why one wouldn't simply use this: # Code C my @filenames = readdir D; my %date; $date{$_} = -M $_ for @filenames; @sorted = sort { $date{$b} <=> $date{$a} } @filenames; undef %date; Compared with B, code C has the benefit of being much simpler and easier to understand. I think it's an obvious, natural way to solve the problem, and that B isn't. The obvious drawback of C compared with B is that it's fractionally slower; about 5%. I don't think the 5% is significant, and I don't think other people think it's significant either. As evidence of this, I note that people don't usually suggest that B be replaced with the following variation, which is somewhat faster than B: # Code D my @filenames = readdir D; @sorted = map $_->[0], sort { $b->[1] <=> $a->[1] } map [$_, -M $_], @filenames; So here's the question: why do we bother with B at all? Why not always use C? - **Majordomo list services provided by PANIX <URL:http://www.panix.com>** **To Unsubscribe, send "unsubscribe phl" to majordomo@lists.pm.org**
|
|