Jeff Abrahamson on 26 Feb 2004 04:31:02 -0000
On Wed, Feb 25, 2004 at 03:33:28PM -0500, Paul.L.Snyder@gsk.com wrote:
> [68 lines, 393 words, 2399 characters]  Top characters: etoalish
>
> "gabriel rosenkoetter" <gr@eclipsed.net> wrote on 02/25/2004 08:23:23 AM:
>
> > On Wed, Feb 25, 2004 at 01:52:57AM -0500, Paul wrote:
> > > Where are the images stored? Can't you tell wget to just grab *.JPG files?
> > How would you go about finding out the urls for the "*.JPG files"?
> > Note that wget explicitly does NOT parse the files it retrieves.
>
> ("Paul", above, is <gyoza@comcast.net>, not me.)
>
> Here's a four-line perl filter that will de-javascript
> a .mac photo album web page piped to it.
>
> I googled for a random .mac photo album, and found the following
> link, which I used to test the script:
>
>     http://homepage.mac.com/toj/PhotoAlbum6.html
>
> Here's the de.mac filter:

Thanks, Paul, that's excellent.

Of course, I couldn't just use it as is; I ended up tweaking and
fiddling to get just what I wanted. I now have the attached script,
which admittedly bears little resemblance to the fine script you
presented, but does just what I want.

-- 
 Jeff

 Jeff Abrahamson  <http://www.purple.com/jeff/>
 GPG fingerprint: 1A1A BA95 D082 A558 A276  63C6 16BF 8C4C 0D1D AE4B

#!/bin/sh

copyright="
de.mac, Copyright (C) 2004 Jeff Abrahamson

de.mac comes with ABSOLUTELY NO WARRANTY; in particular:

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
02111-1307, USA.

See also <http://www.fsf.org/copyleft/gpl.html>

Usage: de.mac URL [directory]
"

# Grab large images from a .Mac page
# First argument is URL of .Mac web page
# Second argument is directory in which to store them

url=$1
if [ "X$url" = "X" ]; then
    # Quote the variable so the notice keeps its line breaks
    echo "$copyright"
    exit
fi

dir=$2
if [ "X$dir" = "X" ]; then
    echo No directory specified, using current directory.
    dir=.
fi

mkdir -p "$dir"
# Pull the first quoted argument of each 'new Slide(...)' call,
# drop duplicates, and hand the resulting URLs to wget
curl -s "$url" | grep 'new Slide' | \
    perl -pwe "s/^.*new Slide\('//; s/'.*$//;" | sort -u | \
    xargs wget --quiet --no-directories --no-host-directories \
        --directory-prefix="$dir"

Attachment: signature.asc
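
For anyone who wants to see what the extraction step in the script above does without touching the network, here is a minimal sketch that runs it on made-up input. The sample lines, the `slides[...]` variable name, the example.com URLs, and the `extract_urls` helper are all assumptions for illustration; the real .Mac page markup may differ. It also uses sed in place of the perl one-liner, applying the same two substitutions.

```shell
#!/bin/sh
# Hypothetical sample of page source (NOT copied from a real .Mac album).
sample="slides[0] = new Slide('http://example.com/album/IMG_0001.jpg', 'caption');
<p>some unrelated line</p>
slides[1] = new Slide('http://example.com/album/IMG_0002.jpg', 'caption');
slides[2] = new Slide('http://example.com/album/IMG_0001.jpg', 'again');"

# Hypothetical helper: keep only 'new Slide' lines, strip everything up to
# and including the opening quote, strip from the closing quote onward,
# then remove duplicate URLs.
extract_urls() {
    grep 'new Slide' | sed -e "s/^.*new Slide('//" -e "s/'.*//" | sort -u
}

urls=$(printf '%s\n' "$sample" | extract_urls)
printf '%s\n' "$urls"
```

In the real pipeline, this list of URLs is what gets passed to `xargs wget`.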