McLinux on 23 Mar 2004 05:43:02 -0000



Re: [PLUG] archiving websites


Try out HTTrack. Cool stuff, but it won't help you with generating the dynamic pages themselves -- it just saves static snapshots of what the server sends back.
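For reference, a typical invocation looks something like this (the URL and output directory are placeholders, and it's worth double-checking the option names against man httrack for your version):

   httrack "http://www.example.edu/" -O ./site-archive -r3 "-*.jpg" "-*.gif"

Here -O sets the output directory, -r3 caps the crawl depth at three links, and the "-*.jpg" / "-*.gif" filters skip images, which lines up with what you asked for.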

HTH.

-McLinux
Jeff Abrahamson wrote:

On Sun, Mar 21, 2004 at 09:35:08PM -0500, M. Jackson Wilkinson wrote:



Hey everyone,

The college for which I work is redesigning their website, and in the process wants to archive their current site for posterity's sake. Since all of the pages are dynamically-generated, it doesn't make much sense from an archival standpoint to simply copy the web tree to disk, and we want to find a way to archive the site as it's generated.

Have any of you been in a similar situation and found a solution? We want something flexible, so we can say "start at this URL and go 3 levels deep, but don't archive jpgs and gifs" and modify those parameters as appropriate.

Heritrix looks like it could be a step in the right direction, but it clearly isn't ready yet...



man wget

I do this often enough that I made an alias:

   jeff@asterix:jeff $ type mirror
   mirror is a function
   mirror ()
   {
	echo fast mirror;
	wget --no-host-directories --convert-links --mirror --no-parent $1
   }
   jeff@asterix:jeff $
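To match the original request (start at a URL, go three levels deep, skip jpgs and gifs), something along these lines should do it -- the URL is a placeholder, and see man wget for the --level and --reject details:

   wget --no-host-directories --convert-links --no-parent \
        --recursive --level=3 --reject=jpg,jpeg,gif,JPG,GIF \
        http://www.example.edu/

Note that --mirror implies an unlimited recursion depth, so swap it for --recursive with an explicit --level=N when you want to cap how deep the crawl goes.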




___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug