Stephen Gran on 22 Nov 2008 11:13:42 -0800
On Sat, Nov 22, 2008 at 01:32:17PM -0500, Art Alexion said:
> I have a directory with a lot of files, a number of which are identical,
> except for filename. What is the most efficient way to find (and
> ultimately delete) the duplicates?

Offhand, I'd say write a little script that builds a data structure like:

    files = {
        hash_of_some_sort_A => (file1, file2, file3),
        hash_of_some_sort_B => (file4, file5),
    }

and so on, where hash_of_some_sort is an md5sum, sha1sum, or a mashup of
size/date-stamp/whatever it is you are calling 'identical' in this
context. Then iterate over the structure: for each key, pop the first
member off the array and set it aside to keep, then delete what's left.

It looks like 10 minutes' work in perl, but if you need help, give a
shout.
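A rough, untested sketch of what I mean, assuming MD5 is close enough to
'identical' for your files and that the directory isn't nested. It only
prints the duplicates; nothing gets deleted until you uncomment the
unlink:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Digest::MD5;

    my $dir = shift || '.';
    opendir(my $dh, $dir) or die "can't open $dir: $!";

    my %files;    # digest => [ list of paths with that digest ]
    for my $name (readdir $dh) {
        my $path = "$dir/$name";
        next unless -f $path;    # skips . and .. and subdirectories
        open(my $fh, '<', $path) or die "can't read $path: $!";
        binmode($fh);
        push @{ $files{ Digest::MD5->new->addfile($fh)->hexdigest } }, $path;
        close($fh);
    }
    closedir($dh);

    for my $paths (values %files) {
        next if @$paths < 2;         # unique file, nothing to do
        my $keep = shift @$paths;    # first member of the group gets kept
        print "keeping $keep\n";
        print "  dup: $_\n" for @$paths;
        # unlink @$paths;            # uncomment to really delete
    }

Digest::SHA drops in the same way if you'd rather hash with SHA-1, and
comparing file sizes before hashing would save a lot of I/O on a big
directory, since only same-sized files can be identical.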
-- 
--------------------------------------------------------------------------
| Stephen Gran                   | Let's not complicate our relationship |
| steve@lobefin.net              | by trying to communicate with each    |
| http://www.lobefin.net/~steve  | other.                                |
--------------------------------------------------------------------------