Art Alexion on 22 Nov 2008 18:16:52 -0800
On Sat, Nov 22, 2008 at 3:06 PM, JP Vossen <jp@jpsdomain.org> wrote:
> > Date: Sat, 22 Nov 2008 13:32:17 -0500
> > From: Art Alexion <art.alexion@gmail.com>
> >
> > I have a directory with a lot of files, a number of which are
> > identical, except for filename. What is the most efficient way to
> > find (and ultimately delete) the duplicates?
>
> How about this? TEST, TEST, TEST first!
>
> # Assumes a recent version of bash [for nested $()]
> # BACKUP, then capture md5 [1] hashes (don't put the output file in your
> # CWD or you may recurse!)

This is kubuntu 8.04, which I understand uses dash instead of bash.
Should I be OK?

> ~~~~~~~~~~~~~~~~~~~~~~
> Interesting commands:
>
> * cut: -d' ' uses space as the delimiter, -f3- for fields 3 to the end
> * uniq: -d shows only duplicated lines (hashes)
> * tail: -n+2 starts at line 2 and goes to the end (i.e., skips line 1)
> * $(): sub-shell, legacy as backticks ``, but those are harder to read
>   and not nestable. I've nested here.
> * for...done: Takes each hash and greps for it, then gives you just the
>   file part
>
> This is a good one, I'll add it to the second edition of the _bash
> Cookbook_, if/when. Let me know how you make out.

I'll give it a try. If things go awry, the backed-up copy of the
directory that the first couple of steps create should ensure I am OK.

> Later,
> JP

Thanks.

--
artAlexion

sent unsigned from webmail interface
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
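
JP's script itself was trimmed from the quote above; only its leading
comments survive. Based on the commands he lists (md5sum output cut on
spaces, sort | uniq -d over the hashes, a for loop that greps each
duplicated hash, and tail -n+2 to skip the one copy you keep), a sketch of
the approach might look like the following. The paths /path/to/dir,
/tmp/md5list, and /tmp/dupes-to-delete are illustrative, not from the
original message; as JP says, TEST on a copy first.

~~~~~~~~~~~~~~~~~~~~~~
# Sketch only -- reconstructs the approach from the commands listed above.
# Assumes GNU coreutils (md5sum, cut, sort, uniq, tail, grep) and filenames
# without embedded newlines. All paths below are made up for illustration.

cp -a /path/to/dir /path/to/dir.bak    # BACKUP first
cd /path/to/dir
md5sum ./* > /tmp/md5list              # keep the hash list out of the CWD

# For every hash that occurs more than once, print all matching files
# except the first one (tail -n+2), i.e. the candidates for deletion.
for hash in $(cut -d' ' -f1 /tmp/md5list | sort | uniq -d); do
    grep "^$hash" /tmp/md5list | tail -n+2 | cut -d' ' -f3-
done > /tmp/dupes-to-delete

# Review /tmp/dupes-to-delete by hand, then, if it looks right:
# while read -r f; do rm -v -- "$f"; done < /tmp/dupes-to-delete
~~~~~~~~~~~~~~~~~~~~~~

On the dash question: nested $() is POSIX and works in dash as well, and on
(K)ubuntu /bin/sh points to dash while the interactive shell is still bash,
so running the pipeline interactively or under a #!/bin/bash shebang should
be fine.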