JP Vossen on 23 Oct 2012 13:00:25 -0700



Re: [PLUG] Perl one-liner to remove duplicates without changing file order


Date: Mon, 22 Oct 2012 17:15:00 -0400
From: Frank Szczerba <frank@szczerba.net>

I'm way behind on emails, so apologies if someone else has already
mentioned it, but the -i doesn't work because you are printing your
output in an END block. If you give perl -i multiple files to process,
it will edit each one in place separately, but the END block doesn't run
until after all files have been processed.

You can make this work with -i by doing:

$ perl -i -ne '$line{$_} = $.; if (eof) { for (sort{$line{$a}<=>$line{$b}} keys %line) {print} }'

That's awesome! I knew there had to be a pure-Perl way to do it, but I wasn't seeing it.


If you really are processing multiple files, you probably also want
to clear %line in between files:

$ perl -i -ne '$line{$_} = $.; if (eof) { for (sort{$line{$a}<=>$line{$b}} keys %line) {print} %line = () }'

I wasn't, but better safe than sorry! I tested, and the first form *does* mangle your input files when you give it more than one. Because the hash is never cleared between files, each file after the first also gets the lines carried over from the files before it.

Thanks!
JP
----------------------------|:::======|-------------------------------
JP Vossen, CISSP            |:::======|      http://bashcookbook.com/
My Account, My Opinions     |=========|      http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug