Walt Mankowski on 31 Aug 2010 11:46:00 -0700


Re: [PLUG] Sed or awk help?!


On Tue, Aug 31, 2010 at 01:49:00PM -0400, Daniel.Roberts@sanofi-aventis.com wrote:
> Actually..
> Here is a small snippet of the huge file that I am trying to process
> 
> Note that the filename is always in the fifth column of a tab delimited
> file...and the filename itself could contain periods....but it will
> always end in the ".CEL" filename extension which is what I am looking
> to remove...
> The filename has many different conventions, it may contain any
> combination of numbers and letters, but always ends in a .CEL file name
> extension...
> So If I could re-write the same file w/o the .CEL extensions that would
> be great!
> Dan
> 
> 
> 
> 10	3EDD188D-91D3-4104-8992-E12D4B5F4785	3242
> AFFY_LIMS_DATA_OLD	012799Kas19KA85305_26.CEL
> \\DGMappafs01\archivedata\1999
> 11	3EDD188D-91D3-4104-8992-E12D4B5F4785	3243
> AFFY_LIMS_DATA_OLD	012799Kas19KA85305_33.CEL
> \\DGMappafs01\archivedata\1999

I'll assume those lines wrapped and what's above is really just two
lines, one beginning with 10 and one with 11.  If so, here's one way
to do it in Perl (-a autosplits each line into @F, so the fifth column
is $F[4]):

perl -ane '$F[4] =~ s/\.CEL$//; print join "\t", @F; print "\n"' oldfile.txt >newfile.txt

If you have Perl 5.10 or later, you can use -E and say to shorten that to

perl -anE '$F[4] =~ s/\.CEL$//; say join "\t", @F' oldfile.txt >newfile.txt

Walt

___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug