JP Vossen on 1 Feb 2009 15:47:08 -0800



Re: [PLUG] date formatting in a shell script


 > Date: Sat, 31 Jan 2009 13:11:15 -0500
 > From: Chad V <csv@gamebox.net>

[...]
 > INFO 	20080501043217440	00000000000000000005	....
 > INFO 	20080501043219725	00000000000000000006	....

 > The date looks like it includes:  YEAR MONTH DAY HOUR MINUTE SECOND
 > MILLISECOND (last 3 digits).
[...]
 > Any ideas on how to get the string into the format I want?  I don't
 > want to do simple substitution (i.e. 2008  =  2008-, 05 = 05-) because
 > it should be generic enough to work on any date string in this screwy
 > format.

I saw the other answers for this but thought I'd throw this one out 
there anyway.  I usually use Perl regular expressions for that kind of 
thing.  Sure, you can use 'awk' or (sometimes) 'cut', but I just find 
Perl easier to read (and how scary is that?!?).

Here is a trivial and slightly inefficient example you can copy&paste 
and play with:

$ echo 20080501043217440 | perl -pe \
's/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})(\d{3})/$1-$2-$3T$4:$5:$6.$7/;'
2008-05-01T04:32:17.440

It's inefficient because the expression isn't anchored.  Something like 
'^INFO\s+(\d{4})(\d{2})...' might work, if all the lines are the same 
(you'd also have to capture the leading 'INFO' and put it back in the 
replacement, or it gets deleted).  On the other hand, the unanchored 
version will get the date string anywhere in the line (actually the 
first run of 17 digits).

The use of s///; instead of m//; is important.  If you try to match 
(m//;) and print, any lines that fail your match will not be printed. 
That's usually bad since you will silently corrupt your data.  Using 
s///; avoids that since you get the substitution you want for any lines 
that match, while any (odd/unexpected) lines that don't match are 
printed as-is (not changed, but not missing either).  (This also needs 
the perl -p switch, as opposed to the -n you might use in a match&print 
one-liner.)  So if you get output lines that aren't "fixed" you can 
notice it and fix it, as opposed to having them silently dropped.

That is a bit of a subtle point, but it's important; I hope I explained 
it clearly.  Regular expressions and efficiency are a bit out-of-scope 
for this message.  Get _Mastering Regular Expressions_, 2nd (or 3rd?) 
edition, some aspirin, and the old version of "The Regex Coach" that 
runs on Linux (or try the newer one in Wine, but I haven't gotten it 
working yet).

I know you already solved this, so FYI,
JP
----------------------------|:::======|-------------------------------
JP Vossen, CISSP            |:::======|      http://bashcookbook.com/
My Account, My Opinions     |=========|      http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.