> Date: Fri, 10 Oct 2008 16:53:51 -0400
> From: "Michael Lazin" <microlaser@gmail.com>
>
> What I am doing is picking up the IP addresses, my problem seems to be
> finding a "||" immediately following the IP address. What I tried
> seems to be picking up a | followed by a string and another | followed
> by a string. What I am looking for is [IP address][||]
Other folks already mentioned the various problems with the IPA part of
the pattern, and suggested [[:digit:]] or [0-9] solutions, so I'll skip
that. (OK, I lied, I won't. You have been warned.)
If I understand, you've got a file with '|' as a delimiter and want to
find an IPA with a *single* '|' on each end? So you have two problems,
a) how to find an IPA (kinda solved) and b) how not to find an IPA
ending with '||'.
If that's right, how about this:
$ cat grepme.txt
foo|bar|10.10.10.10|baz|match
abc|efg|10.10.10.20||No match
$ egrep '\|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\|[^|]' grepme.txt
foo|bar|10.10.10.10|baz|match
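In PCRE (Perl Compatible Regular Expression) terms, you want a negative
look-ahead to find /|/ but not /||/, but look-ahead isn't part of the
POSIX extended regular expressions that egrep uses (GNU grep -P offers
it where PCRE support is compiled in). However you can fake it with a
negated character class, which is what I did. So [^|] means any
character that isn't a pipe. (The [!|] spelling is the shell-glob
equivalent; inside a regex it would match either a '!' or a '|'.)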
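One limitation of the [^|] trick is that it has to match an actual character, so an IPA whose closing pipe is the last thing on the line is missed. A minimal sketch of one way around that, using grep -E (the same as egrep); the sample data and the variable names `data`, `ip`, `strict` and `relaxed` are made up for illustration:

```shell
# Hypothetical sample data; the last line ends right after a single pipe.
data='foo|bar|10.10.10.10|baz|match
abc|efg|10.10.10.20||No match
xyz|abc|10.10.10.30|'

# Loose dotted-quad pattern, same as the one used above.
ip='[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

# Original form: a non-pipe character must follow the closing pipe.
strict=$(printf '%s\n' "$data" | grep -E "\|$ip\|[^|]")

# Relaxed form: a non-pipe character OR end of line may follow.
relaxed=$(printf '%s\n' "$data" | grep -E "\|$ip\|([^|]|\$)")

printf 'strict:\n%s\nrelaxed:\n%s\n' "$strict" "$relaxed"
```

The strict form finds only the 10.10.10.10 line; the relaxed form also finds the 10.10.10.30 line, and neither finds the '||' line.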
The definitive book for regular expressions is _Mastering Regular
Expressions_, 3rd edition (AKA MRE,
http://oreilly.com/catalog/9780596528126/index.html), which is an
incredible book. But it's also incredibly dense; bring aspirin. :-)
So, from my copy of MRE2, page 189, here is the regex that only matches
an IPA. This may well be vast overkill for your need, since the
patterns above are usually Good Enough. I throw it in here to
demonstrate how good and how dense MRE is.
In PCRE it is:
([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])
In egrep it is (unreadably):
$ egrep \
'([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\|[^|]' \
grepme.txt
foo|bar|10.10.10.10|baz|match
Still with me? :-) Be darn sure you comment the heck out of your code!
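As one way to make the long pattern commentable, it can be built up from a shell variable holding a single octet. This is only a sketch; the names `octet`, `strict_ip`, `loose_ip` and the sample data are my own, and it adds a leading pipe to the strict pattern so that both patterns are anchored the same way:

```shell
# One octet, 0-255 (leading zeros allowed, as in the long pattern above).
octet='([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])'

# Four octets joined by literal dots.
strict_ip="$octet\.$octet\.$octet\.$octet"

# The "Good Enough" pattern: any 1-3 digits per field.
loose_ip='[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

# Hypothetical data: the second line holds an impossible address.
data='foo|bar|10.10.10.10|baz|match
foo|bar|999.10.10.10|baz|bogus'

# Count matching lines: pipe, address, pipe, then a non-pipe character.
loose_hits=$(printf '%s\n' "$data" | grep -c -E "\|$loose_ip\|[^|]")
strict_hits=$(printf '%s\n' "$data" | grep -c -E "\|$strict_ip\|[^|]")

echo "loose: $loose_hits  strict: $strict_hits"
```

The loose pattern accepts both lines, while the strict one rejects 999.10.10.10.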
Just for fun, the GNU grep -o switch:
$ egrep -o '\|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\|[^|]' grepme.txt
|10.10.10.10|b
$ egrep -o '\|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\|[^|]' \
grepme.txt | cut -d'|' -f2
10.10.10.10
Finally, on re-reading, I think maybe I got it wrong and you *want* only
the double pipe. In that case:
$ egrep '\|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\|\|' grepme.txt
abc|efg|10.10.10.20||No match
Bonus, if you want to sort the resulting list of IPAs, I cover that in
recipe 8.3 of the _bash Cookbook_:
... | sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n
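To see what that sort does, here is a small sketch with made-up addresses (the variable name `sorted` is just for illustration). Each -k N,Nn key sorts one dot-separated field numerically, so 3 comes before 20 and 9 comes before 10:

```shell
# Hypothetical addresses, deliberately out of order.
sorted=$(printf '%s\n' 10.10.10.20 10.10.10.3 10.9.1.1 9.200.0.1 |
  sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

printf '%s\n' "$sorted"
# 9.200.0.1
# 10.9.1.1
# 10.10.10.3
# 10.10.10.20
```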
HTH,
JP
PS--No one talked about anchors, which can *vastly* speed up execution
time. But I don't have enough info about your data file to tell if they
are useful or not. Briefly, ^ is start of line (^foo), $ is end of line
(bar$) and there are lots of others. Search the docs for 'anchor'.
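For instance, if the IPA were always the first field on the line (an assumption on my part; I don't know your data), the pattern could be anchored with ^ so that grep doesn't hunt for it anywhere else. A sketch with made-up data and variable names:

```shell
ip='[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

# Hypothetical data: only the first line STARTS with an address.
data='10.10.10.10|baz|match
foo|10.10.10.20|bar'

# ^ pins the match to the start of the line.
anchored=$(printf '%s\n' "$data" | grep -E "^$ip\|[^|]")

printf '%s\n' "$anchored"
```

Only the first line is printed, because the address in the second line isn't at the start.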
PPS--I'm sorry if attempting to read this on a Friday evening makes
anyone's head explode. Don't blame me, I didn't invent regular
expressions. I just use 'em every day. :-)
----------------------------|:::======|-------------------------------
JP Vossen, CISSP |:::======| jp{at}jpsdomain{dot}org
My Account, My Opinions |=========| http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug