James E Keenan on 17 Nov 2016 19:12:33 -0800
Re: [Philadelphia-pm] selective splitting?
On 11/17/2016 08:40 PM, Morgan Jones wrote:
mjd’s talk Monday has me thinking about peer review and how helpful it can be. So here goes. I can certainly work around this, but as a learning experience I’m wondering if someone has a straightforward answer.

Can I split on only those instances of a character that are not surrounded by, in this case, parentheses? I have a semicolon-separated string that contains a date, a string, an IP address and a user agent string. The catch is that the user agent string contains a semicolon; however, it’s between parentheses. So what I want is to split on semicolons that are not surrounded by parentheses. For example:

    $v = '20161116172606Z;accepted-terms-of-use via CAS;192.168.1.5;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14';

It seems to me I should be able to split like this:

    my ($date, $ignore, $ip, $agent) = split /[^\(]+[^\;]*\;[^\)]*[^\)]+/, $v;

From a little reading I may need to use lookaheads, which are new to me. Here’s an attempt at that, which is of course not working:

    my ($date, $ignore, $ip, $agent) = split /(?<!() \; (?!))/x, $v;

Does anyone have a suggestion or see what I’m missing?
I often find that trying to write the One Regex to Rule Them All is a time-consuming experience that leads to unreadable code. Sometimes partially processing a string with one regex and then completing the processing with a second is more readable, even if not, in principle, as fast.
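For what it's worth, here is a minimal sketch of that two-pass idea applied to the sample string. It assumes the parentheses are never nested and that a NUL byte never occurs in the data; the `$masked` and `$g` variables and the choice of NUL as a placeholder are illustrative only, not anything from the original post.

```perl
use strict;
use warnings;

my $v = '20161116172606Z;accepted-terms-of-use via CAS;192.168.1.5;'
      . 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit/602.2.14 '
      . '(KHTML, like Gecko) Version/10.0.1 Safari/602.2.14';

# Pass 1: inside each (...) group, swap semicolons for a placeholder.
# Assumption: groups are not nested and "\x00" never appears in the input.
(my $masked = $v) =~ s{(\([^()]*\))}{ (my $g = $1) =~ tr/;/\x00/; $g }ge;

# Pass 2: split on the semicolons that remain...
my ($date, $ignore, $ip, $agent) = split /;/, $masked;

# ...then put the protected semicolons back in every field.
tr/\x00/;/ for $date, $ignore, $ip, $agent;

print "$_\n" for $date, $ignore, $ip, $agent;
```

Neither pattern is clever on its own, which is rather the point: each pass does one thing, and the no-nesting assumption is visible in the first regex instead of buried in a lookaround.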
How far does this get you?

    my ($date, $ignore, $ip, $agent) = split /;/, $v, 4;

Thank you very much.
Jim Keenan
_______________________________________________
Philadelphia-pm mailing list
Philadelphia-pm@pm.org
http://mail.pm.org/mailman/listinfo/philadelphia-pm