----- Forwarded message -----
Date: Mon, 26 Jun 2000 16:10:54 -0400 (EDT)
From: "Schuyler D. Erle" <sderle@asi.alphaaccess.net>
To: phl@lists.pm.org
Subject: Re: regex question
> Notice that in the last three lines, <BR> tags appear embedded within
> another HTML tag. I need a regular expression that will read in each
> line, and if the <BR> is within a larger <...> string, the <BR> will be
> removed. Otherwise, the line will be left alone.
>
> I can't quite pull this off (so far). Does anyone have a suggestion?
How about:
perl -pi -e 's/(<[^<>]+)<BR>([^<>]+>)/$1$2/gios' broken.html
The first capture group matches the start of the surrounding tag (which
begins with < and contains one or more characters other than < or >).
Then <BR> is matched. Finally, the second capture group matches all
non-angle-bracket chars up to and including the closing >. The
substitution just concatenates the two halves of the surrounding tag,
minus the <BR>.
SDE
----- End forwarded message -----
**Majordomo list services provided by PANIX <URL:http://www.panix.com>**
**To Unsubscribe, send "unsubscribe phl" to majordomo@lists.pm.org**