Adam Turoff on Mon, 26 Jun 2000 16:13:36 -0400 (EDT)
----- Forwarded message -----
Date: Mon, 26 Jun 2000 16:10:54 -0400 (EDT)
From: "Schuyler D. Erle" <sderle@asi.alphaaccess.net>
To: phl@lists.pm.org
Subject: Re: regex question

> Notice that in the last three lines, <BR> tags appear embedded within
> another HTML tag. I need a regular expression that will read in each
> line, and if the <BR> is within a larger <...> string, the <BR> will be
> removed. Otherwise, the line will be left alone.
>
> I can't quite pull this off (so far). Does anyone have a suggestion?

How about:

    perl -pi -e 's/(<[^<>]+)<BR>([^<>]+>)/$1$2/gios' broken.html

The first backreference captures the start of the surrounding tag (which
begins with < and contains one or more of anything other than < or >).
Then <BR> is matched. Finally, the second back reference captures all
non-angle-bracket chars up to the closing >. The substitution just
concatenates the two halves of the surrounding tag minus the <BR>.

SDE

----- End forwarded message -----