Re: [PLUG] Looking for a poke in the right sed/awk/regex direction

Doug Stewart on 28 Oct 2009 15:31:55 -0700

From: Doug Stewart <zamoose@gmail.com>

To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>

Subject: Re: [PLUG] Looking for a poke in the right sed/awk/regex direction

Date: Wed, 28 Oct 2009 18:31:48 -0400

Reply-to: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>

Sender: plug-bounces@lists.phillylinux.org

On 10/26/09, JP Vossen <jp@jpsdomain.org> wrote:
> To replace all "word\nword" (newlines) with "word word" (space), try:
> perl -0777 -pe 's/(\w+)\n(\w+)/$1 $2/g' bad_file > good_file

JP:
Yours was the closest answer. The secret was in the multi-line /m flag. I passed the text through this:

perl -0777 -pe 's/\n(\S+)/$1 $2/gm' bad_file > good_file

...and it resulted in a much, MUCH cleaner file.

Note that I removed the requirement for a \w word match to begin the expression and subbed in a \S for the second \w. A line could conceivably start with a number or a quote, so matching non-whitespace instead of word characters, together with the multi-line flag, got me to almost-data-processing-Nirvana.

There's still a little manual clean-up (and I'm going to want to trim leading whitespace off all lines -- a trivial task now), but by and large, after much experimentation (and cursing at http://regexr.com), I've got the file into a workable format.

Thanks much!

-- 
-Doug
http://literalbarrage.org/blog/
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug

Follow-Ups:

Re: [PLUG] Looking for a poke in the right sed/awk/regex direction
From: Randall A Sindlinger <rsindlin+plug@seas.upenn.edu>

References:

Re: [PLUG] Looking for a poke in the right sed/awk/regex direction
From: JP Vossen <jp@jpsdomain.org>
