[PLUG] Looking for a poke in the right sed/awk/regex direction

Doug Stewart on 25 Oct 2009 07:07:45 -0700


From: Doug Stewart <zamoose@gmail.com>

To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>

Subject: [PLUG] Looking for a poke in the right sed/awk/regex direction

Date: Sun, 25 Oct 2009 10:07:19 -0400

Reply-to: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>

Sender: plug-bounces@lists.phillylinux.org

Howdy all,
I've got a flat text file containing a lot of text that was copied from a PDF. Unfortunately, the copy retained the PDF's line breaks, so the text file is full of unnecessary line breaks that wreck the formatting whenever you change the dimensions of the viewport.

So what I need is a little sed/awk/regex magic that will search the text file for all unnecessary line breaks and strip them out. You can identify the unnecessary line breaks as follows:
1) Proper line breaks are followed by a space at the beginning of the next line, e.g.
" The quick brown fox"
2) Improper line breaks have no space at the beginning, e.g.
"jumps over the lazy dog"

So, I need to
1) Detect all occurrences of lines with a leading space that
2) Are followed by a line with NO leading space and
3) Delete the line break between the two, essentially merging the two lines.
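
To make the three steps above concrete, here is a rough sketch in Python rather than sed/awk. It is an illustration of the logic only, not a tested solution. The function name merge_wrapped_lines is made up for this example, and it makes three assumptions beyond what is described above: the merge is applied repeatedly, so a paragraph wrapped across several lines collapses into one; a single space is inserted at each join; and blank lines are left alone.

```python
def merge_wrapped_lines(text):
    """Join each line that has no leading space onto the line before it."""
    merged = []
    for line in text.splitlines():
        # A non-empty line with no leading whitespace is treated as the
        # continuation of the previous line (an "improper" line break).
        is_continuation = bool(line) and not line[0].isspace()
        # Assumption: never merge onto a blank line; leave such lines as-is.
        if merged and merged[-1].strip() and is_continuation:
            # Assumption: the copy dropped the space at the wrap point,
            # so put a single space back between the two pieces.
            merged[-1] = merged[-1].rstrip() + " " + line
        else:
            merged.append(line)
    # Note: a trailing newline in the input is not preserved.
    return "\n".join(merged)


sample = (
    " The quick brown fox\n"
    "jumps over the lazy dog\n"
    " A second paragraph\n"
    "that wraps across\n"
    "three lines"
)
print(merge_wrapped_lines(sample))
```

On the sample this should print two lines, each still starting with its leading space.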

Any ideas?

--
-Doug
http://literalbarrage.org/blog/

___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug

