John Von Essen on 18 Jan 2006 04:12:28 -0000

Re: [PLUG] Perl question...

If a single rule cannot describe the line structure, you will have to use regular expressions in Perl to pull out the fields, using whatever logic surrounds them. For example, if the columns were delimited by tabs or by two or more whitespace characters, and the text fields contained only single spaces, you could do:

if ($_ =~ m/(\d+)(\s\s+|\t+)(\d+)(\s\s+|\t+)(\d+)(\s\s+|\t+)(.+?)(\s\s+|\t+)(.+?)(\s\s+|\t+)(.+)/) {
    # Odd-numbered groups are the fields; even-numbered groups are the delimiters.
    $first_entry = $1;
    $second_entry = $3;
    $third_entry = $5;
    $fourth_entry = $7;
    $fifth_entry = $9;
    $sixth_entry = $11;
    print "$sixth_entry\n";
}
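
Under the same assumption (columns separated by tabs or by two or more whitespace characters, text fields containing only single spaces), the same rule can also be written as a split on the delimiter instead of one long match. This is only a sketch: parse_line and the sample line are made up for illustration and are not from the original file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one line into fields, treating a tab (with any
# whitespace around it) or a run of two or more whitespace characters as the
# column delimiter. Single spaces inside a text field are left alone.
sub parse_line {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;    # drop leading/trailing padding and the newline
    return split /\s*\t\s*|\s{2,}/, $line;
}

# Made-up sample line in the shape described in the thread.
my $sample = "1  1  1\tOwnership   FeeSimple\t\tFee Simple\n";
my @fields = parse_line($sample);
print "$fields[5]\n";
```

If a text field can itself contain two consecutive spaces, this breaks in the same way the regular expression does.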


On Jan 17, 2006, at 9:28 PM, Eric wrote:

Thanks for the suggestion.  Unfortunately, some of the data later
in the file has embedded spaces in the text making split unworkable.

I just went with the fixed column thing and "corrected" the tabs. :-)
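
For anyone who would rather handle the tabs in code than fix the file, one possible sketch is to expand tabs to spaces before calling unpack. The tab stop of 8 is an assumption, last_field is a hypothetical helper, and the sample line is made up. It uses A rather than a in the template so trailing padding is stripped from each field.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Tabs;    # core module; exports expand() and $tabstop

$tabstop = 8;      # assumption: tabs in the file align to 8-column stops

# Hypothetical helper: expand tabs, then cut the line into fixed-width
# columns using the widths from the original unpack call.
sub last_field {
    my ($line) = @_;
    chomp $line;
    my $expanded = expand($line);
    my ($f1, $f2, $f3, $junk, $junk2, $data) =
        unpack("A10 A9 A8 A26 A18 A35", $expanded);
    return $data;
}

# Made-up sample: a tab sits inside the padding after "Ownership".
my $sample = sprintf("%-10s%-9s%-8s", 1, 1, 1)
           . "Ownership\t" . (" " x 13)
           . sprintf("%-18s%-35s", "FeeSimple", "Fee Simple") . "\n";
print last_field($sample), "\n";
```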


On Tuesday 17 January 2006 9:01 pm, Toby DiPasquale wrote:
On Tue, Jan 17, 2006 at 06:55:22PM -0500, Eric wrote:
I know there are perl hackers about...

I have a file - a typical line might look like this:

1 1 1 Ownership FeeSimple Fee Simple

The desired data is the last field, indexed by the first three; the other
two can be ignored. The "hitch", if you will, is that I'm using unpack to
get the fields like this:

open EAT, "<Translation.txt" or die "bummer dude - no file?\n" ;
while (<EAT>) {
  ($f1, $f2, $f3, $junk, $junk2, $data) = unpack("a10 a9 a8 a26 a18 a35", $_) ;
  print $data ;
}

I don't know about Perl, but in Ruby this is:

#!/usr/bin/env ruby
IO.foreach("Translation.txt") do |line|
  f1, f2, f3, j1, j2, data = line.split
  puts data
end

I believe that Perl has a split() function, as well. Maybe use that
instead of unpack?
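
Perl does have split(). A rough equivalent of the Ruby above might look like the sketch below; split_six is a hypothetical helper and the sample line is made up. The third argument limits the split to six fields, so the last field keeps any embedded spaces. This only helps when the embedded spaces are in the final field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: whitespace-split a line into at most six fields.
# With a limit of 6, everything after the fifth delimiter stays in field six.
sub split_six {
    my ($line) = @_;
    chomp $line;
    return split ' ', $line, 6;
}

my @f = split_six("1 1 1 Ownership FeeSimple Fee Simple\n");
print "$f[5]\n";
```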

Toby DiPasquale
# Eric A Lucas
# ------------
# "Oh, I have slipped the surly bond of earth
# and danced the skies on laughter-silvered wings...
# -- John Gillespie Magee Jr.
