Bonnie Aumann on 20 May 2009 19:36:34 -0700



Fuzzy Matching resources?

  • From: Bonnie Aumann <aumannb@gmail.com>
  • To: Philly Lambda <philly-lambda@googlegroups.com>
  • Subject: Fuzzy Matching resources?
  • Date: Wed, 20 May 2009 19:36:22 -0700 (PDT)
  • Mailing-list: list philly-lambda@googlegroups.com; contact philly-lambda+owner@googlegroups.com
  • Reply-to: philly-lambda@googlegroups.com
  • Sender: philly-lambda@googlegroups.com
  • User-agent: G2/1.0

Hey all,

I have a project that requires cleaning up a list of student-entered
teacher names. For example, "Mrs. Powell", "MRS. Powell", and
"Ms. Powle" should all become "Mrs. Jane Powell". I already have the
list of correct names.

The whole project involves combining 30 or so Excel files, about 6-20k
lines, and de-duping them. Fixing this single field is one part of the
de-duping. Kyle Burton suggested using the Fuzzy Matching module.
I'm a rank amateur at programming, let alone Perl specifically, so I'm
hoping someone has resources or examples that could help me out.
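
For illustration, the name cleanup described above could be sketched
roughly as follows. This is Python rather than Perl, and it uses the
standard library's difflib instead of the Fuzzy Matching module
mentioned above, so treat it as a stand-in for the idea and not as that
module's API. The CANONICAL list, the title pattern, and the helper
names normalize and best_match are all made up for the example, and the
sketch assumes each surname appears only once in the list of correct
names.

```python
import difflib
import re

# Hypothetical list of correct names; the real one would come from the project.
CANONICAL = ["Mrs. Jane Powell", "Mr. Robert Smith"]

# Assumed set of leading titles to ignore when comparing names.
TITLE = re.compile(r"^(mrs|ms|mr|miss|dr)\.?\s+", re.IGNORECASE)


def normalize(name):
    """Lowercase, collapse runs of whitespace, and drop a leading title."""
    name = " ".join(name.split()).lower()
    return TITLE.sub("", name)


def best_match(raw, canonical=CANONICAL, cutoff=0.6):
    """Return the canonical name whose surname is closest to the entry's, or None."""
    parts = normalize(raw).split()
    if not parts:
        return None
    # Map each canonical surname to its full name (assumes unique surnames).
    by_surname = {normalize(c).split()[-1]: c for c in canonical}
    hits = difflib.get_close_matches(parts[-1], list(by_surname), n=1, cutoff=cutoff)
    return by_surname[hits[0]] if hits else None


for entry in ["Mrs. Powell", "MRS. Powell", "Ms. Powle"]:
    print(entry, "->", best_match(entry))
```

Entries that come back as None would still need checking by hand, and
the cutoff of 0.6 is only difflib's default, not a tuned value.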

Thanks,

Bonnie