Meng Weng Wong on Wed, 7 Nov 2001 12:36:19 -0500



Re: generating regexes?


On Wed, Nov 07, 2001 at 12:08:42PM -0500, Kyle R . Burton wrote:
| I've been reading up on machine learning.  One of the things I've been
| toying with is the ability to generate a regex to match a given example
| set of data.  My particular examples would be for things like phone numbers,
| or zip codes, or information that consists of single data elements.

this code looks very interesting.  here's a thought off the
top of my head, dating from 1999 when I was doing coursework
in a related field.

i wonder if it would be feasible to evolve your regular
expressions with a genetic algorithm, in the usual
hill-climbing way.  instead of refining a single pattern, try
keeping a stable of, say, a hundred candidate patterns, each
of which may match only a subset of the input data.  refine
each pattern through random mutations, so that a pattern from
a parent generation produces more than one child.  the
selection pressure would be based primarily on success
against the input data and secondarily on the length of the
regexp (shorter being better).
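
a rough sketch of what that loop might look like, in Python rather than anything from the original code.  everything named here is an assumption made for illustration: the TOKENS and QUANTS building blocks, the insert/delete/replace mutations, three children per parent, and keeping parents in the pool alongside their children.  fitness is a tuple so that the match count dominates and pattern length only breaks ties.

```python
import random
import re

# Hypothetical building blocks; the post does not specify any.
TOKENS = [r"\d", r"[a-z]", r"[A-Z]", "-", " ", r"\(", r"\)"]
QUANTS = ["", "", "+", "?", "{3}", "{4}"]


def random_gene(rng):
    """One atom plus an optional quantifier, always a valid regex piece."""
    return rng.choice(TOKENS) + rng.choice(QUANTS)


def to_regex(genes):
    return "".join(genes)


def fitness(genes, examples):
    """Primary: examples fully matched. Secondary: shorter pattern wins."""
    text = to_regex(genes)
    pattern = re.compile(text)
    hits = sum(1 for s in examples if pattern.fullmatch(s))
    return (hits, -len(text))


def mutate(genes, rng):
    """Return a child that differs from its parent by one random edit."""
    child = list(genes)
    op = rng.choice(["insert", "delete", "replace"])
    if op == "insert":
        child.insert(rng.randrange(len(child) + 1), random_gene(rng))
    elif op == "delete" and len(child) > 1:
        del child[rng.randrange(len(child))]
    else:
        child[rng.randrange(len(child))] = random_gene(rng)
    return child


def evolve(examples, pop_size=100, children=3, generations=50, seed=0):
    """Keep a stable of patterns; each parent yields several mutated children."""
    rng = random.Random(seed)
    population = [[random_gene(rng)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(p, rng) for p in population for _ in range(children)]
        # Parents compete with their children, so the best score never drops.
        pool = population + offspring
        pool.sort(key=lambda g: fitness(g, examples), reverse=True)
        population = pool[:pop_size]
    return to_regex(population[0])


if __name__ == "__main__":
    zips = ["19104", "08540", "10001"]
    print(evolve(zips))
```

one caveat with this sketch: with only positive examples, nothing stops the winner from being too general, so a real version would probably also want examples that must not match, or a penalty for overly broad tokens.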

**Majordomo list services provided by PANIX <URL:http://www.panix.com>**
**To Unsubscribe, send "unsubscribe phl" to majordomo@lists.pm.org**