This paper presents a novel method to infer regular expressions from positive examples. The method consists of a candidate’s construction phase and an optimization phase. We first propose multiscaling sample augmentation to capture the cycle patterns from single examples during the candidate’s construction phase. We then use common substrings to build regular expressions that capture patterns across multiple examples, and we show this algorithm is more general than those based on common prefixes or suffixes. Furthermore, we propose a pruning mechanism to improve the efficiency of useful common substring mining, which is an important part of common substring-based expression building algorithm. Finally, in the optimization phase, we model th...
We consider the long-standing problem of the automatic generation of regular expressions for text ex...
Summarization: Discovering sequential patterns is an important problem in data mining with a host of...
Regular expressions are routinely used in a variety of different application domains. But building a...
This paper presents a novel method to infer regular expressions from positive examples. The method c...
This paper presents a novel method to infer regular expressions from positive examples. The method c...
This paper presents a novel method to infer regular expressions from positive examples. The method c...
AbstractWe describe algorithms that directly infer very simple forms of 1-unambiguous regular expres...
A learning algorithm is developed for a class of regular expressions equivalent to the class of all ...
A large class of entity extraction tasks from text that is either semistructured or fully unstructur...
AbstractWe describe algorithms that directly infer very simple forms of 1-unambiguous regular expres...
We explore the practical feasibility of a system based on genetic programming (GP) for the automatic...
We propose a system for the automatic generation of regular expressions for text-extraction tasks. T...
Summarization: Discovering sequential patterns is an important problem in data mining with a host of...
Given a regular expression R and a string Q, the regular expression parsing problem is to determine ...
In the recent years, Machine Learning techniques have emerged as a new way to obtain solutions for a...
We consider the long-standing problem of the automatic generation of regular expressions for text ex...
Summarization: Discovering sequential patterns is an important problem in data mining with a host of...
Regular expressions are routinely used in a variety of different application domains. But building a...
This paper presents a novel method to infer regular expressions from positive examples. The method c...
This paper presents a novel method to infer regular expressions from positive examples. The method c...
This paper presents a novel method to infer regular expressions from positive examples. The method c...
AbstractWe describe algorithms that directly infer very simple forms of 1-unambiguous regular expres...
A learning algorithm is developed for a class of regular expressions equivalent to the class of all ...
A large class of entity extraction tasks from text that is either semistructured or fully unstructur...
AbstractWe describe algorithms that directly infer very simple forms of 1-unambiguous regular expres...
We explore the practical feasibility of a system based on genetic programming (GP) for the automatic...
We propose a system for the automatic generation of regular expressions for text-extraction tasks. T...
Summarization: Discovering sequential patterns is an important problem in data mining with a host of...
Given a regular expression R and a string Q, the regular expression parsing problem is to determine ...
In the recent years, Machine Learning techniques have emerged as a new way to obtain solutions for a...
We consider the long-standing problem of the automatic generation of regular expressions for text ex...
Summarization: Discovering sequential patterns is an important problem in data mining with a host of...
Regular expressions are routinely used in a variety of different application domains. But building a...