We examine document spanners, a formal framework for information extraction that was introduced by Fagin et al. (PODS 2013). A document spanner is a function that maps an input string to a relation over spans (intervals of positions of the string). We focus on document spanners that are defined by regex formulas, which are basically regular expressions that map matched subexpressions to corresponding spans, and on core spanners, which extend the former by standard algebraic operators and string equality selection. First, we compare the expressive power of core spanners to three models - namely, patterns, word equations, and a rich and natural subclass of extended regular expressions (regular expressions with a repetition operator). These ...
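The definition above (a spanner maps a string to a relation over spans) can be illustrated with a small brute-force sketch. It is not the construction from the paper: the helper names `spans_matching` and `select_equal` are hypothetical, Python's `re` stands in for a regex formula of the shape Σ* x{γ} Σ* with a single variable, and spans are written 1-based and end-exclusive.

```python
import re
from itertools import product

def spans_matching(doc, pattern):
    """Return every span (i, j) of doc, 1-based and end-exclusive,
    whose content fully matches pattern.

    Hypothetical helper: enumerates all spans by brute force, so that
    overlapping matches are kept (unlike re.finditer)."""
    n = len(doc)
    return {
        (i + 1, j + 1)
        for i in range(n + 1)
        for j in range(i, n + 1)
        if re.fullmatch(pattern, doc[i:j])
    }

def select_equal(doc, relation):
    """String equality selection (hypothetical helper): keep only the
    pairs of spans whose contents are the same string."""
    def content(span):
        i, j = span
        return doc[i - 1:j - 1]
    return {(x, y) for x, y in relation if content(x) == content(y)}

doc = "abab"
x_spans = spans_matching(doc, r"ab")          # spans whose content is "ab"
pairs = select_equal(doc, product(x_spans, x_spans))
print(sorted(x_spans))
print(sorted(pairs))
```

The enumeration is quadratic in the document length and is meant only to make the span semantics concrete; efficient evaluation is a separate question.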
Most modern implementations of regular expression engines allow the use of variables (also called ba...
Most modern implementations of regular expression engines allow the use of variables (also called ba...
This paper investigates regex CQs with string equalities (SERCQs), a subclass of core spanners. As s...
We examine document spanners, a formal framework for information extraction that was introduced by F...
Document spanners are a formal framework for information extraction that was introduced by [Fagin, K...
Document spanners are a formal framework for information extraction that was introduced by Fagin, Ki...
A document spanner models a program for Information Extraction (IE) as a function that takes as inpu...
An intrinsic part of information extraction is the creation and manipulation of relations extracted...
We investigate the complexity of evaluating queries in Relational Algebra (RA) over the relations ex...
The present paper investigates the dynamic complexity of document spanners, a formal framework for i...
The present paper investigates the dynamic complexity of document spanners, a formal framework for i...
This paper investigates regex CQs with string equalities (SERCQs), a subclass of core spanners. As s...
Regular expressions and automata models with capture variables are core tools in rule-based informat...
Regular expressions with capture variables, also known as regex-formulas, extract relations of spans ...
Regular expressions and automata models with capture variables are core tools in rule-based informat...