Regular expressions with capture variables, also known as regex-formulas, extract relations of spans (intervals identified by their start and end indices) from text. In turn, the class of regular document spanners is the closure of the regex-formulas under the Relational Algebra. We investigate the computational complexity of querying text by aggregate functions, such as sum, average, and quantile, on top of regular document spanners. To this end, we formally define aggregate functions over regular document spanners and analyze the computational complexity of exact and approximate computation. More precisely, we show that in a restricted case, all studied aggregate functions can be computed in polynomial time. In general, however, even though exact c...
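To make the notions of span and aggregate concrete, the following is a minimal sketch, not the construction studied in the paper. It assumes Python's `re` module as a stand-in for a regex-formula, a named group `x` as the capture variable, and 0-based half-open `(start, end)` pairs as spans; the helper names `extract_spans` and `aggregate` and the sample document are hypothetical. Note that `re.finditer` returns only non-overlapping leftmost matches, whereas a spanner yields every valid assignment of spans, so this illustrates the idea only.

```python
import re
from statistics import mean, median


def extract_spans(pattern, text, var):
    """Return the (start, end) span bound to capture variable `var` in each match."""
    return [m.span(var) for m in re.finditer(pattern, text)]


def aggregate(text, spans):
    """Interpret each span's substring as an integer and aggregate the values."""
    values = [int(text[start:end]) for start, end in spans]
    return {"sum": sum(values), "avg": mean(values), "median": median(values)}


# Hypothetical document and pattern: capture every run of digits in variable x.
doc = "apples 3, pears 12, plums 7"
spans = extract_spans(r"(?P<x>\d+)", doc, "x")
result = aggregate(doc, spans)
```

Here the relation of spans is computed first and the aggregate is evaluated over it afterwards, which mirrors the two-stage view in the abstract: extraction by a spanner, then aggregation on top of the extracted relation.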
We investigate the problem of how to extend constraint query languages with aggregate operat...
Practical database query languages are usually equipped with some aggregate functions. For example, ...
This paper investigates regex CQs with string equalities (SERCQs), a subclass of core spanners. As s...
We investigate the complexity of evaluating queries in Relational Algebra (RA) over the relations ex...
A document spanner models a program for Information Extraction (IE) as a function that takes as inpu...
We examine document spanners, a formal framework for information extraction that was introduced by F...
We present an algorithm for searching regular expression matches in compressed text. The algorithm r...
In this paper, we study the succinctness of regular expressions (REs) extended with interlea...
Regular expressions constitute a fundamental notion in formal language theory and are frequently use...
Regular expressions and automata models with capture variables are core tools in rule-based informat...
Aggregate factors (that is, those based on aggregate functions such as SUM, AVERAGE, AND etc.) in pr...
An intrinsic part of information extraction is the creation and manipulation of relations extracted...
In previous work [10], we considered algorithms related to the statistics of matches with words and...