We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the results with a small delay between any two consecutive results. Our goal is an algorithm that is tractable in combined complexity, i.e., in the sizes of the input document and the VA, while ensuring the best possible data complexity bounds in the input document size, i.e., constant delay in the document size. Several recent works at PODS'18 pr...
Document spanners are a formal framework for information extraction that was introduced by Fagin, Ki...
Document spanners are a formal framework for information extraction that was introduced by [Fagin, K...
Marx (STOC 2010, J. ACM 2013) introduced the notion of submodular width of a conjunctive query (CQ) ...
Regular expressions and automata models with capture variables are core tools in rule-based informat...
We survey some of the recent results about enumerating the answers to queries ...
The present paper investigates the dynamic complexity of document spanners, a formal framework for i...
We study the problem of enumerating the satisfying valuations of a circuit whi...
We investigate the complexity of evaluating queries in Relational Algebra (RA) over the relations ex...
In this article, we study the problem of enumerating the models of DNF formula...
We introduce annotated grammars, an extension of context-free grammars which allows annotations on t...
Some of the most relevant document schemas used online, such as XML and JSON, have a nested format. ...
Recently, Creignou et al. (Theory Comput. Syst. 2017) have introduced the clas...