Inferring an appropriate DTD or XML Schema Definition (XSD) for a given collection of XML documents essentially reduces to learning deterministic regular expressions from sets of positive example words. Unfortunately, as we will show, no algorithm can learn the complete class of deterministic regular expressions from positive examples only. The regular expressions occurring in practical DTDs and XSDs, however, are such that every alphabet symbol occurs only a small number of times. In practice, it therefore suffices to learn the subclass of deterministic regular expressions in which each alphabet symbol occurs at most k times, for some small k. We refer to such expressions as k-occurrence regular expressions (k-OREs for sh...
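To make the occurrence bound concrete, the following minimal sketch computes the smallest k for which a given content model is a k-occurrence expression, by counting how often each element name appears. It is an illustration only, not the inference algorithm of the paper: the function name `occurrence_bound` and the assumption that element names are plain identifiers (letters, digits, underscores) are ours, and the sketch does not check whether the expression is deterministic.

```python
import re
from collections import Counter


def occurrence_bound(expr: str) -> int:
    """Return the smallest k such that every symbol occurs at most k times.

    Assumption (ours): symbols are identifier-like element names; operators
    such as ',', '|', '?', '*', '+' and parentheses are ignored.
    """
    symbols = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", expr)
    counts = Counter(symbols)
    # An expression without symbols (e.g. the empty string) has bound 0.
    return max(counts.values(), default=0)


# A DTD-style content model in which every element name occurs once.
print(occurrence_bound("title, author+, (year | date)?"))

# Here 'a' occurs three times, so the expression needs k >= 3.
print(occurrence_bound("(a | b)*, a, (a | b)"))
```

Under these assumptions the first expression has bound 1 and the second has bound 3, which is the sense in which practical content models tend to need only a small k.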
Summarization: XML is rapidly emerging as the new standard for data representation and exchange on t...
We propose regular expression pattern matching as a core feature for programming languages for manip...
We investigate the complexity of deciding whether a given regular language can...
We consider the problem to infer a concise Document Type Definition (DTD) for a given set of XML-doc...
Although the presence of a schema enables many optimizations for operations on XML documents, recent...
The paper shows that concise DTDs can be inferred from XML documents by effectively constructing cor...
Deterministic regular expressions are widely used in XML processing. For instance, all regular expre...
We study the problem of generalizing from a finite sample to a language taken from a predefined lang...
Deterministic regular expressions are widely used in XML processing. For insta...
Most modern libraries for regular expression matching allow back-references (i.e., repetition opera...
XML is rapidly emerging as the new standard for data representation and exchange on the Web. Documen...
Regular expression pattern matching for XML. We propose regular expression pattern matching as a core...
XML is a widely used technology. Although in most real life applications XML data is require...
We examine two generalizations of 1-deterministic regular languages that are used for the content mo...
We examine two generalizations of 1-deterministic regular languages that are used for the co...