In this paper, we describe some techniques to learn probabilistic k-testable tree models, a generalization of the well-known k-gram models, that can be used to compress or classify structured data. These models are easy to infer from samples and allow for incremental updates. Moreover, as shown here, backing-off schemes can be defined to solve data sparseness, a problem that often arises when using trees to represent the data. These features make them suitable to compress structured data files at a better rate than string-based methods. This work was supported by the Spanish Comisión Interministerial de Ciencia y Tecnología through Grants TIC2003-08681-C02 and TIC2003-08496-C04.
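The backing-off idea mentioned in this abstract can be illustrated in the simpler string (k-gram) case that k-testable tree models generalize: when a full-length context has never been followed by the symbol of interest, fall back to a shorter context with a penalty. The sketch below is illustrative only; the function names, the dictionary layout, and the stupid-backoff-style penalty `alpha` are assumptions, not the scheme defined in the paper.

```python
from collections import defaultdict

def train_kgram(sequences, k):
    """Count, for each (k-1)-symbol context, how often each symbol follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        padded = ("<s>",) * (k - 1) + tuple(seq) + ("</s>",)
        for i in range(k - 1, len(padded)):
            context = padded[i - k + 1:i]
            counts[context][padded[i]] += 1
    return counts

def backoff_prob(models, symbol, context, alpha=0.4):
    """Estimate P(symbol | context); if the continuation was never seen
    after the full context, back off to shorter suffixes with penalty alpha."""
    for j in range(len(context) + 1):
        ctx = context[j:]
        table = models[len(ctx) + 1].get(ctx)
        if table and table.get(symbol, 0) > 0:
            return (alpha ** j) * table[symbol] / sum(table.values())
    return 0.0  # symbol unseen even as a unigram

# Toy corpus; train models of every order up to k = 3.
corpus = [["a", "b", "a"], ["a", "b", "b"], ["b", "a"]]
models = {k: train_kgram(corpus, k) for k in (1, 2, 3)}

p_seen = backoff_prob(models, "a", ("a", "b"))   # trigram context was observed
p_back = backoff_prob(models, "b", ("b", "a"))   # forces backoff to ("a",)
```

In the tree setting of the paper, the analogue of a context is a bounded-depth fork above a node rather than a string prefix, but the fallback-with-penalty structure is the same.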
In this paper, we explore the use of Random Forests (RFs) in the structured language model (SLM), wh...
In this paper, we aim at correcting distributions of noisy samples in order to...
We present an empirical study of the applicability of Probabilistic Lexicalized Tree Insertion Gramm...
Probabilistic k-testable models (usually known as k-gram models in the case of strings) can be easil...
In front of the large increase of the available amount of structured data (suc...
In this paper, the identification of stochastic regular languages is addressed. For this purpose, we...
We focus on the classical problem in grammatical inference of learning stochastic tree languages fr...
Applications of probabilistic grammatical inference are limited due to time an...
Classification learning is a type of supervised machine learning technique that uses a classificatio...
Recently, an algorithm, DEES, was proposed for learning rational stochastic tr...
We consider the problem of learning stochastic tree languages from a sample of...
In this paper, we compare three different approaches to build a probabilistic context-free grammar f...
The possibility of using stochastic context-free grammars (SCFGs) in language modeling (LM) has bee...
We develop a new class of hierarchical stochastic models called spatial random trees (SRTs) which ad...