Conference with proceedings and peer review. This article describes a comparative study of language models in which the evaluation protocol was set by AUPELF-UREF. We pay particular attention to the comparison between two word-clustering methods, which are necessary in the design of the corresponding language models. The first classification follows a linguistic and theoretical method, and the second is based on an optimization method. Both methods are evaluated through the Shannon game. The vocabulary contains 20,000 words, the training corpus consists of two years of the Le Monde newspaper (42M words), and the test corpus (400,000 words) is extracted from 6 years of Le Monde Diplomatique. First evaluations show an improve...
© 2015 IEEE. Compounding is one of the most productive word formation processes in many languages an...
The Brown clustering algorithm (Brown et al., 1992) is widely used in natural language processing (...
In the speech recognition of highly inflecting or compounding languages, the traditional word-based ...
International conference with proceedings and peer review. International audience. This paper focuses on...
This chapter describes a novel multistage method for linguistic clustering of large collections of t...
I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, includ...
Automatic inference of a classification of words has been carried out by several researchers recentl...
In this paper, we propose a new language model based on dependent word sequences organized in a mul...
International audience. In this paper, we propose a new clustering model for speaker diarization. A ma...
This paper presents an exploratory data analysis in lexical acquisition for adjective classes using...
In state-of-the-art large vocabulary automatic recognition systems, a large statistical language mod...
Thesis (PhD)--Stellenbosch University, 2019. ENGLISH ABSTRACT: A pronunciation dictionary is one of t...
The thesis deals with different aspects of automatic speech recognition. After an introduction, whic...
International audience. In this work, we present a new method for clustering words into equivalence cl...
International audience. We present a technique to improve out-of-domain statistical parsing by reducin...