This paper describes a new methodology for defining speech corpora from Internet documents, with the aim of recording a large speech database dedicated to training and testing acoustic models for speech recognition. The first section presents the Web robot in charge of collecting Web pages from the Internet; the mechanism that filters Web text into French sentences is then explained. Some information about the corpus organization (90% for training and 10% for testing) is given. The third section presents the phoneme distribution of the corpus and compares it with other studies of the French language. Finally, tools and planning for recording the speech database with more than one hundred speakers...
The three pillars of an automatic speech recognition system are the lexicon, the language model ...
The Bavarian Archive for Speech Signals has released three new speech corpora for both industrial an...
In statistical language modelling research, there is a lack of huge text corpora, especially for s...
Spoken language speech recognition systems need better understanding of natura...
The three pillars of an automatic speech recognition system are the lexicon, the language model and t...
This paper presents the results of the NEOLOGOS project: a children database and an optimized adult ...
This paper describes the setting up of a resource database for research and evaluation in the domain...
Language models used in current automatic speech recognition systems are trained on general-purpose ...
The WWW is a ubiquitous, mature communication infrastructure for business and scientific informatio...
This paper describes methods that exploit stenographic transcripts of the German parliament to impro...
This paper discusses the adaptation of speech recognition vocabularies for aut...
Language registers are a strongly perceptible characteristic of texts and spee...
The construction of a speech recognition system requires a recorded set of phr...