Articulatory data promises advances in our understanding of speech production and in speech technologies. However, it is more expensive and difficult to obtain than audio data, so data collection must be carefully planned. This paper presents a method for designing an articulatory speech corpus comparable to the widely used TIMIT corpus for languages other than English, using Italian as a case study. This data-driven method searches freely available online text corpora for a set of sentences that provides broad phonetic coverage while remaining small enough to be read in a single session, which is important given the often invasive nature of articulatory data collection. Sentences are first phonemically tran...
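The sentence search described above can be illustrated with a small greedy coverage sketch. This is not the paper's algorithm, which the truncated abstract does not specify: the choice of diphones as the coverage unit, the greedy strategy, and the names `diphones` and `greedy_select` are all assumptions made for illustration. The input is assumed to be sentences that have already been phonemically transcribed.

```python
from typing import Dict, List, Set, Tuple


def diphones(phonemes: List[str]) -> Set[Tuple[str, str]]:
    """Return the set of adjacent phoneme pairs in one transcribed sentence."""
    return set(zip(phonemes, phonemes[1:]))


def greedy_select(transcribed: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Greedily pick sentences that add the most not-yet-covered diphones.

    Hypothetical helper: stops when the size budget is reached or when no
    remaining sentence contributes a new diphone.
    """
    units = {sentence: diphones(p) for sentence, p in transcribed.items()}
    covered: Set[Tuple[str, str]] = set()
    chosen: List[str] = []
    while len(chosen) < max_sentences:
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(units), key=lambda s: len(units[s] - covered), default=None)
        if best is None or not (units[best] - covered):
            break
        chosen.append(best)
        covered |= units.pop(best)
    return chosen


# Toy example with made-up transcriptions (not taken from the paper).
toy_corpus = {
    "la casa": ["l", "a", "k", "a", "z", "a"],
    "la": ["l", "a"],
    "casa": ["k", "a", "z", "a"],
}
selected = greedy_select(toy_corpus, max_sentences=2)
```

In this toy case the longest sentence already contains every diphone found in the other two, so the selection stops after one sentence even though the budget allows two. A real design would also need to weigh sentence length and readability, which this sketch ignores.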
Modelling the process that a listener actuates in deriving the words intended by a speaker requires ...
Most speech and language technologies are trained with massive amounts of spee...
A major hurdle in data-driven research on typology is having sufficient data in many languages to dr...
A method is proposed for compiling a corpus of phonetically-rich triphone sentences; i.e., sentences...
In this paper we introduce a new Italian dataset consisting of simultaneous recordings of continuous...
The LaMIT database consists of recordings of 100 Italian sentences. The sentences in the database we...
This paper presents an extension to a very low-resource parallel corpus collec...
This document describes the results of an activity that was conducted at ITC-Irst for the design of ...
Text-to-speech synthesis (TTS) has progressed to such a stage that given a large, clean, phoneticall...
This paper deals with the design of a speech corpus for a concatenation-based text-to-speech (TTS) s...
The human speech apparatus is a rich source of information and offers many cues in the speech signal...
The present paper outlines the Vergina speech database, which was developed in support of research a...
This paper deals with the design of a speech corpus for a concatenation-based text-to-speech (TTS) s...
Designing text scripts that cover enough phonetic units and prosodic phenomena is very important whe...
A novel framework for automatic articulatory-acoustic feature extraction has been developed for enha...