In this paper, we present an automatic approach for aligning speech signals to corresponding text documents. To this end, we propose to first use text-to-speech synthesis (TTS) to obtain a speech signal from the textual representation. Subsequently, both speech signals are transformed into sequences of audio features, which are then time-aligned using a variant of greedy dynamic time warping (DTW). The proposed approach is efficient (with linear running time), computationally simple, and does not rely on a prior training phase, as is necessary when using HMM-based approaches. It benefits from the combination of a) a novel type of speech feature that is correlated with the phonetic progression of speech, b) a greedy left-to-right variant ...
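The greedy alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the speech features, the local distance, or the exact greedy rule, so the Euclidean distance, the three-step neighbourhood, and the names `greedy_dtw` and `euclidean` are all assumptions made for illustration. The sketch only shows why a greedy left-to-right pass runs in linear time: it takes at most `len(a) + len(b) - 2` steps and never fills the full cost matrix.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(x: Vector, y: Vector) -> float:
    """Local distance between two feature vectors (an assumed choice)."""
    return sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5


def greedy_dtw(
    a: Sequence[Vector],
    b: Sequence[Vector],
    dist: Callable[[Vector, Vector], float] = euclidean,
) -> List[Tuple[int, int]]:
    """Greedy left-to-right alignment of two feature sequences.

    Hypothetical illustration: at each step, move to whichever neighbouring
    cell (diagonal, down, or right) has the smallest local distance.
    Ties favour the diagonal step. Returns a list of (i, j) index pairs.
    """
    if not a or not b:
        return []
    i, j = 0, 0
    last_i, last_j = len(a) - 1, len(b) - 1
    path = [(i, j)]
    while (i, j) != (last_i, last_j):
        candidates = []
        if i < last_i and j < last_j:
            candidates.append((i + 1, j + 1))  # advance both sequences
        if i < last_i:
            candidates.append((i + 1, j))      # advance only a
        if j < last_j:
            candidates.append((i, j + 1))      # advance only b
        # Pick the locally cheapest step; no backtracking is performed.
        i, j = min(candidates, key=lambda c: dist(a[c[0]], b[c[1]]))
        path.append((i, j))
    return path


if __name__ == "__main__":
    # Toy one-dimensional "features": b is a time-stretched version of a.
    synthesized = [[0.0], [1.0], [2.0], [3.0]]
    recorded = [[0.0], [0.0], [1.0], [2.0], [2.0], [3.0]]
    print(greedy_dtw(synthesized, recorded))
```

Unlike full DTW, such a greedy pass cannot recover from a locally wrong step, which is presumably why the abstract stresses features that track the phonetic progression of speech closely.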
This article investigates the correlations between multimedia objects (particularly spee...
The synchronization of text transcripts with audio tracks is typically solved by forced alignment at...
Synchronisation of a voice recording with the corresponding text is a common task in speech and musi...
The purpose of this work is to research existing text-to-speech aligning algorithms. We chose an imp...
Speech synthesis and recognition are the basic techniques used for man-machine communication. This t...
For many years, film and television have dominated the entertainment industry. Recently, with the in...
The duration of a speech passage can be altered using audio time-scale modification techniques. Time...
In language p...
Lightly supervised acoustic modeling in under-resourced languages raises new issues due to ...
Most studies in automatic synchronization of speech and transcription focus on the synchronization a...
The purpose of Text to Speech (TTS/T2S) synthesis is to provide an artificial voice for people and ...
Speech databases are an important issue in the study of speech communication. In particular, time-a...