This paper describes a technique for aligning a large text with a large audio recording. The technique comprises three steps: segmenting the audio into homogeneous speech segments; recognizing each speech fragment with a speech recognizer; and describing each fragment by keywords selected from the recognizer output on the basis of their acoustic confidence score and their salience with respect to the other speech fragments. The sentences of the text are described by the same keywords. A global alignment between the text and the audio that uses only these keywords gives a rough correspondence between the sentences of the text and the audio fragments. The next recognition pass is based on the finite state automat...
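The keyword selection and the rough global alignment described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the confidence threshold `min_conf`, the use of fragment frequency as a stand-in for salience, and the shared-keyword count as the alignment score are all assumptions made for illustration.

```python
from collections import Counter


def select_keywords(fragments, min_conf=0.8, max_frag_freq=1):
    """Pick keywords per fragment (hypothetical criteria).

    fragments: list of lists of (word, confidence) pairs from a recognizer.
    A word is kept if its confidence is at least min_conf and it occurs
    in at most max_frag_freq fragments (a crude proxy for salience).
    """
    frag_freq = Counter()
    for frag in fragments:
        frag_freq.update({w for w, c in frag if c >= min_conf})
    return [
        {w for w, c in frag if c >= min_conf and frag_freq[w] <= max_frag_freq}
        for frag in fragments
    ]


def sentence_keywords(sentences, vocabulary):
    """Describe each sentence by the same keyword vocabulary."""
    return [set(s.lower().split()) & vocabulary for s in sentences]


def align(fragment_keys, sentence_keys):
    """Assign each fragment a sentence index, non-decreasing over time,
    maximizing the total number of shared keywords (dynamic programming)."""
    if not fragment_keys or not sentence_keys:
        return []
    n, m = len(fragment_keys), len(sentence_keys)
    neg_inf = float("-inf")
    score = [[neg_inf] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    for i in range(n):
        # Best score among predecessors ending at a sentence index <= j.
        best_prev, best_j = (0.0, 0) if i == 0 else (neg_inf, 0)
        for j in range(m):
            if i > 0 and score[i - 1][j] > best_prev:
                best_prev, best_j = score[i - 1][j], j
            score[i][j] = best_prev + len(fragment_keys[i] & sentence_keys[j])
            back[i][j] = best_j
    # Trace back from the best final sentence index.
    j = max(range(m), key=lambda k: score[n - 1][k])
    path = [0] * n
    for i in range(n - 1, -1, -1):
        path[i] = j
        j = back[i][j]
    return path


# Toy recognizer output: (word, confidence) pairs per audio fragment.
fragments = [
    [("the", 0.9), ("harbor", 0.95), ("was", 0.5)],
    [("the", 0.9), ("lantern", 0.9), ("flickered", 0.4)],
    [("the", 0.85), ("storm", 0.92)],
]
sentences = [
    "The harbor was quiet",
    "Nobody spoke",
    "A lantern flickered",
    "Then the storm came",
]

frag_keys = select_keywords(fragments)
vocab = set().union(*frag_keys)
sent_keys = sentence_keywords(sentences, vocab)
rough_alignment = align(frag_keys, sent_keys)
```

Here `rough_alignment[i]` is the index of the sentence matched to fragment `i`. The monotonicity constraint reflects the assumption that the audio follows the text in order; a later, finer recognition pass would then refine the boundaries within each rough match.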
This paper describes the ALISA tool, which implements a lightly supervised method for sentence-level...
A multilingual long audio alignment system is presented in the automatic subti...
In language p...
In this paper we address the problem of aligning very long (often more than one hour) audio files t...
The purpose of this work is to research existing text-to-speech aligning algorithms. We chose an imp...
For many years, film and television have dominated the entertainment industry. Recently, with the in...
Lightly supervised acoustic modeling in under-resourced languages raises new issues due to ...
The recent uprise of end-to-end speech translation models requires a new generation of parallel corp...
The synchronization of text transcripts with audio tracks is typically solved by forced alignment at...
We report on an audio retrieval system which lets Internet users efficiently access a large audio da...
The paper presents methods for evaluating the accuracy of alignments between transcriptions and audi...
This paper presents a novel alignment approach for imperfect speech and the corresponding ...
We describe and analyze a discriminative algorithm for learning to align a phoneme sequence of a spe...
Speech annotation is a costly and time consuming process because it requires high accuracy....