This paper presents a state-of-the-art model for transcribing speech in any language into the International Phonetic Alphabet (IPA). Transcription of spoken languages into IPA is an essential yet time-consuming process in language documentation, and even partially automating this process has the potential to drastically speed up the documentation of endangered languages. Like the previous best speech-to-IPA model (Wav2Vec2Phoneme), our model is based on wav2vec 2.0 and is fine-tuned to predict IPA from audio input. We use training data from seven languages in CommonVoice 11.0, transcribed into IPA semi-automatically. Although this training dataset is much smaller than Wav2Vec2Phoneme's, its higher quality lets our model achieve comparable...
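The abstract only states that the model fine-tunes wav2vec 2.0 to predict IPA symbols directly from audio. As a rough illustration of what such a setup can look like, the sketch below fine-tunes a wav2vec 2.0 checkpoint with a CTC head over an IPA character vocabulary using the Hugging Face transformers API. The base checkpoint (facebook/wav2vec2-xls-r-300m), the toy vocabulary, the example transcript, and the file names are assumptions made for illustration, not the paper's actual configuration.

```python
# Hypothetical sketch: fine-tuning wav2vec 2.0 with a CTC head over an IPA
# vocabulary, in the spirit of Wav2Vec2Phoneme-style models. Checkpoint name,
# vocabulary, and audio file are placeholders, not the authors' setup.
import json
import torchaudio
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
    Wav2Vec2ForCTC,
)

# Toy IPA vocabulary; a real system would enumerate every symbol appearing in
# the (semi-automatically produced) IPA transcripts of the training data.
vocab = {"<pad>": 0, "<unk>": 1, "|": 2, "a": 3, "i": 4, "u": 5,
         "p": 6, "t": 7, "k": 8, "ʃ": 9, "ŋ": 10}
with open("ipa_vocab.json", "w", encoding="utf-8") as f:
    json.dump(vocab, f)

tokenizer = Wav2Vec2CTCTokenizer("ipa_vocab.json", unk_token="<unk>",
                                 pad_token="<pad>", word_delimiter_token="|")
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1, sampling_rate=16_000, padding_value=0.0,
    do_normalize=True, return_attention_mask=True)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor,
                              tokenizer=tokenizer)

# Start from a multilingual self-supervised checkpoint and attach a randomly
# initialised CTC output layer sized to the IPA vocabulary.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-xls-r-300m",        # placeholder base model
    vocab_size=len(tokenizer),
    pad_token_id=tokenizer.pad_token_id,
    ctc_loss_reduction="mean",
)
model.freeze_feature_encoder()             # common practice with little data

# One training step on a single (audio, IPA transcript) pair.
waveform, sr = torchaudio.load("example.wav")   # placeholder 16 kHz mono clip
inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000,
                   return_tensors="pt")
labels = tokenizer("tʃiŋka", return_tensors="pt").input_ids  # character-level IPA

model.train()
loss = model(input_values=inputs.input_values, labels=labels).loss
loss.backward()
```

In practice one would wrap this in a standard CTC fine-tuning loop with padding and batching; the point of the sketch is only that IPA transcription can be framed as character-level CTC over a pretrained wav2vec 2.0 encoder.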