We propose a new approach for phoneme mapping in cross-lingual transfer learning for text-to-speech (TTS) in under-resourced languages (URLs), using phonological features from the PHOIBLE database and a language-independent mapping rule. We validated this approach in an experiment in which we pre-trained acoustic models on Dutch, Finnish, French, Japanese, and Spanish, and fine-tuned them with 30 minutes of Frisian training data. With our mapping approach, the synthesized Frisian speech improved in both naturalness and pronunciation accuracy. Because this improvement also depended on the source language, we then investigated criteria for selecting source languages. As an altern...
We provide a systematic review of past studies that use multilingual data for text-to-speech (TTS) o...
In computer-assisted pronunciation training (CAPT), the scarcity of large-scale non-native corpora a...
Deep neural network (DNN) acoustic models can be adapted to under-resourced languages by transferrin...
We compare using a PHOIBLE-based phone mapping method and using phonological features input in transf...
For small-vocabulary applications, a mapped pronunciation lexicon can enable speech recognition in ...
We compare phone labels and articulatory features as input for cross-lingual transfer learning in te...
Exploiting cross-lingual resources is an effective way to compensate for data scarcity of low resour...
This paper presents a novel acoustic modeling technique of large vocabulary automatic speech recogni...
Cross-lingual transfer learning with large multilingual pre-trained models can be an effective appro...
The development of a speech recognition system requires at least three resources: a large labeled sp...
Character-based Neural Network Language Models (NNLM) have the advantage of smaller vocabulary and t...
Only a handful of the world’s languages are abundant with the resources that enable practical applic...
Jiawei Zhao. Current machine translation techniques were developed using predominantly rich resource l...
Over the past decades, speech recognition has dramatically improved in a large variety of applicatio...