A phoneme-based multilingual connectionist temporal classification (CTC) model is easily extended to a new language by concatenating parameters for the new phonemes to the output layer. In this paper, we improve cross-lingual adaptation of phoneme-based CTC models by using phonological information. A universal (IPA) phoneme classifier is first trained on phonological features generated by a phonological attribute detector. When adapting the multilingual CTC model to a new, previously unseen language, phonological attributes of the unseen phonemes are derived from phonology and fed into the phoneme classifier. Posteriors given by the classifier are used to initialize the parameters of the unseen phonemes when extending the mul...
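The initialization step described in the abstract can be illustrated with a small sketch. The abstract is truncated before it states exactly how the posteriors are turned into parameters, so the rule used here (the new output unit is a posterior-weighted average of the existing phoneme units) is an assumption for illustration, not the paper's confirmed method. The function name `extend_output_layer` and its arguments are hypothetical.

```python
from typing import List, Tuple


def extend_output_layer(
    weights: List[List[float]],
    biases: List[float],
    posteriors: List[float],
) -> Tuple[List[List[float]], List[float]]:
    """Append one output unit for an unseen phoneme.

    weights:    one row per existing phoneme in the CTC output layer.
    biases:     one bias per existing phoneme.
    posteriors: classifier posteriors over the existing phonemes, obtained
                by feeding the unseen phoneme's phonological attributes to
                the phoneme classifier.

    ASSUMPTION: the new unit is the posterior-weighted average of the
    existing units. The source abstract does not confirm this rule.
    """
    if not weights:
        raise ValueError("output layer must contain at least one phoneme")
    if len(posteriors) != len(weights) or len(biases) != len(weights):
        raise ValueError("need one posterior and one bias per existing phoneme")

    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posteriors must sum to a positive value")
    probs = [p / total for p in posteriors]  # renormalize defensively

    dim = len(weights[0])
    new_row = [
        sum(p * row[j] for p, row in zip(probs, weights)) for j in range(dim)
    ]
    new_bias = sum(p * b for p, b in zip(probs, biases))

    # Concatenate the new phoneme's parameters to the output layer;
    # the input lists are left unmodified.
    return weights + [new_row], biases + [new_bias]


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]  # two existing phonemes, hidden size 2
    b = [0.0, 2.0]
    new_w, new_b = extend_output_layer(w, b, posteriors=[0.75, 0.25])
    print(new_w[-1], new_b[-1])
```

Under this assumption, an unseen phoneme whose attributes resemble an existing phoneme starts close to that phoneme's output unit rather than from a random initialization, which is the intuition the abstract points to.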
The paper investigates the use of connectionist approaches to discover phonotactic preferences in sy...
The paper is a study on a new class of spatial-temporal evolving fuzzy neural network systems (EFuNN...
Previous research indicates that automatic language identification systems based on phonotactic info...
Multilingual models for Automatic Speech Recognition (ASR) are attractive as they have been shown to...
We investigate multilingual modeling in the context of a deep neural network (DNN) – hidden Markov ...
This paper presents a study on multilingual deep neural network (DNN) based acoustic modeling and i...
In this article, we propose a simple yet effective approach to train an end-to-end speech recognitio...
Phonological-based features (articulatory features, AFs) describe the movements of the vocal organ w...
In this paper, a self-supervised learning pre-trained model is proposed and successfully applied in ...
Multilingual speech recognition systems mostly benefit low-resource languages but suffer degradation...
Deep neural network (DNN) acoustic models can be adapted to under-resourced languages by transferrin...
Different training and adaptation techniques for multilingual Automatic Speech Recognition (ASR) are...
In this paper we present our latest investigation on multilingual bottle-neck (BN) features and its ...