In this paper, we propose a novel method to adapt a context-dependent deep neural network hidden Markov model (CD-DNN-HMM) with only a limited number of parameters by taking into account the underlying factors that contribute to the distorted speech signal. We derive this factorized adaptation method from two perspectives: joint factor analysis and vector Taylor series expansion. Evaluated on Aurora 4, the proposed method achieves 19.0% and 10.6% relative word error rate reductions on test sets B and D, respectively, with only 20 adaptation utterances, and still yields a decent improvement with as few as two adaptation utterances. We also show that the proposed method is better than feature discriminative linear regression (fDLR), an existing DNN ada...
In acoustic modeling, speaker adaptive training (SAT) has been a long-standing technique fo...
The introduction of deep neural networks (DNNs) has advanced the performance of automatic speech rec...
Spoken human–machine interaction in real-world environments requires acoustic models that are robust...
We propose a feature space maximum a posteriori (MAP) linear regression framework to adapt ...
This paper investigates the use of parameterised sigmoid and rectified linear unit (ReLU) hidden act...
In this paper we investigate GMM-derived features recently introduced for adapt...
This paper explores new techniques that are based on a hidden‐layer linear transformation for fast s...
The problem of speaker and channel adaptation in deep neural network (DNN) based automatic speech re...
Adaptation to speaker variations is an essential component of speech recognition systems. One common...
Recently, context-dependent (CD) deep neural network (DNN) hidden Markov models (HMMs) have been wid...
A technique is proposed for the adaptation of automatic speech recognition sys...
Recent progress in acoustic modeling with deep neural networks has significantly improved the perform...
We investigate two strategies to improve the context-dependent deep neural network hidden Markov mod...