Speaker adaptive training (SAT) is a well-studied technique for Gaussian mixture acoustic models (GMMs). We recently proposed performing SAT for deep neural networks (DNNs), with speaker i-vectors applied in feature learning. The resulting SAT-DNN models significantly outperform DNNs in terms of word error rate (WER). In this paper, we present several methods to further improve and extend SAT-DNN. First, we conduct a detailed analysis of i-vector extractor training and flexible feature fusion. Second, we extend the SAT-DNN approach to tasks including bottleneck feature (BNF) generation, convolutional neural network (CNN) acoustic modeling, and multilingual DNN-based feature extraction. Third, for transcribing multimedia data, ...
Adaptation to speaker variations is an essential component of speech recognition systems. One common...
In this paper, we present two techniques for improving the recognition accuracy of multilayer perceptron n...
In acoustic modeling, speaker adaptive training (SAT) has been a long-standing technique fo...
We investigate the concept of speaker adaptive training (SAT) in the context of deep neural network ...
The introduction of deep neural networks (DNNs) has advanced the performance of automatic speech rec...
Deep neural networks (DNNs) are currently very successful for acoustic modeling in ASR systems. One o...
Deep neural networks (DNNs) have become a standard method in many ASR tasks. Recently there is consider...
Rapid adaptation of deep neural networks (DNNs) with limited unsupervised data remains a significant...
In Deep Neural Network (DNN) i-vector based speaker recognition systems, acoustic models trained for...
Automatic speech recognition (ASR) is a core technology for the information age. ASR systems hav...