Tandem systems use artificial neural networks (ANNs) to transform cepstral features into posterior probabilities of subword units, which are then processed to form input features for conventional speech recognition systems. They have been shown to perform better than conventional speech recognition systems using cepstral features. Recent studies have shown that modelling cepstral features with auxiliary sources of knowledge improves the performance of speech recognition systems. In this paper, we study two approaches to incorporating auxiliary knowledge sources such as pitch frequency and short-term energy (referred to as auxiliary features) in a tandem-based automatic speech recognition system. In the first approach, we mode...
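The tandem front end described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the abstract only says the posteriors "are processed", so the log followed by PCA decorrelation used here is an assumed (though commonly used) choice, and a single random softmax layer stands in for a trained ANN. The names `tandem_features`, `W`, `b`, and `n_keep` are hypothetical.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def tandem_features(cepstra, W, b, n_keep, eps=1e-10):
    """Map cepstral frames to decorrelated log-posterior features.

    cepstra: (frames, n_cepstra) array of cepstral features.
    W, b:    weights of a stand-in one-layer classifier (assumption:
             a real tandem system would use a trained ANN here).
    n_keep:  number of principal components to retain.
    """
    # Step 1: posterior probabilities of subword units per frame.
    posteriors = softmax(cepstra @ W + b)
    # Step 2 (assumed processing): log makes the skewed posteriors
    # closer to Gaussian, which suits GMM-based recognizers.
    log_post = np.log(posteriors + eps)
    centered = log_post - log_post.mean(axis=0)
    # Step 3 (assumed processing): PCA via eigen-decomposition of the
    # covariance matrix, keeping the n_keep largest-variance directions.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_keep]
    return posteriors, centered @ eigvecs[:, order]

# Synthetic data: 200 frames of 13 cepstral coefficients, 10 subword units.
rng = np.random.default_rng(0)
cepstra = rng.normal(size=(200, 13))
W = rng.normal(size=(13, 10))
b = np.zeros(10)
posteriors, feats = tandem_features(cepstra, W, b, n_keep=8)
```

The resulting `feats` array would then play the role of the input features for a conventional recognizer. How the auxiliary features enter this pipeline is not shown, since the abstract is cut off before it describes either approach.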
Conventional features in automatic recognition of speech describe the instantaneous shape of a short-tim...
In recent years there has been significant interest in Automatic Speech Recognition (ASR) and KeyWor...
The problem we address in this paper is whether the feature extraction module trained on large amou...
Tandem systems transform the cepstral features into posterior probabilities of subword units using a...
In the tandem approach to modeling the acoustic signal, a neural-net preprocessor is first discrimin...
Standard hidden Markov model (HMM) based automatic speech recognition (ASR) systems usually use ceps...
In tandem acoustic modeling, signal features are first processed by a discriminantly-trained neural ...
The so-called tandem approach, where the posteriors of a multilayer perceptron (MLP) classifier are ...
Automatic speech recognition (ASR) is a very challenging problem due to the wide variety of the data...
Automatic speech recognition requires many hours of transcribed speech recordings in order for an a...
Tandem acoustic modeling consists of taking the outputs of a neural network discriminantly trained t...
In recent years, the features derived from posteriors of a multilayer perceptron (MLP), known as tan...
We present a method for training bottleneck MLPs for use in tandem ASR. Experiments on meetings data...