In recent work, the framework of Boosted Binary Features (BBF) was proposed for ASR. In this framework, a small set of localized binary-valued features is selected using the Discrete AdaBoost algorithm. These features are then integrated into a standard HMM-based system using either single-layer perceptrons (SLP) or multilayer perceptrons (MLP). Compared to MFCC features on the TIMIT phoneme recognition task, the features were found to perform significantly better when coupled with SLP and equally well when coupled with MLP. The current work presents an overview of the idea and extends it in two directions: 1) fusion of BBF with MFCC and an analysis of their complementarity, 2) scalability of the proposed features from phoneme rec...
Boosting is a general method for training an ensemble of classifiers with a view to improving perfor...
This thesis presents a method to investigate the extent to which articulatory based acoustic feature...
Automatic speech recognition (ASR) decodes speech signals into text. While ASR can produce accurate ...
In this paper, we propose a novel parts-based binary-valued feature for ASR. This feature is extract...
In this thesis, we propose a novel approach for speaker and speech recognition involving localized, ...
One of the major research thrusts in the speech group at ICSI is to use Multi-Layer Perceptron (MLP)...
Maintaining a high level of robustness for Automatic Speech Recognition (ASR) systems is especially ...
The results of investigations into some aspects of robust speech recognition are reported in this th...
State-of-the-art automatic speech recognition (ASR) systems are significantly inferior to humans esp...
The paper presents a work-in-progress on several emerging concepts in Automatic Speech Recognition (...
The performance of automatic speech recognition (ASR) systems can be significantly enhanced with addi...
Speech is the most efficient way to train a machine or communicate with a machine. This work focuses...
Statistical data-driven methods and knowledge-based methods are two recent trends in Automatic Speec...
Proceedings of Interspeech-Eurospeech 2005, Lisbon (Portugal). In this paper we address the problem of...
Heterogeneous knowledge sources that model speech only at certain time frames ...